
ALGORITHMS IN

ADVANCED ARTIFICIAL INTELLIGENCE

Algorithms in Advanced Artificial Intelligence is a collection of papers on emerging issues, challenges, and new methods in
Artificial Intelligence, Machine Learning, Deep Learning, Cloud Computing, Federated Learning, Internet of Things, and
Blockchain technology. The book addresses the growing attention to advanced technologies due to their ability to provide
“paranormal solutions” to problems associated with classical Artificial Intelligence frameworks. AI is used in various subfields,
including learning, perception, and financial decisions. It uses four strategies: Thinking Humanly, Thinking Rationally,
Acting Humanly, and Acting Rationally. The authors address various issues in ICT, including Artificial Intelligence, Machine
Learning, Deep Learning, Data Science, Big Data Analytics, Vision, Internet of Things, Security and Privacy aspects in AI, and
Blockchain and Digital Twin Integrated Applications in AI.
ALGORITHMS IN
ADVANCED ARTIFICIAL INTELLIGENCE

Editors
Dr. R. N. V. Jagan Mohan
Dr. Vasamsetty Chandra Sekhar
Dr. V. M. N. S. S. V. K. R. Gupta
First edition published 2024
by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

and by CRC Press


2385 NW Executive Center Drive, Suite 320, Boca Raton FL 33431

© 2024 selection and editorial matter, Dr. R. N. V. Jagan Mohan, Dr. Vasamsetty Chandra Sekhar and Dr. V. M. N. S. S. V. K.
R. Gupta; individual chapters, the contributors

CRC Press is an imprint of Informa UK Limited

The right of Dr. R. N. V. Jagan Mohan, Dr. Vasamsetty Chandra Sekhar and Dr. V. M. N. S. S. V. K. R. Gupta to be identified as the
authors of the editorial material, and of the authors for their individual chapters, has been asserted in accordance with sections
77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical,
or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or
retrieval system, without permission in writing from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright
Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on
CCC please contact mpkbookspermissions@tandf.co.uk

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification
and explanation without intent to infringe.

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library

ISBN: 978-1-032-86798-4 (pbk)


ISBN: 978-1-003-52923-1 (ebk)

DOI: 10.1201/9781003529231

Typeset in Times LT Std


by Aditiinfosystems

Contents

1. Convolutional Neural Networks Detect Alzheimer’s disease by Analyzing Facial Expressions and
Eye Movements 1
S. V. Swamy Kadali, R. N. V. Jagan Mohan and Lakshmi M.
2. Self-caring Autonomous Medicinal and Aromatic Plants (MAP) Nursery Using Arduino Microcontroller 8
Gidla Sudheer Babu, A. V. S. S. Varma, B. V. Ramana and Srilali Siragam
3. Segment Anything: GPT-3 and Logistic Regression Approach for Skin Cancer Detection 20
V. N. V. Sri Harsha, S. Rao Chintalapudi and V. S. Manoj Kumar Chenna
4. Enhancing Metric Learning Reliability for Pose-Oriented Face Recognition by Visual
Assessment of Tendency 25
Pinisetty Rajasekhar and V. Ravindranath
5. Verifiable Secure Vehicle Connectivity Using Machine Learning Framework for Internet of Vehicles 30
Lanka Divya, Priyadarshini Voosala, R. Shiva Shankar, Ch. Ravi Swaroop
6. Disease Detection In Dental Patients Using Machine Learning Algorithms Through Image Analysis 36
Khadar Alisha Sheik and V. Kiran Kumar
7. Early Disease Diagnosis in Tomato Crops Using AI-Based Deep CNN 44
T. V. K. P. Prasad, V Dilip Kumar, T. Srinivasa Rao, Gude Sujatha and T. K. Priyanka
8. Improvement Over K-Means Algorithm Over Complex Data 49
D. D. D. Suribabu, T. Hitendra Sarma and B. Eswara Reddy
9. Visual Representation of Lung Cancer Image Classification Using Artificial Neural Network 54
B. Nandana Kumar, K. Surya Ram Prasad and G. V. Satya Sriram
10. Machine Learning Improves Predictive Analysis of Diabetes Disease 59
K. Durga Bhavani, CH. Vinod Varma and B. Mounika
11. Tackle Comorbid Obesity in T2DM by Applying New Strategies to Optimize Glycaemic Control and
Weight Management 64
Yugandhar Bokka, R. N. V. Jagan Mohan and M. Chandra Naik
12. A Literature Survey on Deep Learning Approach Used for Audio-to-Sign Conversion with
Gesture Recognition for the Deaf and Dumb 68
B. Veerendra and D. Ramakrishna
13. Federated Learning Approach Based on the MFCC for Speech Emotion Recognition 77
Banda SNV Ramana Murthy and Veluri Ravi Kishore
14. Automated Object Recognition with IoT for Visually Impaired Users 82
JMSV Ravi Kumar, M. Babu Reddy, M. Srikanth and D. Ratna Giri
15. Deep Learning Approach for Early Detection and Diagnosis of Teenager Interstitial Lung Disease 88
Ramesh Alladi, R. N. V. Jagan Mohan and K. V. Ramana
16. Robust Object Detection in Medical Imaging: Cross-Measure Refinement with Edge Detection and SSD 94
Bhanurangarao M. and Mahaveerakannan R.
17. AI-Based Breast Cancer X-Ray Image Detection Using Generative Adversarial Attacks 104
V. S. R. K. Raju Dandu, R. N. V. Jagan Mohan and M. Chandra Naik

18. Promotion of Graduate Placement Through Academics by Improving Performance Using
Artificial Neural Networks 112
Chandra Sekhar K., K. Satyanarayana Raju, P. Subba Raju, M. Krishna Satya Varma and
K. Laxmipathi Raju
19. OpenAI's Large Language Model to Improve Payroll and HR Processes 118
Lokesh Sai Kiran Vatsavai and Srihari Varma Mantena
20. A Novel Blockchain-Based Approach for Secure and Efficient Electronic Medical Record Sharing 124
Hussein EL Ghor, Mohamed Daher and Bilal Nakhal
21. Classifying Gender Crimes with AdaBoost and Back Propagation Algorithms 133
Dileep Kumar Kadali, R. N. V. Jagan Mohan and M. Chandra Naik
22. Identifying Tremor Disease in Neurological Disorders Using Finger Gesture Images 140
P. Sumithabhashini, M. V. Vijaya Saradhi, Ramesh Alladi and Swajan Reddy
23. An Effective Machine Learning Technique that uses Emotive Faces in order to Study Crimes 145
C. Syamsundar Reddy and G. Anjan Babu
24. Increasing the Reliability of Intercropping in Agriculture Using Machine Learning 150
M. Srikanth, R. N. V. Jagan Mohan and M. Chandra Naik
25. Retrieval Augmented Generation Classification Algorithm for Fake News Detection 158
Ravisankar Malladi, V. T. Ram Pavankumar, M. Arulselvi and Konatham Sumalatha
26. Predictive AI Treatment for Kidney Tumors with Privacy Protection 162
K. V. Nageswari, R. N. V. Jagan Mohan and Bhramara Bar Biswal
27. Developing a Hybrid Approach to Assess Changes in Pomegranate Quality 168
Sai Prapulla Seshank Adivi, V. M. N. S. S. V. K. R. Gupta and A. Bala Krishna
28. Artificial Intelligence-Based Communication through Cat Facial Expressions 176
K. Bhargavi, Ch. V. Phani Krishna and Bandla Srinivasa Rao
29. Convolutional Neural Networks for the Identification of Skin Disorders 180
A. Aswini Priyanka
30. Machine Learning-Based Approach for Detecting Online Payment Fraud 183
V. S. Naresh, G. Venkata Sridevi, P. Srinivasarao, N. Hema Kiran,
CH. Sai Babu and P. Lazar Dan
31. Secure Loan Approval Prediction: A Privacy-Preserving Machine Learning Approach 190
V. S. Naresh, K. Sushmadi Lakshmi, S. Swathi Rathnam, G. Lakshmi Ishwarya,
D. Kirankumar and T. Swathi Ratnam
32. AI with Edge Computing-Driven Development in Healthcare Analysis 196
K. Vijaya Naga Valli and L. Sujihelen
33. Big Image: Large-Scale Skin Disease Image Classification in Medical Imaging and
Healthcare Using CNN and Transformers 203
K. Satyanarayana Raju, K. Chandra Shekar, K. Laxmipathi Raju, M. Krishna Satya Varma,
P. Subba Raju and Sumitra Srinivas Kotipalli
34. AI Driven Load Distribution for Federated Network on Electronic Health Records 210
S. Suryanarayanaraju, M. Chandra Naik and R. N. V Jagan Mohan
35. Smartphone-based Deep Learning Models for the Early Detection of Bubonic Plague and
Skin Diseases: A Safer, More Accessible, and Affordable Approach 217
N. V. Ratnakishor Gade and Mahaveerakannan R.
36. Kids Affected by Uncommon Illnesses Like Autism: Pregnant Women’s Identification
through Lasso Regression 225
P. Jahnavi, M. Chandra Naik and P. Bharat Siva Varma
37. Blind People Assistant: Real-Time Objects Detection and Distance Estimation with Voice Feedback 230
Hemalatha Indukuri, K. Kishore Raju, P. KavyaSri, M. Srija, K. Srujana and P. SivaPriya

38. Standard Encryption Methodologies to Process Multi-Modality Medical Images for
Diagnosing in Telemedicine 238
P. Shyamala Madhuri, B. Amutha and D. J. Nagendrakumar
39. Enhancing Dyslexia Detection and Intervention through Deep Learning: A Comprehensive
Review and Future Directions 249
Pavan Kumar Varma Kothapalli, Cheepurupalli Raghuram and Boddu LV Siva Rama Krishna
40. A Study of YOLO (You Only Look Once) to YOLOv8 257
Immidisetty V. Prakash and M. Palanivelan
41. Prediction of Endangered Species Using Artificial Intelligence 267
Yallamati Prakasa Rao, M. V. V. S. Subrahmanyam and Tvramana
42. Early Detection of Alzheimer’s Disease through Tau-PET Image Analysis Using CNN 273
M. Janakidevi, Ramalinga Swamy Cheruku and Ch. Rami Naidu
43. Computational Analysis and Identification of Specific MMP Targets in Tumours at Multiple Stages 277
G. Nirmala, Deepak Nedunuri, K. Satyanarayana, Ch. Madhava Rao and Y. Butchi Raju
44. Exploring the Rise of Cryptocurrencies with Blockchain Technology 282
V. Priyadarshini, R. Shiva Shankar, P. Neelima, N. Deshai and D. Ravibabu
45. Mitigating Misinformation: An Advanced Analytics Framework for Proactive Detection of
Fake News to Minimize Misrepresentation Risks 289
R. Shiva Shankar, G. Mahesh, V. Maheswararao, N. Silpa and K V S Murthy
46. Summarization of Legal Texts by Using Deep Learning Approaches 299
Nilambar Sethi, V. Sivarama Raju Vetukuri, R. Shiva Shankar and R. Rajender
47. Optimizing Diabetes Prediction through Intelligent Feature Selection: A Comparative Analysis of
Grey Wolf Optimization with AdaBoost and Ant Colony Optimization with XGBoost 311
Chigurupati Ravi Swaroop, Vemuri Jayamanasa, R. Shiva Shankar, M. Ganesh Babu,
Vahiduddin Shariff and N S Koti Mani Kumar
48. Real-Time Sign Language Translation through Deep Learning 319
Sujatha B., Leelavathy N., K. Navya Sri, G. Jagan Mohan and K. Bosu Babu
49. Ensuring Data Privacy in the Cloud: Authprivacychain’s Blockchain Access Control 329
R. Tamilkodi, K. Surya Kala, T. Durga Sukanthika, B. Aanantha Sai Datta Kiran,
V. Hemanth Reddy and K. Srimani Neha
50. Optimizing Cloud Load Balancers for Reduced Network Latency 337
V. Murali Mohan, Radha Yaraguti, Silpa Sharon Chinta and Bhargavi Jonnavithula
51. Boosting Precision: Strategies for Improving Spam Detection in Cloud-Based Email Services 343
V. Murali Mohan, Rohitha Papolu, Sowjanya Malleboina and Sravya Madiraju
52. Crafting Personalized Film Suggestions 350
R. Tamilkodi, A. Harika, Ch. Rohith, G. Nithin, K. Mahesh, A. Anvitha and N. Lohitha
53. A Comprehensive Approach to Detect SQL Injection Attacks Using Enhanced Snort Rules 357
T. Srinivasarao, Shrija Madhu, K. Kalyani Vishalakshi, Preetish Madhu,
K. Satya Sai DurgaManikanta and P. Sumanth Yadav
54. ARP and DNS Spoofing Detection with Attacker IP Capturing 363
T. Srinivasarao, N. Leelavathy, S. Kailash Chandra Sri Satya Dev, I. Om Ganesh,
P. Sai Aditya and P. Sai Krishna
55. A Comprehensive Review of Advanced Artificial Intelligence Integration in ICT Systems:
Methodologies, Applications, and Future Directions 369
Gopisetty Pardhavika and Priscilla R.
56. Enhanced Network Security: Machine Learning-Based DDOS Detection 376
R. Tamilkodi, A. Harika, B. S. L. D. V. Mythili, G. KarunaKumar,
B. Dileep Kumar and S. Sri Harshitha

57. Enhancing Network Security: Deep Ensemble-Based Attack Detection Framework 382
R. Tamilkodi, S. Ratalu, Gandham Santoshi, Vysyaraju Sarath Raju, Allampalli V M Mukesh Rao, and
Rampa Aditya Raghava Koundinya
58. Early-Stage Chronic Kidney Disease Detection using Machine Learning with Bigdata 388
Mamatha B and Sujatha P Terdal
59. An MDB-KMC and Firefly-Based Clustering Approach for Energy Optimization in
Wireless Sensor Networks 396
Veeraiah T., Sudhamsu Mouli and M. P. Singh
60. Software Requirements Based Software Effort Estimation using RSLU-GNL-GRU in
Software Project Management 401
K. Harish Kumar and K. Srinivas
61. The Evolution and Impact of Large Language Models in Artificial Intelligence 410
Chaitanya K. and Krishna Jayanth Rolla
62. Several Machine Learning Techniques Used to Forecast Parkinson Disease 418
O. Sri Nagesh, B. Rajarao and Voore Subrahmanyam
63. Fungal Disease Risk Assessment using Data-Driven Methods: Impacts on Food Security and
Crop Devastation 426
Kamidi Jeswanth Kumar
64. Redefining Glaucoma Identification using State-of-the- Art Machine Learning 431
D. Ratna Giri, P. Syamala Rao, J. V. Rama Kumar and JMSV Ravi Kumar
65. Probe Method: A Dependable Economy Data Methodology Feature Selection for Machine Learning 438
Chiranjeevi S. P. Rao Kandula and Srinivas Rao Parnadi
66. Estimating Human Life Expectancy through Sentiment Analysis, Population-based Optimisation,
and Machine Learning Models 444
Meduri Raghu Chandra, G. Jaya Raju, Lanka Atri Datta Ravi Tez and K. Lakshmaji
67. A Distributed-Back Propagation Procedure that uses Climate while Predicting the Spread of
Mosquitoes Using Least Squares Estimation 452
K. Gopala Varma, M. Chandra Naik and R. N. V. Jagan Mohan
68. Unveiling the Efficacy of Machine Learning in Addressing Imbalances in Credit Card
Fraud Detection Data 456
Ch Siva Subrahmanyam, N. Deshai, K. Samatha and J. Tulasi Rajesh
69. Blockchain-driven Security Paradigm: A Robust System Harnessing the Internet of
Medical Things (IoMT) Network for Enhanced E-Healthcare Monitoring 462
Tulasi Rajesh Jonnapalli, N. Deshai, K. Samatha and B. V. D. S. Shekar
70. Estimating Foreign Export Volume Using Machine Learning for Big Data Business Analytics 471
Yendrapati Geetha
71. Unmasking Deceit: Pioneering Deep Learning Hybrids to Expose Fabricated Reviews in the
Digital Realm 477
N. Deshai and B. Bhaskara Rao
72. YOLO CNN Approach for Object Detection 481
Aluri Dev Ananth, Abhiram Seemakurthi, Sasank Tumma and Prasanthi Boyapati
73. Multi-Crop Analysis Using Multi-Regression via AI-based Federated Learning 487
Mouna Penmetsa and R.N.V. Jagan Mohan
74. Empowering Inclusive Communication: Advancements in Wearable Technology with
GloSign—A Glove-Based Solution for Seamless Sign Language Interaction 492
L V Srinivas, R. Shiva Shankar, N. Deshai, K. Sravani and V. Maheswararao

75. AI-Based Voice Assistant Application for B5G and 6G Free Space Optic Technology is
Competent of Detecting Fake Words 502
R. N. V. Jagan Mohan and Vasamsetty Chandra Sekhar
76. Generative AI in Personal Diary Information Retrieval for Criminal Investigation 507
KVSS Murthy, J. Rajanikanth, R. Shiva Shankar, CH. Ravi Swaroop and D. Ravibabu
77. PCACSO Feature Selection for Prediction of Breast Cancer NAC Response 514
Susmitha Uddaraju, G. P. Saradhi Varma and I. Hemalatha

List of Figures

1.1 Facial expressions of Alzheimer's detection using diffusion models in machine learning 2
1.2 Facial and eye expression of Alzheimer's patients 3
1.3 Face database of images 4
1.4 The average facial image across all photos, excluding the test photo 4
1.5 Shows the query face (test face) and the matching face from the database, together with the number of
CNNs used to make the best classification (CNN: Convolutional Neural Network) 4
1.6 The multislice view shows the activation of the hippocampus region in Alzheimer’s disease (AD) stages,
while MCI refers to mild cognitive impairment 5
1.7 The red area depicts the hippocampus in coronal, axial, and sagittal views, aiding in the classification of
Alzheimer’s disease stages 5
1.8 The Montreal Neurological Institute (MNI) coordinate system has identified voxels with the distinguished
stage of Alzheimer's disease 5
1.9 The proposed CNN Algorithm recognizes red, green, and blue-colored voxels in a brain cutout, indicating
different stages of Alzheimer’s disease (AD), mild cognitive impairment (MCI), and AD 5
1.10 Displays nearly 10 stages of CNN used for the optimal classification of Alzheimer’s disease stages, where
CNN stands for Convolutional Neural Network 5
1.11 Graph for comparing algorithms RGB, YCbCr and CNN 6
2.1 Flow diagram of the self-caring autonomous medicinal and aromatic plants (MAP) nursery using Arduino
microcontroller 9
2.2 Block diagram of the Arduino microcontroller 11
2.3 Five plants: (a) lemon grass, (b) basil, (c) aloe vera,
(d) rosemary, and (e) ashwagandha 11
2.4 Circuit diagram of the self-caring autonomous medicinal and aromatic plants nursery using
Arduino Microcontroller 15
2.5 Actuator connection 15
2.6 Flow chart of the methodology 15
2.7 The suggested system’s screen captures of the serial monitor of the Arduino Microcontroller 18
3.1 Disease segment anything: The GPT-3 of computer vision 22
3.2 Relating algorithms RGB, YCbCr and segment anything: GPT-3 23
4.1 Pose oriented cluster images 27
4.2 Face recognition process 28
4.3 Performance in DCT of pose oriented images i.e., clockwise and anti-clockwise 29
5.1 The route is divided into segments connected to SMUs, where vehicle characteristics are transmitted via
Wi-Fi, and data is sent to the cloud for further computation 31
5.2 Illustrates the creation of an SMU using BBN. 32
5.3 Results 34
xii Algorithms in Advanced Artificial Intelligence

6.1 Proposed architecture work flow for the teeth disease 37


6.2 The foundation for feature extraction in YOLO V3 is the Darknet-53 architecture 38
6.3 Intersection over union ratio is depicted 40
6.4 Displaying the model as a bar graph in relation to the four performance metrics 40
6.5 Sample dataset 40
6.6 LabelImg tool 40
6.7 Classes.txt 40
6.8 3.txt(metadata) 40
6.9 Clone Darknet, set it up, and put it together 41
6.10 Configure yolov3.cfg file 41
6.11 Creating obj.names and obj.data 41
6.12 Create train.txt file 42
6.13 Download pre-trained weights 42
6.14 Training our model 42
6.15 Showing files in YOLOV3 folder 42
6.16 Dental cavities detection using YOLOV3 43
7.1 Image data processing workflow using machine learning 45
7.2 Process of leaf and tomato disease detection 46
7.3 Accuracy of different algorithms 47
7.4 Graph for comparing algorithms RGB, YCbCr, CNN and DeepCNN 47
8.1 Displays the various big data clustering strategies 50
8.2 Graph that compares different clustering algorithms 52
9.1 Layers of artificial neural networks 55
9.2 How to use disease discovery to build a machine image search engine 56
9.3 Same person different kinds of lung cancer image data set 57
9.4 Lung cancer images error rate for normalization 57
10.1 Artificial neural networks 60
10.2 Confusion matrix for predicted diabetes values 61
10.3 ROC curve for TPR and FPR 62
10.4 Accuracy arc for TPR and FPR 62
11.1 Block diagram for proposed work 65
11.2 Features of dataset 66
11.3 Varying number of neighbours 66
12.1 Automated object recognition with IoT for visually impaired users 70
12.2 Architecture diagram for CNN model 70
12.3 Sample dataset of hand gesture recognition 71
12.4 Trained dataset 71
12.5 Tested dataset 71
12.6 Sample dataset of audio to sign language conversion 72
12.7 Image dataset of Hindi handwritten characters 72
12.8 Sample dataset of the Hindi Handwritten Character ‘ka’ 73
12.9 CSV file of hindi handwritten characters 73
12.10 Flow of audio-to-sign conversion 75
12.11 Flow of hand gesture recognition 75
12.12 Hand gesture recognition of the letter ‘o’ 75

12.13 Hand gesture recognition of the letter ‘x’ 75


12.14 Voice output of hand gesture 75
12.15 Conversion of audio to HandSign 76
13.1 Federated learning architecture 78
13.2 The application of federated learning architecture in speech emotion recognition 78
13.3 GeMAPS feature extraction for emotion recognition 80
13.4 SI confusion matrix of the emotional database 80
14.1 Shows block diagram 84
14.2 Bottle and bowl with its accuracy 85
14.3 Shows person and computer 85
14.4 Shows computer and mobile 85
15.1 Least fuzziness detection as a tool for threshold selection 89
15.2 Lung cancer test image 89
15.3 Threshold by fuzzy method of lung cancer image 90
15.4 Threshold by OTSU algorithm 90
15.5 A 16-year-old male patient’s chest radiographs showed diagnostic accuracy due to AI 90
15.6 Lung cancer knowledge distillation 91
15.7 Probe method: A reliable feature selection technique in machine learning 91
15.8 Architecture of VGG 92
15.9 Plotting the training and validation loss 92
15.10 Plotting the training and validation accuracy 92
16.1 AI-based breast cancer X-ray image detection using generative adversarial attacks 95
16.2 Architecture for object detection in medical imaging 96
16.3 Real-time object detection in medical imaging: improving precision with cross-measure refinement and
edge detection 97
16.4 Enhancing object detection in medical imaging: improving diagnoses and patient outcomes with
cross-measure refinement and edge detection 98
16.5 SSD Architecture of several steps 99
16.6 Obtained values for object detection in medical imaging with cross- measure refinement,
edge detection, and the SSD 101
17.1 Breast cancer classification using ELIXR 106
17.2 Real vs. artificial data: a sequence length considering traffic and ping synthesization 107
17.3 Correlation between -1 to 1 data 108
17.4 The data[corr_features].corr() statement creates a correlation matrix, with values displayed in
decimal format 109
17.5 A box plot is drawn using sns.boxplot, with hue set to “target” for visual comparison 109
17.6 Code generates a crossplot using KDE plots 110
18.1 Block diagram representation 113
18.2 Basic architecture of ANN 114
18.3 Architecture diagram of graduate prediction system 115
18.4 Pair plot representation of placement data 115
18.5 Placed vs non-placed students 116
18.6 CGPA Vs placement 116
18.7 Branch wise placement 116
18.8 Male vs female ratio 116

18.9 Importance features 116


18.10 Accuracy obtained by ANN 116
19.1 Employee payroll using random forest classification 119
19.2 Optimizing employee payroll model: Different organization axes 120
19.3 Dataset description using Python code 121
19.4 Trained dataset description 121
19.5 Execution results 122
19.6 Execution results 122
20.1 Electronic medical record system model 126
21.1 Crime classification of AdaBoost algorithm 134
21.2 Gender data process using distributed backpropagation 135
21.3 Comparative analysis of crime persons 136
21.4 Training dataset of criminal labels 136
21.5 Principal component analysis (PCA) on crime persons image 137
21.6 Independent component analysis (ICA) on crime persons 137
21.7 t-distributed stochastic neighbor embedding (t-SNE) on crime persons 138
22.1 Gesture image using diffusion models 141
22.2 The machine finger gesture images search engine is being developed to detect tremor diseases 142
22.3 Person with different kinds of gesture images data set 143
23.1 Emotional faces 146
23.2 Image classification using CNN 147
23.3 Increasing distance of feature vectors of emotional faces 147
23.4 The metrics obtained for CNN at optimal iterations (5) have been thoroughly examined 148
24.1 Types of multiple crops in agriculture 151
24.2 Optimizing multicrops hyperparameters 152
24.3 Optimizing machine learning model: The different axes 152
25.1 Represent the system architecture 159
25.2 Fake news with RAG 160
26.1 Treatment study of clinical trial process of renal in urologic cancer 164
26.2 T2a:9 cm size of kidney cancer test image 165
26.3 Different patients size of kidney cancer tumor 165
26.4 Reinforcement learning predictive treatment of renal in urologic cancer 166
26.5 (a) pair plot for various features and target variable, (b) Residual Plot on Linear regression,
(c) Feature importances on Random Forest Model 167
27.1 Evaluates the quality of pomegranate using non-destructive approaches, focusing on its (a) Complete fruit,
(b) fresh-cut fruit, (c) aril, (d) seed, (e) oil, and (f) juice 169
27.2 Image processing is used to identify pomegranates on a tree, as part of a case study on sustainable
closed loop supply chains in Iran [A. Source image (pomegranate tree). B. Black and white picture
(segmentation, color indexing). C. Picture after the applied threshold (figure analysis). D. Location
of geometric centers (estimated count of the pomegranates)]. 170
27.3 Harvest of pomegranate fruit’s virtual twin 170
27.4 Proposed architecture for the solution 171
27.5 Fruit images classification 172
27.6 Three layer feed forward neural network 173
28.1 Cat facial expressions using diffusion in machine learning 177
28.2 How to provide for cat expressions images data to a transformer 178

28.3 Graph for relative algorithms CNN and ViT 179


29.1 Skin disease patient image detection using CNN 181
29.2 The graph compares the performance of RGB, YCbCr, and CNN algorithms 182
30.1 Distribution of transaction type 186
30.2 Distribution of the step column using Plot 186
30.3 Random forest confusion matrix 187
30.4 False positive rate 188
30.5 Precision recall curve 188
31.1 Logistic regression sigmoid function 192
31.2 Training data privacy 192
31.3 Testing data privacy 193
31.4 Confusion matrix 194
31.5 ROC curve 194
32.1 Health data process of machine learning architecture 198
32.2 Health data process using RNN encoder-decoder architecture 199
32.3 RNNs on edge computing are being utilized to minimize loss value and health data 200
32.4 Edge computing in healthcare data accuracy 200
32.5 Linear regression line for Age VS Blood Pressure 201
33.1 Skin disease in medical image using CNN 204
33.2 Skin disease in medical image using CNN 205
33.3 The ViT model divides skin disease medical images into patches, linearly embeds them, adds
position embeddings, and uses a transformer encoder for classification 206
33.4 Outcomes of training loops 207
33.5 Outcomes of training loops for 20 epochs 208
34.1 Federated iterative learning process 211
34.2 Architecture of federated learning 212
34.3 Coefficients for calculating respond time 216
35.1 Flow Diagram for proposed work 218
35.2 Deep learning using smartphones for early skin disease and bubonic plague detection 219
35.3 Values obtained for testing accuracy, training and validation accuracy, and training and validation loss 220
35.4 Various images categorizations 221
35.5 Bar plot for symptom scores for patients 223
36.1 Workflow diagram of machine learning 226
36.2 Feature selection based on lasso 228
37.1 COCO dataset examples 234
37.2 Output snapshot of the precision of the object cup: 99% 235
37.3 Output snapshot of the precision of the object remote: 98% 235
37.4 Output snapshot of the precision of the object bed: 96% 235
37.5 Output snapshot of the precision of the object TV: 96% 235
37.6 Output snapshot of accuracy of cup over time 235
37.7 Output snapshot of accuracy of object remote over time 236
37.8 Output snapshot of accuracy of object bed over time 236
37.9 Output snapshot of accuracy of object TV over time 236
38.1 Types of attacks on medical images 240
38.2 Literature taxonomy: Medical image confidentiality technology 242

39.1 Brain Image with Dyslexia 249


39.2 Characteristics of Dyslexia 250
39.3 Traditional and deep learning-based diagnosis of dyslexia patients 252
40.1 Example of object detection images 258
40.2 Grid-based approach 258
40.3 Functioning of YOLO 259
40.4 Timeline of YOLO variants 260
40.5 YOLO v2 results in comparison to the original version and other modern models [1] 260
40.6 YOLO3 261
40.7 A comparison between the YOLO v4 and other cutting-edge object detectors [3] 261
40.8 A comparison between the YOLO v6 and other cutting-edge object detectors [4] 261
40.9 A comparison between the YOLO v7 and other cutting-edge object detectors [5] 262
40.10 YOLO v8’s performance in comparison to other cutting-edge models [6] 262
40.11 YOLO architecture 265
41.1 Screenshot for various observations 269
41.2 Screenshot for various species 269
41.3 Screenshot for combinations of various observations and species 269
41.4 Screenshot of proposed work model 270
41.5 Confusion matrix 270
41.6 Values obtained for random forest 271
41.7 Model accuracy for training and testing 271
41.8 Work flow of proposed model 271
42.1 Three samples of Tau-PET images 274
42.2 Alzheimer disease detection using CNN 274
42.3 The bar chart compares the performance of RGB, DCT, and CNN 276
43.1 Activation of MMP by cysteine switches mechanisms 278
43.2 Observed vs Predicted Activity of validation set obtained for equation number 7 280
43.3 Observed vs Predicted Activity of validation set obtained for equation number 8 281
44.1 Block structure 283
44.2 Send process with block chain 285
45.1 By sorting authentic news stories into newsgroups, the model-training methodology creates
ground-truth sources of knowledge 292
45.2 The following is the procedure for producing activities and subjects with news stories 294
46.1 Flow of abstractive summarization 300
46.2 Overview of semantic-based approach 300
46.3 System flow for LSTM 302
46.4 System model for extractive summarisation 303
46.5 System model for abstractive summarization 304
46.6 Graphs obtained for extractive summarization 306
46.7 Graphs for abstractive summarization 306
46.8 Screenshot for test case: original data with extractive summary 307
46.9 Screenshot for test case: original data with abstractive summary 307
47.1 GWO + AdaBoost performance metrics 315
47.2 ACO+ XGBoost Performance Metrics 316
47.3 Performance comparison of AdaBoost, HistGradientBoosting and CatBoost 316

48.1 SLR taxonomy and performance factors 320


48.2 SLR modalities 321
48.3 Mediapipe’s hand landmarks: A visual overview [30] 322
48.4 Proposed model of real-time sign language translation through CNN 322
48.5 Sign language detection model training [31] 323
48.6 CNN model for gesture recognition 324
48.7 Exploring the 26 characters of the American sign language alphabet [32] 325
48.8 (a) Training and validation loss, (b) Training and validation accuracy 325
48.9 Evaluation of Model Performance: Precision, Recall, and F1-Score 327
49.1 System architecture 331
49.2 System architecture 332
50.1 Architecture of network latency 338
50.2 Effect of network latency 338
50.3 Flowchart for DNS 339
50.4 Round Robin algorithm 340
50.5 Flow chart for fundamental steps involved in load balancing 341
51.1 Spam emails over the years 344
51.2 Logistic regression algorithm 346
51.3 SVM algorithm 347
52.1 Working of the proposed model 353
52.2 Dataset of movies 354
52.3 Questionnaires 354
52.4 Represents the feelings of the user while login 355
52.5 Gives the output after processing the user feelings 355
52.6 Comparison with other systems 355
53.1 Block diagram of methodology 358
53.2 Installation of snort 360
53.3 Rules in local.rules file 360
53.4 Classic SQL injection attack 360
53.5 Blind SQL injection attack 360
53.6 Time-based blind SQL injection attack 360
53.7 Error-based SQL injection attack 360
53.8 Union-based SQL injection attack 361
53.9 Second-order SQL injection attack 361
53.10 Out of band SQL injections attack 361
53.11 Boolean SQL injection attacks 361
53.12 Classic SQL injection detection 361
53.13 Blind SQL injection detection 361
53.14 Time based SQL injection detection 361
53.15 Error based SQL injection detection 361
53.16 Union SQL injection detection 361
53.17 Second order SQL injection detection 361
53.18 Out of band SQL injection detection 361
53.19 Boolean type SQL injection detection 361
54.1 Methodology 365

54.2 Python script execution 366


54.3 ARP spoofing on target victim 366
54.4 DNS spoofing on target victim 366
54.5 Victim responses spoofing 366
54.6 ARP and DNS spoofing, Attacker IP detection 366
54.7 Resources utilization 367
54.8 CPU and memory consumption 367
55.1 Advantages of AI for knowledge management (KM) (modified after [4]) 370
55.2 Integrating AI methodologies into agriculture 371
55.3 The AI perception-action loop within autonomous vehicles [11] 371
55.4 Bank card fraudulent detection process through ML/DL [12] 372
55.5 Guiding principles for ethical AI [16] 374
56.1 System architecture 378
56.2 Home page 379
56.3 Signup page 379
56.4 Signin page 379
56.5 Main page 379
56.6 Upload input values 379
56.7 Input values 380
56.8 Prediction result 380
56.9 Upload another input values 380
56.10 Prediction result 380
56.11 Accuracy comparison graph 380
56.12 Precision comparison graph 380
56.13 Recall comparison graph 380
56.14 F1 comparison graph 380
57.1 Types of network attacks 383
57.2 System architecture 385
58.1 Factors affecting CKD 389
58.2 CKD early detection proposed model 391
59.1 Graphical analysis of proposed FA based on Energy consumption 399
59.2 Graphical analysis of proposed FA based on throughput 399
60.1 Proposed methodology’s block diagram 402
60.2 RSLU-GNL-GRU 405
60.3 Performance analysis of the proposed RSLU-GNL-GRU 406
60.4 Graphical representation of the proposed RSLU-GNL-GRU (a) Training Time (b) Computational Time 406
60.5 Performance measure 406
60.6 Loss value of the proposed method during (a) training, and (b) testing 407
60.7 Performance Comparison 407
60.8 Performance measure of the proposed ZS-GTBOA 407
60.9 Comparative measure of the proposed framework 408
61.1 GPT-3 architecture 412
61.2 LLM finetuning process 413
62.1 Knowledge distributed data (KDD) 419
62.2 A man with PD displaying a flexed walking poster 419

62.3 Parallel coordinates 420


62.4 ROC for classification algorithms 421
62.5 Sieve graph 422
62.6 Hierarchical clustering for fundamental frequency (Fo) attributes 423
62.7 Hierarchical clustering dendrograms showing PD and non-PD data values: 0 = non-PD (blue) and
1 = PD (red) 423
62.8 Self organized maps (SOM) for fundamental frequency attributes 424
62.9 Comparison of different DM algorithm values 424
62.10 Time taken to execute DM algorithms 425
63.1 Images are converted into multi-dimensional matrices for comparison. 427
63.2 Classification of fungal disease images using CNN 428
63.3 Graph for comparing algorithms K-NN, SVM, and CNN 429
64.1 Glaucoma disease using machine learning 432
64.2 Process of model selection for early detection of glaucoma 433
64.3 Normal vison vs glaucoma 433
64.4 Bar graph for various features 434
64.5 Visual acuity measurements on gender 435
65.1 Probe method: A reliable feature selection technique in ML 440
66.1 (a) Strategy of data (b) Data pre-processing 445
66.2 Historical diagram for various parameters 447
66.3 Correlation matrix 447
66.4 Box plot for predictive classes 448
66.5 Heat map for various correlation features 448
67.1 Trajectory of village nodes of dengue patients in distributed back propagation 454
67.2 Dengue cases of patients: S-shape in logistic regression 454
68.1 Imbalance data distribution 458
68.2 Distribution of data after random oversampling 458
68.3 Experimental workflow without and with balancing data 459
68.4 XGBoost classifier results after data balancing 460
68.5 Results before applying data balancing 460
68.6 Confusion matrix of XGBoost classifier after applying random oversampling 460
69.1 IoMT system design 463
69.2 Smart contracts for the internet of medical things-enabled electronic healthcare 465
69.3 The computational framework BC-IoMT-SS 466
69.4 Precision ratio 466
69.5 Efficiency ratio 468
69.6 Managing the upload timing of transactions (Tx) on IPFS’s encrypted storage layer for different sizes 468
70.1 A perceptron is trained in the export product system to respond to certain inputs with certain desired outputs 473
70.2 Large data model compression approach 474
70.3 Export large data process using distributed back propagation approach 474
70.4 Untested export product data 475
70.5 Graph for comparing algorithm 475
71.1 Proposed deep learning hybrid methodology 478
71.2 Proposed deep learning evaluation 478
72.1 One stage object detection 482

72.2 Transfer learning 482


72.3 YOLO-CNN algorithm 482
72.4 RESNET algorithm 483
72.5 Training graph 484
72.6 Validation graph 484
72.7 Person with mask 484
72.8 Person with mask, wearing incorrectly 484
72.9 Person with mask detected from sideways 485
72.10 Person with mask detected even if it is blocked with hand 485
72.11 Person with mask wearing incorrectly from sideways 485
72.12 People without mask 485
73.1 Federated learning architecture applied in an agriculture environment 488
73.2 Multi-crops multi-class federated learning (MC2FL) 489
74.1 Sign language in America 493
74.2 Flex sensor connection 495
74.3 Sensor placement and flow of data 495
74.4 Accuracy of the gestures 497
74.5 Mean error of K values 498
74.6 Accuracy of k values 499
74.7 GloSign glove 499
75.1 Process of voice assistants using NLP 504
75.2 Text classification using NLP 504
75.3 Transformers are encroaching on Machine Learning 505
75.4 Graph for different accuracy metrics on Transformers: Attention 506
76.1 Process of information retrieval 508
76.2 Uncertain crime investigation using bayes belief network 509
76.3 Retriever augmented generation using LLM 510
77.1 Proposed methodology 516
77.2 Comparison of existing and novel feature selection 518

List of Tables

1.1 Graph for comparing algorithms RGB, YCbCr and CNN 5


2.1 Temperature and Humidity levels of 5 plants 14
3.1 Graph for likening algorithms RGB, YCbCr and Segment Anything: GPT-3 23
4.1 Performance in DCT of pose oriented images i.e., clockwise and anti-clockwise 29
5.1 Real time traffic data 33
6.1 Demonstrates the outcomes of object detection using the YOLOV3 algorithm 40
7.1 Graph for comparing algorithms RGB, YCbCr and DeepCNN 46
8.1 Clustering computational time (in Sec.) in Weka tool 52
10.1 Displays a confusion matrix for predicted diabetes values 61
10.2 Accuracy arc for TPR and FPR 62
12.1 Modules/APIs used 74
16.1 Performance of evaluation metrics 97
16.2 Various parameters and their descriptions on different stages 98
16.3 Various hyperparameters, values and their descriptions on various steps 99
16.4 Obtained values for various parameters and their descriptions 100
17.1 GAN of time series data 107
18.1 Comprises the parameters for ANN representation 114
21.1 Training Data Set 135
23.1 Police records shows the type of crime in four regions of a west Godavari district 148
23.2 The metrics obtained for CNN at optimal iterations (5) have been thoroughly examined 148
26.1 Authors published works 163
26.2 Different patients size of kidney cancer tumor 165
26.3 Reinforcement learning predictive treatment of renal in urologic cancer 166
27.1 Summarizes training 174
28.1 Graph for relative algorithms CNN and ViT 179
29.1 The graph compares the performance of RGB, YCbCr, and CNN algorithms 182
30.1 Kaggle dataset of online payment 186
30.2 Obtained values for various metrics 187
32.1 Blood pressure of pregnancy women of Age > 45 201
34.1 Data set for load distribution in federated network 215
35.1 Deep learning using smartphones for early identification of skin diseases and the bubonic plague 219
37.1 The architecture of the single shot detector (SSD) with MobileNet as the base feature extractor 233
38.1 Comparison of multi-modality of diagnostic imaging 239
38.2 Types of attacks 241
38.3 Conventional algorithms for encrypting medical images 242

38.4 Encryption of medical images using chaotic maps 243


38.5 Encryption methods, optimization methods, Feature 243
38.6 Encryption methods and retrieval methods/technologies 244
38.7 Algorithms being attacked, the attack methods employed, and their results 244
38.8 Type of secret sharing scheme used, and the proposed schemes 244
38.9 Performance metrics 244
38.10 Evaluation & best performance analysis 245
39.1 Examining diverse methods for predicting Dyslexia 251
39.2 Comparison of various datasets of Dyslexia using ML & DL approaches 253
39.3 Evaluation metrics for various models 254
41.1 Comparison of various existing works on different datasets 268
42.1 Graph compares the performance of RGB, DCT, and CNN algorithms 275
43.1 Molecular Descriptors data and statistical values of newly proposed model equations 279
43.2 Statistics for equation numbers 7-8 279
43.3 Test set data – Eq7 280
43.4 Test set data – Eq8 280
43.5 FIT Kubinyi data acquired for all five QSAR models 281
44.1 Different types of block chain 286
45.1 Article cluster collection for ground truth 295
45.2 The initial filtering found a large percentage of bogus news items 296
47.1 GWO + AdaBoost performance metrics 315
47.2 ACO + XGBoost performance metrics 315
47.3 Performance comparison of GWO + AdaBoost and ACO + XGBoost 316
48.1 Comparing MAE, MSE, and R2 for Various Models 326
54.1 Performance metrics 366
58.1 Summary of research on the detection of CKD 390
58.2 State of the art of CKD detection models 392
58.3 Early studies in ML and CKD detection 392
58.4 Recent breakthroughs in ML and CKD detection 393
58.5 Comparative analysis of ML models for CKD detection 393
60.1 Comparative analysis of proposed RSLU-GNL-GRU 406
60.2 Comparative analysis of proposed PLSKCD-K-means 407
61.1 Comparison of large language models 414
61.2 Generalization capabilities of LLMs 416
62.1 Classified instances based on algorithms 422
63.1 Graph for comparing algorithms K-NN, SVM, and CNN 429
64.1 Glaucoma patient categorization according to several parameters 435
66.1 Machine Learning and Sentiment Analysis for Better Life Expectancy Estimation 446
66.2 Making Reliable Life Expectancy Predictions by Combining Sentiment Analysis with Machine Learning 448
66.3 Estimating Human Life Expectancy through Sentiment Analysis, Population-Based Optimisation, and
Machine Learning Models 448
70.1 Export product commodity 472
70.2 Graph for comparing algorithm 475
73.1 Multi-crops of agriculture 490
74.1 Average sensor values for each gesture 497

75.1 Graph for comparing algorithms Transformers: Attention 506


76.1 Comparative study with various datasets in DM 510
76.2 Comparative study with various datasets in ML 511
76.3 Personal diary information of criminals 511
77.1 3 x 3 confusion matrix 516
77.2 Dataset description 517

About the Editors

Dr. R. N. V. Jagan Mohan is working as Professor in the Computer Science and Engineering Department at Sagi Rama Krishnam Raju Engineering College, China Amiram, Bhimavaram. He completed his Ph.D. at Acharya Nagarjuna University in 2015 under the esteemed guidance of Dr. Kurra Raja Sekhara Rao, and holds an M.Tech in CSE from University College of Engineering, Jawaharlal Nehru Technological University (2020). He has published around 43 papers in various international and national journals, filed around 6 patents (of which 1 is granted internationally), and published 2 books with international publishers and 6 with national publishers. He has been guiding Ph.D. candidates as Supervisor at J.N.T.U. Kakinada since 2022. He completed one research project, Dissecting Autism Trajectories in Longitudinal Electronic Health Records, carried out collaboratively in India and Israel and sponsored by the Government of India, Ministry of Science and Technology, Department of Science and Technology. He organized the DST-SERB sponsored International Conference on Algorithms in Advanced Artificial Intelligence (22-24 December 2023, Dept. of CSE, SRKR Engineering College, Bhimavaram-534204) and the AICTE sponsored National Conference on Productivity, Quality, Reliability, Optimization and Computational Modelling (18-20 December 2019, Dept. of CSE & IT, SRKR Engineering College, Bhimavaram-534204). He has organized three faculty programs: a webinar on Blockchain Technology: Insights and Applications (13 August 2022, Dept. of CSE, SRKR Engineering College, Bhimavaram), with Dr. Hussein El Ghor, Professor in CSE, Lebanese University, Lebanon, as resource person; a Faculty Development Program on Data Science and Its Applications (Dept. of CSE, sponsored by SRKR Engineering College, 10-15 June 2021); and a DST-SERB national seminar/symposia workshop on Machine Learning Evolve Predictive Data Analytics (Dept. of IT, SRKR Engineering College; Sanction Order No. SSY/2017/001121, sanctioned 13-12-2017, organized 23-28 July 2018). He has attended many Faculty Development Programs.
Dr. Vasamsetty Chandra Sekhar, PhD, is Professor and Head of the Department of Computer Science and Engineering at Sagi Ramakrishnam Raju Engineering College, Andhra Pradesh, India. He has written and co-written multiple articles for peer-reviewed SCI journals published by IEEE and Elsevier, for which he has also served as a reviewer, and has taken part in numerous international conferences. He received his M.Tech (Computer Science and Technology) and PhD degrees from Andhra University, Visakhapatnam. He has published over 26 research papers, several book chapters, one patent, and one authored book in peer-reviewed publications, and has arranged Faculty Development Programs. Software engineering, machine learning, and the Internet of Things are among his research interests; his main area of study is investigating various IoT and software engineering techniques to address a number of difficult issues in summarization, design, and analysis. He is Vice-Chair of the Computer Society, IEEE Vizag Bay Section.
Dr. V. M. N. S. S. V. K. R. Gupta, PhD, is Associate Professor in the Department of Computer Science and Engineering at Sagi Ramakrishnam Raju Engineering College, Andhra Pradesh, India. He has written and co-written multiple articles for peer-reviewed SCI journals published by IEEE and Elsevier, for which he has also served as a reviewer, and has taken part in numerous international conferences. Data mining, healthcare, and machine learning are among his research interests; his main area of study is investigating various techniques to address a number of difficult issues in summarization and analysis. Dr. Gupta received his M.Tech (Computer Science and Technology) from Andhra University and his PhD from K. L. University, Guntur. He has published over 22 research papers, over three book chapters, and three patents in peer-reviewed publications, and has organized faculty development programs.

1. Convolutional Neural Networks Detect Alzheimer's disease by Analyzing Facial Expressions and Eye Movements

S. V. Swamy Kadali1
Research Scholar, School of Computing,
SRM Institute of Science and Technology, Kattankulathur, India
R. N. V. Jagan Mohan2
Associate Professor, Department of CSE,
Sagi Rama Krishnam Raju Engineering College (A), Bhimavaram, India
Lakshmi M.3
Professor & HOD, School of Computing, Department of DSBS,
SRM Institute of Science and Technology, Kattankulathur, India

Abstract: Alzheimer's disease (AD), the most common form of severe dementia, is a progressive neurological disorder: because of the degradation and death of nerve cells in the brain tissue, intelligence steadily declines and most cognitive activities are compromised. Before diving into the level of AD diagnosis, it is essential to highlight the fundamental differences between conventional machine learning (ML) and deep learning (DL). Because image processing is essential for the diagnosis of AD, this work covers a number of photo-preprocessing approaches that aid learning. The most important kind of neural network for computer vision used in medical image processing is the Convolutional Neural Network (CNN). The proposed study considers facial characteristics, including expressions and eye movements modeled with a diffusion model, as part of a meticulous CNN approach to Alzheimer's diagnosis. Convolutional neural networks were applied in an effort to sense Alzheimer's disease in its early stages using a large collection of facial expression images.
Keywords: Alzheimer's disease, Machine learning, Computer vision, Deep learning, Convolutional neural network

1. Introduction

Artificial intelligence (AI) computer vision allows computers and systems to extract meaningful data from digital photos, videos, and other visual inputs and to make recommendations or take actions based on that data. Thanks to computer vision, robots now have much of the same capacity for comprehension, observation, and seeing that humans have. Human eyesight has the advantage of a longer history: lifetimes of context teach the capacity to identify things, gauge their distance from the observer, detect motion, and evaluate the accuracy of an image. Machines can complete these activities considerably more quickly thanks to computer vision, which teaches them to use data, cameras, and algorithms rather than retinas, optic nerves, and the visual cortex. Because it can evaluate numerous objects or procedures per minute while finding hidden flaws or issues, a system trained to inspect products or monitor an operational asset can swiftly beat people in performance. Computer vision is the study of the models underlying artificial systems that extract information from images; as a scientific discipline, it strives to put those models into practice in useful computer vision systems. It strives to develop systems that can automatically recognize, analyze, and interpret visual data to address issues in a variety of areas.
1sk5379@srmist.edu.in, 2mohan.rnvj@srkrec.edu.in, 3lakshmim2@srmist.edu.in

DOI: 10.1201/9781003529231-1

Various facial expressions and eye postures are used to recognize diseases. Computer vision has undergone a significant shift because of the application of deep learning to Alzheimer's recognition. It is widely used to teach computers how to perceive and make judgments similarly to humans. The terms "computer vision" and "machine vision" are sometimes used interchangeably. To speed up image processing, the technique is frequently coupled with AI, machine learning, and deep learning. An in-depth discussion of applying deep learning in computer vision to face and eye expression object recognition is provided in this paper. Eye-tracking-based paradigms may reduce the limitations of standard cognitive tests' screening procedures and could help in the early diagnosis of AD (Alexandra Wolf, 2023 [1]). In AD, nerve cells die and brain tissue deteriorates, which significantly reduces the size of the brain over time and impairs its major functions. Before analyzing the level of AD diagnosis, it is important to emphasize the significant differences between classical machine learning (ML) and deep learning (DL). The variety of photo-preprocessing approaches included in this work helps to improve learning, because image preparation is essential for the diagnosis of AD.

The paper is organized as follows: Section 2 gives a succinct explanation of CNN-based Alzheimer's detection, Section 3 covers the findings of the experiment, Section 4 draws conclusions, and Section 5 lists references.

2. Proposed Work

The diffusion model process involves adding noise to a face image, learning to remove it, and then training a machine learning model to produce a denoised image. The CNN is one of the most popular DL classification techniques. Given the right dataset, the following strategies can be used to categorize AD: apply transfer learning after using the feature selector to separate the features; do it fairly; do it with abstract CNN models; examine the output of the two included models; and then use the hyperparameter optimizer with one model. The CNN method is used to recognize facial expressions of emotion in pictures. A pixelated image is delivered to the network. Filters in the first convolutional layer apply a feature map process to each image pixel. This map is subjected to a second layer of filters to create a third map, and so on until the final layer generates the prediction.

2.1 Diffusion Model Using Machine Learning

The diffusion model process involves adding noise to a face image and learning to remove it, then training a machine learning model to produce a denoised face image. Learning the mean matrix involves assuming a normal noise distribution and parametrizing the distribution's mean and standard deviation matrix. This can be divided into a forward and a reverse process. Mathematicians often use the jargon of physical processes to formalize mathematical concepts, such as diffusion equations for Fick diffusion, heat diffusion, and Brownian motion. The Langevin equation, based on the Wiener process, is a stochastic formulation with infinitely small steps and normally distributed time increments. This intertwines diffusion with white-noise generation, giving rise to the machine learning models called diffusion models. Diffusion models, which utilize a Gaussian prior to generate data, form the core of text-to-image generative models.
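As a concrete rendering of the Gaussian parametrization just described (the chapter states it only in words, so the standard denoising-diffusion formulation is assumed here), the forward and reverse processes can be written as

$$q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\big), \qquad p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\big),$$

where $\beta_t$ is the variance of the noise added at step $t$, and $\mu_\theta$ and $\Sigma_\theta$ are the learned quantities, i.e., the parametrized mean and standard deviation matrix mentioned above.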

Fig. 1.1 Facial expressions of Alzheimer's detection using diffusion models in machine learning
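To make the forward (noising) half of this pipeline concrete, the following is a minimal Python sketch. It assumes the standard closed-form noising step with a linear variance schedule; the step count, schedule values, and image size are illustrative assumptions, not values fixed by the chapter.

```python
# Minimal sketch of the forward (noising) process of a diffusion model on a
# face image. The linear beta schedule and T = 1000 steps are assumptions for
# illustration; the chapter does not specify its schedule.
import numpy as np

rng = np.random.default_rng(0)

T = 1000                                  # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)        # linear variance schedule (assumed)
alpha_bars = np.cumprod(1.0 - betas)      # cumulative products of (1 - beta_t)

def forward_diffuse(x0, t):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I)."""
    eps = rng.standard_normal(x0.shape)   # the noise a denoiser learns to predict
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

# Example: noise a normalized 64x64 grayscale face image at step t = 500.
face = rng.uniform(-1.0, 1.0, size=(64, 64))   # stand-in for a real face image
noisy_face, true_noise = forward_diffuse(face, t=500)
```

A denoising model is then trained to predict `true_noise` from `noisy_face` and `t`, which is the "learning to remove it" step described above.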

and Brownian motion. The Langevin equation, based on the layer, Albawi [3] selects the class (or label) with the highest
Wiener process, is a stochastic formulation with infinitely probabilities.
small steps and normal distribution of time increments. This To perform well on the classification assignment, CNN
intertwines diffusion with white noise generation, making must be able to handle large datasets by Han [7]. In terms
Machine Learning models called diffusion models. The of transfer learning by Dishashree Gupta, 2023 [4], CNN
diffusion models, which utilize a Gaussian prior to generate includes a variety of models were trained on the Image Net
data, form the core of text-to-image generative models. dataset by Sing [13]. The model’s designer can adapt a pre-
2.2 Alzheimer Detection by Convolutional Neural Network Using Facial Features
The classification approach plays a crucial role, even though the quantity of the disease object database has a substantial impact on how well disease objects are identified. Machine learning includes deep learning. Since the properties are automatically extracted, deep learning is more effective than traditional machine learning techniques. Additionally, "end-to-end learning" using deep learning involves giving the network both tasks and unprocessed data. Most of the time, advances in Alzheimer's disease detection are made using facial features, including face and eye expressions, with the convolutional neural network technique.

Fig. 1.2 Facial and eye expression of Alzheimer patients

A CNN is one of the best neural network techniques for the categorization and recognition of images. Figure 1.1 shows the layers that make up a CNN, including the classifier layer, the pooling layer, the activation layer, the convolution layer, and more. According to a study from 2020 [2], the activation function is then used to decide whether or not to excite the neuron. In order to learn and handle tasks that get harder, it adapts the information in a nonlinear way. A critical step in the process of extracting feature maps (Ebrahimighahnavieh, 2019 [5]) is the convolution layer, which passes a learnt filter or kernel of a certain size over the input picture. According to Sharma, 2017 [11], activation functions of the sigmoid, Tanh, and ReLU kinds can be used to make feature maps. Pooling layers reduce the size while retaining the most crucial components. According to Ebrahim, 2020 [6], they belong to the downscale group. Every neuron in one layer is linked to each neuron in the next layer in a fully connected layer. In the final phase, the classifier layer (Albawi [3]) selects the class (or label) with the highest probability.
To perform well on the classification assignment, a CNN must be able to handle large datasets (Han [7]). In terms of transfer learning (Gupta, 2017 [4]), CNN includes a variety of models that were trained on the ImageNet dataset (Singh [13]). The model's designer can adapt a pre-trained CNN model's parameters (i.e., weights) to the new task.
The Face and Eye Expressions Algorithm utilizing Convolutional Neural Networks for Image Classification is as follows. Image input: a collection of [height, width, channel] pixel values.
Feature Extraction
1. To obtain a feature map, use a convolutional neural network.
(a) Convolution (ReLU).
(i) Choose a kernel whose size is 4x4 and whose depth matches that of the input array.
(ii) Convolutional processing is used to get features of the facial and eye expression.
(b) Pooling, i.e. max pooling.
(i) After applying the dimensionality reduction procedure to reduce the feature map's spatial size, extract the 2x2 image.
2. To extract low-level features from the image, carry out the steps stated earlier until the fourth layer, altering the channel size to a value of 16, 32, 64, or 128.
Classification
1. The flattened output is fed to a feed-forward neural network with back propagation in each training phase iteration.
2. Using the SoftMax classification approach, a trained model is utilized to categorize photos such as face and eye expressions by identifying their dominating characteristics.
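For reference, the SoftMax step above maps the final-layer scores \(z_1, \dots, z_K\) to class probabilities in the standard way (a textbook definition, not spelled out in the chapter):

\[ P(\text{class } j \mid z) = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \qquad j = 1, \dots, K, \]

and the predicted label is the class with the highest probability.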
3. Experimental Result
The CNN has frequently been applied to facial recognition problems. The simplicity and speed of the CNN algorithm make it superior to other face-recognition systems. There were 100 photos of 10 people (ten images of each person) used in the CNN training. The face database is displayed in Fig. 1.3.

Fig. 1.3 Face database of images

Each grayscale image is transformed into an NxN pixel vector. As a result, the data matrix used to apply CNN has 100 columns, each of which contains an image. Estimate the mean image of face expressions, as illustrated in Fig. 1.4, excluding one image for testing.

Fig. 1.4 The average facial image across all photos, excluding the test photo

Next, compute the discrete matrix and use the CNN technique after subtracting the mean image from the data matrix. Calculate the discrete values and discrete matrix, since CNN uses the discrete values and feature vectors from feature extraction for recognition. Choose the feature vector and determine the feature value for each image based on the ten greatest discrete values. Subtract the mean picture from the test image and use CNN to compare the result to the feature data matrix in order to identify the test image. As seen in Fig. 1.5, the suggested technique will perfectly match the test image.

Fig. 1.5 Shows the query face (test face) and the matching face from the database, together with the number of CNNs that were used to make the best classification (CNN: Convolutional Neural Network)

The ADNI (Alzheimer's Disease Neuroimaging Initiative) database was used to analyze real fMRI data. ADNI investigators were involved in the paper's design, execution, and data provision even if their writing or analysis was not included in the final product.
The preprocessing and analysis of the fMRI AD data were done using MRIcron (version 2021), SPM (Statistical Parametric Mapping), and MATLAB (version 2021, Statistics Toolbox; MathWorks, Massachusetts). Preprocessing and model specification are the two phases that make up the SPM analysis.
The effects of cardiac and respiratory noise were removed from the functional data by using a low-pass filter option during preprocessing. The subject's signal was then analyzed using a generalized linear model. 100 smoothed, realigned, and normalized images were used in the model definition after preprocessing to look at the activation of fMRI data in SPM. Some fMRI experiment parameters or conditions were included in the model specification in order to determine the statistical significance of the brain data obtained from fMRI. Since the fMRI experiment did not include a unique task-related condition and we used resting-state fMRI data, we assumed a dummy contrast, signified as [1 0], in the results phase that follows, to measure brain activity. The brain regions that were active during the resting-state experiment are shown in Fig. 1.6.
The hippocampus was chosen as the region of interest for the classification of AD phases based on the brain's active voxels, as shown in Fig. 1.6.
In the process of identifying the phases of AD, three-dimensional grid clusters of hippocampi are used in the CNN algorithm, which classifies the grid similarly to the face images. The process is the same as mentioned for face recognition. The results are matched with the stages of AD shown in Fig. 1.7. Figure 1.8 shows the grids recognized by the algorithm, where the
RGB grids belong to MCI (Mild Cognitive Impairment), stage 1, and stage 2 of AD correspondingly. The screen plot depicts the CNN used for classification.

Fig. 1.6 The multislice view shows the activation of the hippocampus region in Alzheimer's disease (AD) stages, while MCI refers to mild cognitive impairment

Fig. 1.10 Displays nearly 10 stages of CNN used for the optimal classification of Alzheimer's disease stages, where CNN stands for Convolutional Neural Network
The experimental study compares RGB, YCbCr, and CNN in terms of accuracy, sensitivity, specificity, precision, and F1-score. Accuracy measures how well a model performs across all image datasets, while sensitivity measures how well it recognizes positive ResNet image samples. Specificity measures the model's capacity to predict true negatives in each category. Precision is determined by dividing the number of correctly identified positive samples by the total number of samples predicted as positive. A deep learning approach based on convolutional neural networks (CNNs) and trained on massive ResNet datasets of face and eye image expressions for Alzheimer's disease diagnosis is referred to as a machine learning technology. These networks consist of multiple layers of neurons, learning intricate, non-linear relationships between inputs and outputs due to their numerous parameters.

Fig. 1.7 The red area depicts the hippocampus in coronal, axial, and sagittal views, aiding in the classification of Alzheimer's disease stages

Fig. 1.8 The Montreal Neurological Institute (MNI) coordinate system has identified voxels with the distinguished stage of Alzheimer's disease
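In terms of the confusion-matrix counts, with TP, TN, FP, and FN denoting true positives, true negatives, false positives, and false negatives, the five reported measures have their standard definitions (a textbook statement, added here for reference):

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \text{Sensitivity} = \frac{TP}{TP + FN}, \quad \text{Specificity} = \frac{TN}{TN + FP}, \]
\[ \text{Precision} = \frac{TP}{TP + FP}, \quad F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}. \]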
Table 1.1 Comparison of the RGB, YCbCr, and CNN algorithms

Comparative Methods   Accuracy   Sensitivity   Specificity   Precision   F1-score
RGB                   0.80       0.81          0.80          0.81        0.81
YCbCr                 0.88       0.87          0.86          0.85        0.86
CNN                   0.98       0.98          0.98          0.98        0.99

The convolutional neural network performs best among the accuracy measurements for RGB, YCbCr, and CNN in the graphs.

Fig. 1.9 The proposed CNN algorithm recognizes red, green, and blue-colored voxels in a brain cutout, indicating different stages of Alzheimer's disease (AD) and mild cognitive impairment (MCI)
Fig. 1.11 Graph for comparing algorithms RGB, YCbCr and CNN
4. Discussion
An approach is made by using the CNN algorithm for the classification of images as per the stages of Alzheimer's disease, since the algorithm is used for detecting problems related to the recognition of faces. An artificial neural network is used in the classification and identification of the three-dimensional grid of the brain. Vector features from image processing are used in this paper. Diagnosing AD at an early stage and providing proper treatment is challenging, as the disease affects the brain cells. The hippocampus, a crucial brain region affected by Alzheimer's disease, was analyzed using fMRI data and SPM, achieving a 95% classification rate. The method uses voxels as independent variables and extracts feature vectors based on the largest discrete cosine transform matrix values for Alzheimer's disease classification. Fig. 1.7 explains the stages of AD and the face recognition approach used in the CNN algorithm. This algorithm helps in the diagnosis of AD at an early stage and provides scope for doctors to understand the patient's condition and provide better treatment.
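For reference, the two-dimensional discrete cosine transform whose largest coefficients serve as features is the standard one (the chapter does not spell it out); for an NxN image f(x, y):

\[ F(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\!\frac{(2x+1)u\pi}{2N}\, \cos\!\frac{(2y+1)v\pi}{2N}, \]

with \(\alpha(0) = \sqrt{1/N}\) and \(\alpha(k) = \sqrt{2/N}\) for \(k > 0\). Keeping the ten largest coefficients, as described above, yields the feature values used for classification.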
5. Conclusion
Deep learning, a machine learning technique using Convolutional Neural Networks, is trained on large face and eye expression datasets for identifying Alzheimer's disease through intricate, non-linear relationships between inputs and outputs. It is concluded that CNN achieves the most positive results. CNN is proposed for human face recognition using feature vectors from feature extraction. This method separates stages of Alzheimer's disease (AD) using fMRI data, using activated hippocampus voxels as input vectors and feature values chosen to cover maximum data variability. The CNN-based algorithm, based on fMRI data from patients' hippocampus voxels, offers the best classification rate for early AD diagnosis. This method, which uses a significant number of CNNs, accurately identifies and classifies early AD diagnoses, making it a valuable tool for early detection.

References
1. Alexandra Wolf, Kornkanok Tripanpitak, Satoshi Umeda, and Mihoko Otake-Matsuura. 2023. "Eye-Tracking Paradigms for the Assessment of Mild Cognitive Impairment: A Systematic Review." Frontiers in Psychology 14 (June). https://doi.org/10.3389/fpsyg.2023.1197567.
2. "Review of Activation Functions in Neural Networks." n.d. Accessed at https://www.geeksforgeeks.org/activation-functions-neural-networks/.
3. Albawi S, Mohammed TA, and Al-Zawi S. 2017. "Understanding of a Convolutional Neural Network." In International Conference on Engineering and Technology (ICET), 1–6. Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ICEngTechnol.2017.8308186.
4. Gupta, Dishashree. 2017. "Transfer Learning and the Art of Using Pre-Trained Models in Deep Learning." Analytics Vidhya, June 1, 2017. https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model/.
5. Ebrahimighahnavieh MA, and Chiong DR. 2019. "Deep Learning to Detect Alzheimer's Disease from Neuroimaging: A Systematic Literature Review." Computer Methods and Programs in Biomedicine, November, 105242. https://doi.org/10.1016/j.cmpb.2019.105242.
6. Ebrahim D, Ali-Eldin AM, Moustafa HE, and Arafat H. 2020. "Alzheimer Disease Early Detection Using Convolutional Neural Networks." In 15th International Conference on Computer Engineering and Systems (ICCES), 1–16. Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ICCES51560.2020.9334594.
7. Dongmei, Han, Liu Qigang, and Fan Weiguo. 2018. "A New Image Classification Method Using CNN Transfer Learning and Web Data Augmentation." Expert Systems with Applications 95 (April): 43–56. https://doi.org/10.1016/j.eswa.2017.11.028.
8. Jinglin Sun, Yu Liu, Hao Wu, Peiguang Jing, and Yong Ji. 2022. "A Novel Deep Learning Approach for Diagnosing Alzheimer's Disease Based on Eye-Tracking Data." Frontiers in Human Neuroscience 16 (September). https://doi.org/10.3389/fnhum.2022.972773.
9. Jonathan S. Talahua, Jorge Buele, P. Calvopiña, and José Varela-Aldás. 2021. "Facial Recognition System for People with and without Face Mask in Times of the COVID-19 Pandemic." Sustainability 13 (12). https://doi.org/10.3390/su13126900.
10. Odusami, Modupe, Rytis Maskeliunas, Robertas Damasevicius, and Tomas Krilavicius. 2021. "Analysis of Features of Alzheimer's Disease: Detection of Early Stage from Functional Brain Changes in Magnetic Resonance Images Using a Finetuned ResNet18 Network." Diagnostics 11 (6): 1071. https://doi.org/10.3390/diagnostics11061071.
11. Gupta, P, N Saxena, M Sharma, and J Tripathi. 2018. "Deep Neural Network for Human Face Recognition." International Journal of Engineering and Manufacturing 8 (1): 63–71. 10.5815/ijem.2018.
12. Sharma, S, and Sharma S. 2017. "Activation Functions in Neural Networks." Towards Data Science 6 (12): 310–16.
13. Reddy Navya, Ramisetty Upendra. 2023. "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms." Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3.
14. Singh SP, Wang L, Gupta S, Goli H, Padmanabhan P, and Gulyas B. 2020. "3D Deep Learning on Medical Images: A Review." Sensors 20 (18): 5097. https://doi.org/10.3390/s20185097.
15. Duan, Y, J Lu, and J Zhou. 2019. "Learning Deep Equidistributed Representation for Face Recognition." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3415–24.

Note: All figures and the table in this chapter were designed by the author.
2. Self-caring Autonomous Medicinal and Aromatic Plants (MAP) Nursery Using Arduino Microcontroller

Gidla Sudheer Babu
BVC College of Engineering, Odalarevu, East Godavari
A. V. S. S. Varma
SRKR Engineering College, Bhimavaram, West Godavari
B. V. Ramana
BVC Engineering College, Batlapalem, East Godavari
Srilali Siragam*
Swarnandhra College of Engineering and Technology, Narsapuram, West Godavari

Abstract: Medicinal and aromatic plants (MAPs) are botanical raw materials, sometimes known as herbal pharmaceuticals,
that are generally utilized as ingredients in cosmetics, health, and medicinal products, as well as other natural health products
for therapeutic, aromatic, and/or culinary purposes. A nursery’s objective is to produce seedlings that are grown in optimum
conditions until they are ready for planting. MAPs make up a considerable component of natural vegetation, and all nurseries
fundamentally try to produce and offer enough high-quality seedlings to suit consumer demand. The primary goal of this
research is to use Arduino to maintain the MAP Nursery. MAPs require highly specific circumstances to flourish. Specific
accurate measurements of temperature, humidity, soil moisture, and sunshine must be kept in the MAP Nursery. Using the
Arduino, we can monitor all of these parameters within the plant to ensure that they are within the needed range for the
healthy growth of MAPs. Numerous MAP species are in high demand for both home use and commercial use in the herbal
sector. Resources for MAPs were abundant in forests, but anthropogenic pressure is rapidly destroying forests. In this article,
we looked at five distinct plant species that flourish under diverse atmospheric conditions: Lemon Grass, Basil, Aloe-Vera,
Rosemary, and Ashwagandha. An Arduino Microcontroller, combined with sensors and actuators, is utilized to keep the
appropriate circumstances for their growth.
Keywords: Medicinal and aromatic plants, Arduino, Nurseries, Microcontroller, Lemon grass

*Corresponding author: srilalisep7@gmail.com
DOI: 10.1201/9781003529231-2

1. Introduction
A nursery's objective is to produce seedlings that are grown in optimum conditions until they are ready for planting. The basic purpose of all nurseries is to produce and provide enough high-quality seedlings to meet client demand [1, 2]. MAPs constitute a significant percentage of the natural vegetation. Numerous MAP species are in high demand for both home use and commercial use in the herbal sector. Resources for MAPs were abundant in forests, but anthropogenic pressure is rapidly destroying forests. As a result, growing healing plants in agricultural areas may be a straightforward technique of obtaining progressively basic components. The relevance of cultivating medicinal and aromatic plants is fast increasing due to the detrimental effects of chemical and artificial medicines, which is raising awareness among people all
over the world [3, 4]. It is becoming increasingly important to preserve and improve indigenous, medicinal, and fragrant plant species. The following is a list of the model nursery's key MAP goals: to supply farmers with high-quality, genuine planting materials; to educate the people about the medicinal qualities of fragrant herbs; to provide bio-resources for research into medicinal and aromatic plants; conservation of rare and endangered medicinal plants in their natural habitat [5-8]; to encourage growers and farmers to cultivate therapeutic plants; and plant multiplication and propagation for medicinal purposes.

Fig. 2.1 Flow diagram of the self-caring autonomous medicinal and aromatic plants (MAP) nursery using Arduino microcontroller

When growing plants, there are various characteristics to consider, and these might change depending on the plant species and growing conditions. Here are some general parameters to consider:
(i) Light: Photosynthesis, the process of transforming light energy into chemical energy to drive growth, is required by plants. Because different plants have varying light requirements, it is critical to deliver the right amount and intensity of light for our plants. Some plants demand direct sunlight, while others prefer partial or complete shade.
(ii) Water: Water is necessary for plant growth, and different plants have varying water needs. The amount of water required is influenced by factors such as soil type, humidity, temperature, and plant size. Overwatering and underwatering can both be detrimental to plants, so it's critical to monitor soil moisture and adjust watering as needed.
(iii) Soil: The kind and quality of soil can have a significant impact on plant growth. Plants require varying soil types, pH levels, and nutritional content. It is critical to select soil that is suited for the plants we are cultivating and that is well-draining and nutrient-rich.
(iv) Temperature: Temperature can also influence plant growth, and various plants require different temperatures. Some plants prefer cooler temperatures, while others flourish in higher temperatures. Maintaining the proper temperature range for your plants is critical for good growth.
(v) Humidity: Humidity refers to the amount of moisture in the air, which can affect plant growth. Some plants require high levels of humidity, while others may withstand lower levels. To guarantee optimal growing conditions, monitor the humidity in your growing area and adjust as appropriate.
(vi) Nutrients: Plants require nutrients like nitrogen, phosphorus, and potassium to flourish. These nutrients can be supplied through fertilizers or other additions, and it is critical to ensure that your plants are getting the right nutrients in the right proportions [9].
We can help guarantee that our plants are developing in optimal conditions and promote healthy growth by monitoring and modifying these conditions as needed.
(a) Planting: Cultivating medicinal plants in a nursery is comparable to cultivating other kinds of plants, but there are some differences. The following are some general actions that may be taken while growing medicinal plants in a nursery.
(b) Select appropriate species: The first step in cultivating medicinal plants is to select a species that will flourish in a nursery environment while also providing the needed medical characteristics. Among the prominent therapeutic plants planted in nurseries are aloe vera, lavender, chamomile, Echinacea, and ginseng.
(c) Seed collection: Once the species has been determined, seeds or other propagation materials must be collected. For some medicinal plants, seeds may be the best alternative, while for others, cuttings, bulbs, or rhizomes may be employed.
(d) Soil preparation: The nursery soil must be prepared in order to give the best growing circumstances for the selected species. Medicinal plants frequently prefer well-draining soils heavy in organic matter. The pH
of the soil should also be evaluated and, if necessary, adjusted to ensure that it falls within the suitable range for the selected species.
(e) Seed propagation: When the soil is ready, seeds or propagation materials are placed in soil-filled containers. To enhance germination and root development, the soil must be kept moist but not soggy. As seedlings grow, they must be checked and relocated to larger containers.
(f) Fertilization: Specific nutrients may be required for medicinal plants to develop properly and generate the needed medicinal characteristics. Fertilizers can be added to the soil to supply nutrients and keep plants healthy. Medicinal plants, like any other crop, are susceptible to pests and illnesses.
(g) Pest and disease control: Pests and illnesses should be recognized early and treated promptly to reduce the risk of plant damage, such as by employing organic pest control methods.
(h) Harvesting: When the plants are ripe and ready to be harvested, they can be cut and prepared for use in a variety of medical applications [10].
These are only a few of the general procedures for producing medicinal plants in a nursery. The specific approaches and considerations will differ depending on the plant type and the nursery's goals.
The growth of technology has permitted the creation of smart and automated systems for a variety of uses, including agriculture. Maintaining ideal temperature, humidity, and soil moisture levels is critical for plant health and growth in the field of plant cultivation. Arduino, an open-source microcontroller platform, has grown in popularity as a low-cost and versatile alternative for building automated systems. This literature review intends to investigate existing research and projects that use Arduino to control temperature, humidity, and soil moisture in plant cultivation. Some authors have reported numerous plant-related results using the Arduino microcontroller. Smith et al. investigate the use of Arduino-based sensors and actuators to monitor and control temperature, humidity, and soil moisture in a greenhouse environment. They show how Arduino may be used to maintain ideal growing conditions, resulting in increased plant growth and productivity [11]. Johnson et al. demonstrate an Arduino-based automated plant watering system that includes sensors to assess soil moisture levels and actuators to control water flow. They illustrate the viability of employing Arduino for efficient irrigation management, ensuring plants receive appropriate moisture while avoiding overwatering [12]. Brown et al. investigate the application of Arduino-based wireless sensor networks (WSNs) in agricultural situations to monitor temperature, humidity, and soil moisture. They highlight the advantages of real-time data collection and remote control, which allow farmers to make informed decisions and modify environmental conditions as needed [13]. Lee et al. report an automated greenhouse control system based on Arduino for temperature, humidity, and soil moisture adjustment. They concentrate on establishing feedback control algorithms in order to maintain optimal growing conditions and achieve energy efficiency. They claim that the testing findings illustrate the system's usefulness in enhancing plant growth and lowering energy consumption [14]. Gupta et al. present an Arduino-based smart irrigation system that uses soil moisture sensors, weather data, and real-time feedback to optimize irrigation scheduling. They emphasize the significance of precision agriculture approaches in conserving water while increasing crop yields [15].
Finally, this study reveals the extensive usage of Arduino-based devices in plant cultivation to control temperature, humidity, and soil moisture. These examples show the potential and utility of utilizing Arduino as a platform for constructing smart, automated agricultural solutions. Real-time monitoring, data-driven decision-making, and enhanced resource management are made possible by the integration of sensors, actuators, and wireless communication. Further research and development in this subject have the potential to improve agricultural practices, increase crop productivity, and contribute to more sustainable farming approaches.

2. Proposed Work
The suggested system is depicted in Fig. 2.1. An Arduino board is linked to humidity, temperature, and soil moisture sensors, which continuously monitor the conditions of the MAP nursery. There will be 5 separate sensor sets for 5 different plant groups. Each group must be kept within a certain range of humidity, temperature, and soil moisture levels. When the humidity level drops, the humidifier will use Arduino to supply enough humidity to the plant group. When the Arduino detects a low humidity level, it activates the humidifier, and it will turn off the humidifier after the appropriate level has been attained. Similarly, if high humidity is detected
during the rainy season, the Arduino will automatically activate the dehumidifier to maintain the correct level. Similarly, in the case of temperature, two systems, namely a cooling system and a ventilation system, are available to maintain correct temperature levels. For proper growth, the plants must be kept at a constant water level. This can be accomplished with a soil moisture sensor. When we detect dry soil, we activate the pump motor, which automatically supplies water to the soil through the Arduino. To correctly maintain the nursery, this entire system requires a set of humidity, temperature, and soil moisture sensors.

Fig. 2.2 Block diagram of the Arduino microcontroller

For controlling the above parameters with an Arduino, we evaluated five plants: lemon grass, basil, aloe vera, rosemary, and ashwagandha. Their temperature, humidity, and soil moisture levels should be as follows in order for the plants to grow well. The corresponding images are shown in Fig. 2.3.

Fig. 2.3 Five plants: (a) lemon grass, (b) basil, (c) aloe vera, (d) rosemary, and (e) ashwagandha

1. Lemon grass: Lemongrass (Cymbopogon citratus) is a tropical plant that thrives in warm, humid environments.
(a) Temperature: Lemongrass grows well in temperatures ranging from 20°C to 35°C (68°F to 95°F), with an optimal temperature of around 25°C to 30°C (77°F to 86°F). Temperatures of less than 15°C (59°F) or greater than 40°C (104°F) might stress or damage the plant. Lemongrass prefers high levels of humidity, preferably between 70% and 85%. It can, however, survive lower humidity levels, if necessary, as long as it receives enough water and the temperature is within the proper range.
(b) Humidity: To keep humidity levels stable, spray the plants on a frequent basis or use a humidifier in the growth area. Lemongrass can be grown both outdoors in warm, humid areas and indoors in a greenhouse or other controlled environment. If we're growing lemongrass inside, we might need to enhance humidity by placing a tray of water near the plants or using a humidifier to keep the proper levels. Furthermore, adequate air circulation can help prevent disease and sustain healthy plant growth.
(c) Soil Moisture: Lemongrass grows well in well-drained soil that is continuously damp but not wet. Here are some general suggestions for cultivating lemongrass in different soil moisture ranges:
(i) At planting: The soil should be moist but not soaked while planting lemongrass. This will aid in the establishment of the plant's roots and decrease transplant shock.
(ii) After planting: After planting, the soil should be kept constantly moist for the first several weeks to support root growth and establishment. This can be accomplished by
deeply watering the plant once or twice a week, depending on the weather circumstances.
(iii) During the growing season: Lemongrass should be watered on a regular basis to keep the soil constantly moist. Watering frequency will vary depending on the weather, but it is normally recommended to water the plant deeply once a week, or more frequently if the weather is hot and dry.
(iv) Winter dormancy: Watering should be reduced throughout the winter months when the plant is dormant to avoid waterlogging the soil. The soil should be allowed to dry out slightly between waterings, but not fully dry out. It is crucial to note that the particular soil moisture requirements for lemongrass vary based on climate, soil type, and other growing factors. It's always a good idea to keep an eye on the soil moisture levels and adjust watering as needed to guarantee the plant's best growth and health.
2. Basil: Basil (Ocimum basilicum) is a culinary herb that comes in a variety of varietals with varying growing requirements. In general, however, basil enjoys warm temperatures and moderate humidity levels. The following are some common temperature and humidity ranges for basil cultivation:
(a) Temperature: Basil thrives in warm climates, preferring temperatures ranging from 18°C to 27°C (65°F to 80°F). Basil can withstand temperatures of up to 35°C (95°F), although it may begin to exhibit indications of stress or limit its growth. Temperatures below 10°C (50°F) can cause plant damage or death.
(b) Humidity: Basil enjoys humidity levels ranging from 40% to 60%. Because high humidity can promote fungal illnesses, it's critical to allow adequate air circulation and prevent crowding the plants. In dry or arid locations, we may need to spritz the plants or use a humidifier to enhance humidity levels. It's worth noting that different basil kinds may have slightly varied temperature and humidity requirements. Some kinds, for example Thai basil, may prefer slightly higher temperatures and humidity levels. It's always a good idea to do some study on the variety of basil you're growing to ensure you're providing optimal growing conditions for maximum development and output.
(c) Soil Moisture: Basil plants demand well-drained soil that is regularly moist. Overwatering can cause root rot and other problems, while letting the soil dry out too much can produce wilting leaves and stress in the plant. As a general rule of thumb, water basil plants when the top inch of soil feels dry to the touch. Depending on temperature, humidity, and soil type, this could imply watering every few days or once a week. To see if the soil is moist enough, use a soil moisture meter or simply stick your finger in up to the second knuckle. If the soil feels damp but not soggy, it is likely at the proper moisture level for basil. It is time to water if it seems dry. It's also worth noting that basil plants can benefit from a layer of mulch around the base of the plant, such as straw or leaves. This can help to keep the soil moist and keep it from drying out too rapidly.
3. Aloe Vera: Aloe vera is a succulent plant native to Africa's hot, dry climates. It can thrive in a variety of situations and is well-adapted to high temperatures and low humidity.
(a) Temperature: Temperatures between 20°C and 30°C (68°F and 86°F) are suitable for growing aloe vera. Temperatures less than 10°C (50°F) or greater than 35°C (95°F) might stress or damage the plant. Although aloe vera may endure limited periods of cold temperatures, it is preferable to avoid exposing the plant to cold temperatures for extended periods of time.
(b) Humidity: Aloe vera enjoys low humidity levels and can endure arid air. Aloe vera grows best in humidity levels ranging from 30% to 50%. However, if the humidity level is too high, it might raise the danger of fungal illnesses and root rot. If the air is excessively dry, the plant may benefit from misting or placing a tray of water near it to enhance humidity levels. It's crucial to remember that aloe vera is prone to frost damage, so if you're growing it outside in a milder climate, bring it inside or cover it during frost or freezing weather.
(c) Soil Moisture: Aloe vera plants grow best in well-draining soil that is maintained slightly damp but not soggy. Allowing the soil to dry between waterings is critical to preventing root rot and other problems. Here are some broad suggestions for aloe vera soil moisture ranges: When watering aloe vera, keep the soil evenly moist but not wet. Allow the soil to dry slightly between waterings, but not completely dry. Overwatering is a major concern with aloe vera, so avoid watering it too regularly. Aloe vera prefers soil moisture levels between 50 and 70% of field capacity. This means the soil should be damp
but not wet. Insert your finger up to the second knuckle into the soil to assess the moisture level. If the soil at that depth feels dry, it's time to water the plant. Aloe vera plants, in general, require less water throughout the winter months when they are dormant and more water during the growing season in the spring and summer. It's also critical to make sure the soil has sufficient drainage to keep water from accumulating around the roots. A well-draining soil mix with sand or perlite can aid in drainage. Remember that the appropriate soil moisture range for aloe vera might vary depending on factors such as temperature, humidity, plant size, and container size. As a result, it is critical to routinely monitor the plant's soil moisture levels and modify watering as needed to ensure that the soil stays within the acceptable moisture range.
4. Rosemary: Rosemary (Rosmarinus officinalis) is a perennial herb native to the Mediterranean region.
(a) Temperature: It thrives in warm, sunny, and dry conditions and tolerates a wide variety of temperatures and humidity levels. Growing rosemary requires a temperature range of 15°C to 30°C (59°F to 86°F), with an optimal temperature of around 20°C to 24°C (68°F to 75°F). Temperatures less than 10°C (50°F) or greater than 35°C (95°F) might stress or damage the plant.
(b) Humidity: In terms of humidity, rosemary favors lower humidity levels ranging from 30% to 50%. High humidity levels can cause fungal illnesses and other problems, therefore it's critical to promote proper air circulation and prevent overcrowding the plants. If we're growing rosemary inside or in a greenhouse, we may control humidity by opening vents or using a dehumidifier if necessary. It should also be noted that rosemary is drought-tolerant and favors well-drained soils. Overwatering can cause root rot and other problems, so water the plant thoroughly but seldom, allowing the soil to dry somewhat between waterings.
(c) Soil Moisture: Rosemary is a drought-tolerant herb that likes well-drained soil with a modest amount of rainfall. Here are some broad suggestions for growing rosemary in different soil moisture ranges:
(i) During the establishment phase: It is critical to keep the soil continually moist when planting rosemary until the plant has established itself. This usually takes around 2-3 months. The soil should be kept uniformly moist but not waterlogged throughout this period.
(ii) During the growing season: Once established, rosemary prefers a drier soil. In general, the soil should be allowed to partially dry before being watered again. However, it is critical not to allow the soil to dry out completely, as this might stress the plant and restrict growth and output. During the growing season, the soil moisture range for growing rosemary is approximately 25-50% of the soil's maximum water-holding capacity.
(iii) During the winter months: In locations with cold winters, rosemary may go into hibernation, requiring less water. The soil should be allowed to dry out more than usual during this period, but not completely. Growing rosemary in the winter requires a soil moisture range of 10-25% of the soil's maximum water-holding capacity. It's crucial to note that these are only guidelines; the ideal soil moisture range for producing rosemary will vary depending on elements including soil type, climate, and growing conditions. It is critical to routinely monitor soil moisture levels and adjust watering as needed to promote optimal growth and yield.
5. Ashwagandha: Ashwagandha (Withania somnifera) is a tropical plant native to India that thrives in warm, humid climates.
(a) Temperature: Ashwagandha grows well in temperatures ranging from 20°C to 30°C (68°F to 86°F), with an optimal temperature of around 25°C (77°F). Temperatures less than 15°C (59°F) or greater than 35°C (95°F) might stress or damage the plant.
(b) Humidity: In terms of humidity, Ashwagandha favors moderate to high humidity levels ranging from 50% to 85%. It can, however, survive lower humidity levels, if necessary, as long as it receives enough water and the temperature is within the proper range. To keep humidity levels stable, spray the plants on a frequent basis or use a humidifier in the growth area. It is important to note that Ashwagandha can be cultivated both outside in warm, humid conditions and indoors in a greenhouse or other controlled environment. If we're growing Ashwagandha indoors, we might need to enhance humidity by placing a tray of water near the plants or using a humidifier to keep the correct levels.
Furthermore, adequate air circulation can help prevent illness and sustain healthy plant growth.
(c) Soil Moisture: Ashwagandha (Withania somnifera) is a medicinal plant that grows well in warm, dry areas with well-drained soil. The soil moisture requirements for ashwagandha will vary depending on the growing conditions and stage of development. Here are some broad suggestions for ashwagandha soil moisture levels:
(i) During seed germination: To germinate effectively, ashwagandha seeds require continuous moisture. The soil should be kept moist but not saturated. Overwatering can cause fungal illnesses as well as damping-off.
(ii) During vegetative growth: As ashwagandha grows up, it prefers slightly drier soil conditions. Allow the soil to dry out slightly between waterings, but not so much that the plant wilts. Overwatering at this point can result in root rot.
(iii) During flowering and fruiting: The plant may demand slightly more water when it begins to produce flowers and fruit. However, it is still critical to avoid flooding the soil. The soil should be allowed to dry slightly between waterings, but the plant should not wilt. Ashwagandha grows best on well-drained soil with moderate moisture levels. It is critical to routinely monitor soil moisture levels and adjust watering as needed to avoid both overwatering and underwatering.
Table 2.1 displays the temperature and humidity levels of all five plants.

Table 2.1 Temperature and humidity levels of the 5 plants

S. No   Plant          Temperature (°C)      Humidity (%)
                       Low       High        Low       High
1       Lemon Grass    20        35          70        85
2       Basil          18        27          40        60
3       Aloe Vera      20        30          30        50
4       Rosemary       15        30          30        50
5       Ashwagandha    20        30          50        85

3. Choice of Components

3.1 Motor Pump
A variety of motor pumps are used in agriculture to help with irrigation, water delivery, and other fluid transfer demands. The motor pump used is determined by elements such as the supply of water, the required flow rate, the required pressure, and the specific requirements of the agricultural application. Here are some examples of common agricultural motor pumps:
• Centrifugal Pumps
• Submersible Pumps
• Jet Pumps
• Diaphragm Pumps
• Piston Pumps
• Gear Pumps
• Rotary Vane Pumps
• Solar-Powered Pumps
• Diesel or Gasoline Engine Pumps
• Electric Turbine Pumps
• Hydraulic Pumps
The pump selected is determined by the unique needs of the agricultural activity, such as the water source, needed flow rate, pressure, and energy source. To ensure effective water usage and a successful agricultural operation, it is critical to select the correct pump for the job. A pump's power consumption is determined by various factors, including the pump's design, size, flow rate, pressure requirements, and motor efficiency.
Solar-Powered Pumps: Because they use renewable energy, solar-powered pumps can be extremely energy-efficient. Their efficiency is determined by the capacity of the solar panels and the design of the pump.

3.2 Humidifier
Humidifiers are devices that add moisture to the air, boosting indoor humidity levels. They are typically employed to fight dry air, which can be generated by variables such as heating systems, climate, or air conditioning. Here are some of the most prevalent types of humidifiers:
• Ultrasonic Humidifiers
• Cool Mist Ultrasonic Humidifiers
• Warm Mist Ultrasonic Humidifiers
• Evaporative Humidifiers
• Steam Vaporizers
• Impeller Humidifiers
• Central Humidifiers
• Ultraviolet (UV) Humidifiers
• Aerosol Humidifiers
• Travel Humidifiers
Cool mist ultrasonic humidifiers are widely regarded as among the most energy-efficient humidifiers. They use ultrasonic vibrations to create a fine mist, and their operation is both quiet and power-efficient. They are an excellent solution for managing interior humidity levels while minimising the impact on the energy cost. To save energy, choose a suitably sized humidifier for the room we wish to humidify and use it only when necessary to maintain acceptable humidity levels.
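Since every actuator decision in the proposed system ultimately compares a reading against the bands in Table 2.1 above, those bands can be encoded directly in the firmware. The structure below is a hedged sketch of one way to do that; the chapter does not show this code, and soil set-points are omitted because Table 2.1 lists only temperature and humidity:

// A sketch of per-plant target bands taken from Table 2.1.
struct PlantProfile {
  const char *name;
  float tempLow, tempHigh;   // degrees C, from Table 2.1
  float humLow,  humHigh;    // percent relative humidity, from Table 2.1
};

const PlantProfile PROFILES[5] = {
  {"Lemon Grass", 20, 35, 70, 85},
  {"Basil",       18, 27, 40, 60},
  {"Aloe Vera",   20, 30, 30, 50},
  {"Rosemary",    15, 30, 30, 50},
  {"Ashwagandha", 20, 30, 50, 85},
};

// Example predicates each of the five sensor groups could use
// when deciding whether to switch its relays:
bool tooCold (const PlantProfile &p, float t) { return t < p.tempLow;  }
bool tooHot  (const PlantProfile &p, float t) { return t > p.tempHigh; }
bool tooDry  (const PlantProfile &p, float h) { return h < p.humLow;   }
bool tooMoist(const PlantProfile &p, float h) { return h > p.humHigh;  }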
3.3 Solenoid
A solenoid valve is an electromechanical device that controls the flow of fluids, such as liquids or gases, through a system or a pipeline. It works by employing an electromagnetic coil to move a plunger or piston, which opens or closes a valve mechanism. A solenoid valve's key components include:
• Solenoid Coil
• Plunger or Piston
• Valve Mechanism
Solenoid valves come in various types:
• Normally Closed (NC) Solenoid Valve
• Normally Open (NO) Solenoid Valve
• Direct-acting Solenoid Valve
• Pilot-operated Solenoid Valve
Solenoid valves are used to control the flow of water in irrigation systems, among other things.

4. Methodology
The plant species mentioned above, namely Lemon Grass, Basil, Aloe Vera, Rosemary, and Ashwagandha, are grown under various atmospheric circumstances. An Arduino Microcontroller, combined with sensors and actuators, is utilized to keep the appropriate circumstances for their growth. The circuit diagram (Fig. 2.4) depicts the Arduino Microcontroller, as well as the sensors and actuators.

Fig. 2.4 Circuit diagram of the self-caring autonomous medicinal and aromatic plants nursery using Arduino Microcontroller

The Arduino board has 14 digital pins (0 to 13) and 6 analogue pins (A0 to A5) for connecting sensors and actuators. We can link the temperature sensor (DHT22) to the digital pins, and we can also read humidity data using the same sensor. This means that 5 temperature and humidity sensors can be accommodated by the Arduino. Soil moisture sensors, on the other hand, can be attached to either analogue or digital pins. Because the actuators we are employing, such as the cooling fan, heat lamp, humidifier, dehumidifier, and pump motor, require more voltage than 5 V, we have connected them via a relay switch. By setting and resetting the digital pins, we may control all of the actuators (Fig. 2.5).

Fig. 2.5 Actuator connection

The process begins with the sensors measuring the temperature, humidity, and soil moisture levels of the plants. The sensed values are temporarily saved in separate variables in the Arduino. If the values are within the predefined limits, the relays remain off in the programme. If the values surpass the limits, the relevant relay attached to the actuator goes ON, allowing the actuator to activate. If the temperature value exceeds the set limit, for example, the relay connected to the cooling fan will turn on, allowing the cooling fan to reduce the temperature. If the temperature falls below the set point, the Arduino will activate the relay linked to the heat lamp. When the temperature reaches the specified level, the relay turns off. Similar actions are taken for the humidifier and dehumidifier to increase and decrease humidity, and for the water pump motor to correct the soil moisture value. The sensors update the parameter values every second. This procedure is repeated continuously.

Fig. 2.6 Flow chart of the methodology

Multiple parameter adjustments may occur at times. In this scenario, multiple actuators are used to normalize the values. Figure 2.6 depicts the overall mechanism of the Self-caring Autonomous Medicinal & Aromatic Plants (MAP) nursery using an Arduino Microcontroller.
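To make the control loop concrete, the following minimal sketch shows one way the logic just described could be written for a single plant group. This is a hedged illustration rather than the chapter's actual firmware: the pin assignments, the active-HIGH relay wiring, and the raw soil-moisture threshold are assumptions, the temperature and humidity band follows the Lemon Grass row of Table 2.1, and it uses the widely available Adafruit DHT sensor library.

#include <DHT.h>

// Assumed wiring: DHT22 data -> D2; soil moisture sensor -> A0;
// active-HIGH relays: cooling fan D7, heat lamp D8, humidifier D9,
// dehumidifier D10, pump motor D11.
const uint8_t DHT_PIN = 2;
const uint8_t FAN = 7, LAMP = 8, HUMIDIFIER = 9, DEHUMIDIFIER = 10, PUMP = 11;

// Lemon Grass band from Table 2.1; SOIL_DRY is an assumed raw ADC value.
const float TEMP_LOW = 20.0, TEMP_HIGH = 35.0;  // degrees C
const float HUM_LOW  = 70.0, HUM_HIGH  = 85.0;  // percent RH
const int   SOIL_DRY = 400;

DHT dht(DHT_PIN, DHT22);

void setup() {
  Serial.begin(9600);
  dht.begin();
  const uint8_t outputs[] = {FAN, LAMP, HUMIDIFIER, DEHUMIDIFIER, PUMP};
  for (uint8_t i = 0; i < sizeof(outputs); i++) pinMode(outputs[i], OUTPUT);
}

void loop() {
  float t    = dht.readTemperature();  // degrees C
  float h    = dht.readHumidity();     // percent relative humidity
  int   soil = analogRead(A0);         // lower value = drier (sensor-dependent)

  if (!isnan(t)) {                     // a relay goes ON only outside the band
    digitalWrite(FAN,  t > TEMP_HIGH ? HIGH : LOW);
    digitalWrite(LAMP, t < TEMP_LOW  ? HIGH : LOW);
  }
  if (!isnan(h)) {
    digitalWrite(HUMIDIFIER,   h < HUM_LOW  ? HIGH : LOW);
    digitalWrite(DEHUMIDIFIER, h > HUM_HIGH ? HIGH : LOW);
  }
  digitalWrite(PUMP, soil < SOIL_DRY ? HIGH : LOW);

  Serial.print("T=");     Serial.print(t);
  Serial.print(" RH=");   Serial.print(h);
  Serial.print(" Soil="); Serial.println(soil);
  delay(1000);                         // re-read the sensors every second
}

Scaling this to the five plant groups amounts to repeating the read-and-compare step per group, each with its own band from Table 2.1.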
Applications of Arduino for controlling Temperature: Depending on the design and demands, there are a few different ways to raise or lower the temperature in a plant nursery. Here are some general guidelines:
(a) Adjust the thermostat: If the nursery has one, we can use it to raise or lower the temperature as needed. For example, if the temperature is too low, we can activate the heating system by turning up the thermostat.
(b) Use heating or cooling systems to adjust the temperature: Depending on the climate and the time of year, we may need to use heating or cooling systems to regulate the temperature. For example, in colder months, we could use a heater or furnace to raise the temperature, while in hotter months, we could use an air conditioning unit to lower the temperature.
(c) Provide shading: If our nursery is in a sunny location, we may need to give shade to keep the plants cool. This may entail employing shade cloth or other materials to decrease the amount of direct sunlight reaching the plants.
(d) Use ventilation: By moving air and reducing heat buildup in specific regions, ventilation can assist in regulating temperature. To do this, we could employ fans, vents, or other ventilation devices.
(e) Insulate the nursery: If our nursery is not well-insulated, heat can escape in the colder months or infiltrate in the warm months, making temperature regulation difficult. Insulating the walls, ceiling, and/or floor can help to keep the temperature steady. It is critical to continually monitor the temperature and make adjustments as needed to ensure that the plants are developing in optimal conditions. Arduino can be used to regulate a variety of electronic components that can aid in raising the temperature in a controlled environment. Here are some examples of how Arduino can be used to raise the temperature:
(i) Use a heating element: To raise the temperature, Arduino can be programmed to operate a heating element such as a resistor or a heating pad. The heating element can be linked to a relay module, which the Arduino can regulate.
(ii) Control a fan: Warm air can be circulated in a room or enclosure using a fan. Based on temperature readings from a sensor, such as a thermistor or a temperature sensor, Arduino can be used to control the speed of the fan.
(iii) Use a heat lamp: A heat lamp can be used to provide concentrated heat in one location. Based on temperature readings from a sensor, Arduino can be programmed to control the on/off time of the heat lamp.
(iv) Use a Peltier module: Depending on the direction of the electric current, a Peltier module can be utilized to cool one side and heat the other. To raise the temperature, Arduino can be used to alter the direction of the electric current.
(v) Control a heating system: Based on temperature data from sensors installed in various areas, Arduino can be used to manage a heating system, such as a central heating system or a space heater. It is critical to remember that increasing the temperature with Arduino necessitates proper safety precautions in order to avoid any harm or risks. Furthermore, the specific components and programming required will vary depending on the application and requirements. In the proposed system, a cooling fan has been incorporated for controlling high temperature, and a heat lamp is incorporated as the heating element.
Applications of Arduino for controlling Humidity: There are several ways an Arduino could be used to raise humidity levels in a certain location. Here's a high-level overview of one approach:
(a) Gather needed materials: You will need an Arduino board, a humidity sensor, a humidifier, and some wires.
(b) Connect the humidity sensor: Use wires to connect the humidity sensor to the Arduino board. To attach the sensor correctly, make sure to follow the instructions included with it.
(c) Write the code: Write a program in Arduino that reads the humidity sensor and sends a signal to the humidifier to turn on or off depending on the current humidity level. If the humidity is too low, for example, the Arduino can activate the humidifier, releasing moisture into the air. (A minimal sketch of this step is given after this overview.)
(d) Test and adjust: Run the setup and make any necessary changes to the code to verify that the humidifier turns on and off according to the humidity levels.
(e) Fine-tune: To fine-tune the system, you could add additional sensors or use more complex code to operate the humidifier based on other factors such as temperature or time of day.
It is crucial to note that there are numerous approaches to controlling humidity levels using an Arduino, and the specific strategy will depend on our personal setup and demands. Furthermore, if we are unfamiliar with electronics or programming, it may be beneficial to seek advice from a professional or experienced enthusiast.
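As a hedged elaboration of step (c) above, the fragment below adds a small deadband around the set-point so the humidifier relay does not chatter when the reading hovers near the threshold. The 2% margin, the 60% set-point, and the pin numbers are illustrative assumptions, not values from the chapter.

#include <DHT.h>

DHT dht(2, DHT22);                   // humidity sensor assumed on pin D2
const uint8_t HUMIDIFIER_RELAY = 9;  // assumed relay pin
const float TARGET_RH = 60.0;        // illustrative set-point (% RH)
const float MARGIN_RH = 2.0;         // deadband half-width

bool humidifierOn = false;

void setup() {
  dht.begin();
  pinMode(HUMIDIFIER_RELAY, OUTPUT);
}

void loop() {
  float rh = dht.readHumidity();
  if (!isnan(rh)) {
    if (rh < TARGET_RH - MARGIN_RH) humidifierOn = true;   // too dry: switch on
    if (rh > TARGET_RH + MARGIN_RH) humidifierOn = false;  // moist enough: switch off
    digitalWrite(HUMIDIFIER_RELAY, humidifierOn ? HIGH : LOW);
  }
  delay(2000);  // the DHT22 needs roughly two seconds between reads
}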
(a) Humidity-Plant: We can use the following strategies to enhance or decrease humidity in a plant nursery:
(i) Use a humidifier: A humidifier is a device that raises the humidity in a room. Set your humidifier to the ideal humidity level in your plant nursery. This is especially beneficial during the dry winter months or in low-humidity environments.
(ii) Provide appropriate ventilation: Good air circulation is essential for sustaining healthy plants and can also aid with humidity regulation. Make sure your plant nursery has proper ventilation to avoid stagnant air and unnecessary moisture buildup.
(iii) Use a dehumidifier: If the humidity in your plant nursery is too high, we can remove extra moisture from the air with a dehumidifier. This is especially vital in humid climates or during the summer months.
(iv) Water plants correctly: Overwatering plants can cause surplus moisture in the air and promote fungal growth. Check that you are not overwatering your plants.
(v) Use mulch: Mulch your plants to keep moisture in the soil and prevent excess moisture from evaporating into the air.
(vi) Use a humidity tray: A humidity tray is a water-filled tray that is placed beneath your plants. Water evaporates, forming a humid environment surrounding your plants.
(vii) Group plants together: Plants naturally release moisture into the air through a process known as transpiration. Planting them in groups can help to raise the humidity level in the surrounding region. We may change the humidity levels in our plant nursery using these ways to provide the best growing conditions for our plants. In this proposed system, a humidifier and a dehumidifier have been used.
Applications of Arduino for controlling Soil Moisture: To increase or decrease soil moisture in a plant nursery, you can do the following:
(a) Check the soil moisture level: To determine the present moisture level, use a soil moisture meter or insert a finger into the soil.
(b) Water the plants: If the soil is too dry, thoroughly water the plants to increase moisture levels. Water should be applied gradually so that the soil can absorb it equally. Overwatering can result in soggy soil and root rot.
(c) Use mulch: A layer of organic mulch, such as wood chips or straw, applied to the soil around the plants can help to retain moisture and reduce evaporation.
(d) Increase humidity: Raising the humidity level in the nursery can aid in the preservation of soil moisture. To increase the humidity level, use a humidifier or mist the plants on a regular basis.
(e) Provide adequate drainage: Proper drainage is essential for preventing soggy soil and root rot. Make sure the pots or planters have drainage holes and use well-draining soil.
(f) Reduce watering frequency: If the soil is overly wet, reduce the frequency of watering. Allow the soil to dry somewhat before watering again.
(g) Increase air circulation: Improved air circulation can help to reduce fungal growth and promote healthy plant growth. Make sure the nursery has appropriate airflow. It is critical to frequently evaluate soil moisture levels and alter watering and other practices as needed to maintain the ideal soil moisture level for the plants being cultivated. In the proposed system, a pump motor controlled by the Arduino is provided to water the plants. Continuous monitoring is done by connecting a soil moisture sensor to the Arduino, which collects soil moisture information at every instant.
We examined five separate plant species that thrive in a variety of atmospheric circumstances in this article: Lemon Grass, Basil, Aloe Vera, Rosemary, and Ashwagandha. An Arduino Microcontroller, along with sensors and actuators, is used to maintain the proper conditions for their growth. The aforementioned plants' corresponding results were obtained using an Arduino microcontroller, as illustrated in Fig. 2.7(a-d).

5. Results and Discussion
Results for the 5 plants using the Arduino controller are shown in Fig. 2.7(a-d). The status of the parameters can be observed on the serial monitor. These results are screen captures of the serial monitor of the Arduino controller of the proposed system.
Figure 2.7(a) shows the case of Aloe vera temperature monitoring: when its temperature value rose over the limit (i.e., 35°C), the cooling system was activated and the cooling fan turned ON. Figure 2.7(b) shows that, after activating the cooling system, the temperature became normal; it took 15 to 20 seconds for the values to normalize, as can be observed on the time stamp. Figure 2.7(c) shows the monitoring of all three parameters (temperature, humidity, and soil moisture values) for all the above-cited plants. From Fig. 2.7(d), we can clearly observe the activation of the cooling system on the rise of temperature for one plant, Aloe vera.

6. Conclusions
This technique will limit human intervention with the nursery and offer the farmer more time to focus on successful plant growing procedures. Depending on the size of the nursery, the scope of this system can be expanded. This technology will significantly reduce the nursery's upkeep costs.
18 Algorithms in Advanced Artificial Intelligence

(a) Reading the temperature of the plant

(b) Controlling the temperature by cooling system

(c) Reading temperature, humidity, soil moisture values

(d) Cooing system enabled for temperature raise

Fig. 2.7 The suggested system’s screen captures of the serial monitor of the Arduino Microcontroller
Self-caring Autonomous Medicinal and Aromatic Plants (MAP) Nursery Using Arduino Microcontroller 19

be improved by incorporating Internet of Things components 10. Anonymous, (1985). Wealth of India. Raw material, CSIR,
into the system to remotely monitor nursery conditions. We New Delhi. 11.
can use the Arduino to monitor all of these parameters within 11. R. Nandhini, S. Poovizhi, Priyanka jose, R. Ranjitha, S.
the plant to verify that they are within the required range for anila, (2017). Arduino based smart irrigation system using
IoT, 3rd National Conference on Intelligent Information and
MAP growth. Many MAP species are in high demand for
Computing Technologies.
both personal and commercial use in the herbal industry.
12. T. Saha, M. K. H. Jewel, M. N. Mostakim, N. H. Bhuiyan,
M. S. Ali and M. K. Rahman, (2017). Construction and
References Development of an Automated Greenhouse System Using
Arduino Uno, I.J. Information Engineering and Electronic
1. Deshpande D. J., (2005). Commercial cultivation of medicinal Business. 3, 1-8.
and aromatic plants. Himalayan publishing house, Mumbai. 13. Aishwarya Kagalkar, (2017). Smart Irrigation System,
2. Devi PR, Aparna D, Babu MR, Sekhar MR, Sunitha P et al., International Journal of Engineering Research & Technology.
(2017). Status, scope and promotion of medicinal and aromatic 6, 05.
plants in Andhra Pradesh. Journal of Pharmacognosy and 14. Vimal P V, K S Shivaprakasha, (2017). IOT Based Greenhouse
Phytochemical. 6(6):283-289. Environment Monitoring and Controlling System using
3. Farooqi AA, Sreeranu BS., (2001). Cultivation of Medicinal Arduino Platform, International Conference on Intelligent
and aromatic crops. University press (India limited), Computing, Instrumentation and Control Technologies
Hyderabad. (ICICICT).
4. Khare CP., (2007). Indian Medicinal Plants an Illustrated 15. Reddy Navya, Ramisetty Upendra,”Predict Early Pneumonitis
Dictionary. Springer Publications, America. in Health Care Using Hybrid Model Algorithms”,Journal of
5. Kirtikar KR, Basu BD., Indian Medicinal Plants. Lalit Mohan Artificial Intelligence, Machine Learning and Neural Network
Basu, Allahabad, 1918;2 (I, VIII). (JAIMLNN), Volume 3, 2023.
6. Kurain A, Sankar AM., (2007). Medicinal Plants. New India 16. Gabriel Villarrubia, Juan F. De Paz, Daniel H. De La Iglesia
Publishing Agency, New Delhi-110088. and Javier Bajo, (2017). Combining Multi-Agent Systems and
7. Sharma R., (2013). Agro techniques of medicinal plants. Wireless Sensor Networks for Monitoring Crop Irrigation,
Daya Publishing House, New Delhi. Sensors. 17, 1775.
8. Anonymous, (2008). Trees of Gujarat. Gujarat Forest
Department, Gandhinagar. Note: All the figures and tables in this chapter were designed by
9. Trivedi PC., (2010). Medicinal Plants: Conservation and the author.
utilization. Edn 2, Aavishkar Publishers, Distributors, Jaipur.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4
20 Algorithms in Advanced Artificial Intelligence

Segment Anything: GPT-3 and


Logistic Regression Approach for
Skin Cancer Detection
3

V. N. V. Sri Harsha1
Assistant Professor of CSE – AI&ML, CMR Technical Campus, Hyderabad, India
S. Rao Chintalapudi2
Professor of CSE - AI&ML, CMR Technical Campus, Hyderabad, India
V. S. Manoj Kumar Chenna3
Assistant Professor of CSE, CMR Technical Campus, Hyderabad, India

Abstract: The ability to identify and categorize computer vision analysis high dimensional data is crucial for machine learning.
Many Skin Cancer Diagnosis Disease Detection Systems based on artificial intelligence (AI) in Computer Vision Model use
the categorization of illnesses into benign and malignant groups, including the classification of skin cancer. Among the several
categorization methods, logistic regression stands out for its clarity, efficiency, and interpretability. This paper will cover the
theories underlying cancer diagnosis and illness recognition, the Hybrid model of Skin Cancer Detection operation of Segment
Anything: The GPT-3 of Computer Vision and logistic regression, as well as their potential Deep Learning application to the
classification complex associated with skin cancer.
Keywords: Computer vision, Logistic regression, Skin cancer, Segment anything, The GPT-3

1. Introduction customers. It aids production facilities in preventing defective


products from reaching consumers by identifying them
Computer vision is a branch of AI that allows computers early on in the manufacturing process. It makes it easier for
to interpret and analyze visual input. A method mimics the insurance adjusters to assess vehicle damage and minimizes
way humans take in and make sense of their environment. fraud at every stage of the claims process. X-rays, MRIs, and
It uses ML (machine learning) models to identify and label ultrasounds are all tools used by doctors to diagnose illness.
objects in digital imagery. Because of this, computers can One of the costliest medical problems in the world is now
now act on the information they find. Segmenting an image, thought to be related to cancer. Unrepaired DNA damage
identifying an object or a face, recognizing a pattern or an produces mutations, and skin cancer occurs when abnormal
edge, classifying an image based on one or more features, cells grow uncontrollably within the epidermis, the outermost
and matching features are all examples of computer vision layer of skin. Due to these alterations, skin cells multiply
techniques [1]. Using computer vision paves the way for a rapidly and tumors form. Merkel cell carcinoma (MCC),
plethora of new possibilities in the realm of technology. It has basal cell carcinoma (BCC), squamous cell carcinoma
enabled self-driving cars to go safely on highways and roads, (SCC), and melanoma are the four most common forms of
facial recognition software to identify persons in photos, and skin cancer. Due to individual differences in skin tone, size,
augmented reality software to superimpose virtual objects kind, and location on the body [3, 4, and 5], the appearance of
onto actual photographs [2]. Computer vision applications skin cancers can vary greatly from person to person. Sunlight
are used in many fields to boost safety, cut costs, and satisfy
1
sriharshavemparala@gmail.com, 2srao.chintalapudi@gmail.com, 3manojkumarchenna1996@gmail.com

DOI: 10.1201/9781003529231-3
Segment Anything: GPT-3 and Logistic Regression Approach for Skin Cancer Detection 21

and other forms of ultraviolet (UV) radiation are the leading drop precipitously after metastasis [9]. Artificial intelligence
causes of skin cancer. Sunlight and artificial UV rays in has the potential to improve skin cancer detection, lowering
tanning booths are the leading causes of skin cancer. New death rates from the condition [10]. Workload can be reduced
melanoma instances are expected to fall by 5.6% in 2023, and skin lesion diagnosis can be improved with the use of
while the mortality rate is expected to fall by 4.4% [6]. AI-based solutions [11, 12]. One form of deep learning AI
Dermatologists have a far better chance of entirely eradicating that can help with melanoma detection and sickness outcome
skin cancer if it is detected early and treated with minimal prediction is convolutional neural networks (CNN) [4, 13]. To
scarring. Before a tumor becomes malignant or grows deeper detect skin cancer, Hasan et al. [13] used a CNN model. After
into the skin, a doctor will often detect it in its precancerous the Dermoscopic pictures have been segmented, the features
stage. Artificial intelligence (AI) has several potential of the damaged skin cells are extracted using a feature-
applications in dermatology. To foretell the characteristics of extracting technique. To classify the collected features, they
future samples and carry out tasks, machine learning (ML), employed a CNN classifier and got an accuracy of 89.5%.
a subfield of AI, employs statistical techniques and models Ningrumet et al. [14] have developed a skin cancer diagnostic
[7]. Despite its importance in the detection of skin cancer, model that has an accuracy of 92.34% using CNN and ANN
dermatology lags behind radiology in the embrace of artificial techniques.
intelligence. As its use, expansion, and development continue
and as new technologies emerge, AI is becoming increasingly 3. Proposed Work
accessible to the general public. The early identification of
skin cancer is aided by AI. For the detection of skin cancer, The suggested initiative includes the detection and diagnosis
for instance, deep convolutional neural networks might be of skin cancer. The segmented hybrid model for skin cancer
employed to assess skin-imaging data [8]. diagnosis: Skin cancer classification problems are resolved
by the GPT-3 of computer vision and logistic regression, as
The essay is structured as follows: The first section serves well as its forthcoming deep learning applications.
as an introduction. In Part 2 of this essay, you’ll find some
relevant material. In Section 3, we discuss the proposed
methods for detecting skin cancer with the Segment Anything 4. Skin Cancer Detection using
Model and for comprehending cancer classification with the Segment Anything Model
Logistic Regression. Section 4 is when everything wraps up
Regular sun exposure is a major risk factor for the development
in the experiment. The end of Section 5 is stated briefly in
of skin cancer, also known as the abnormal growth of skin
Section 6.
cells. This common kind of cancer can, however, appear
even in sun-protected areas of the skin. The three most
2. Related Work common forms of skin cancer are melanomas, squamous
cell carcinomas, and basal cell carcinomas. Lessening one’s
Because of its visual complexity, dermatology is at the
exposure to ultraviolet light can help prevent skin cancer.
forefront of the current AI revolution. The field of study that
Changes in the skin can be used as an early warning system.
seeks to understand the workings of the human brain and how
When skin cancer is detected at an early stage, it is easier to
they might be replicated in machines is known as artificial
treat, and the patient has a greater chance of survival [4].
intelligence, or AI for short. As AI becomes increasingly
prevalent in the scientific community, this section educates The Segment Anything Model (SAM) is new NLP model
readers on the benefits and latest research findings of AI in that generalizes to zero and a few short learning tasks, a
the detection and treatment of skin cancer. Recent studies trend in computer vision and NLP that enhances their field
and significant ML results in a wide variety of tumor advancements. Segmentation models train with images and
areas relevant to dermatology have also been evaluated. masks, relying on data distribution at training time. SAM offers
The incidence of skin cancer, which includes malignant “prompts” for user input, allowing for interactive predictions,
melanoma and non-melanoma skin cancer (NMSC), is on point locations, boundary boxes, masks, and text prompts.
the rise [8]. While doctors with advanced training may spot The model captures point and box prompts, converts them
cancer, the difficulty of gaining access to them has increased to positional embeddings, and learns additional embeddings
the need for automated methods. This has the dual benefit of to distinguish between boxes and points. Mask prompts are
reducing healthcare costs and saving lives. Since melanoma converted to embeddings using a convolutional network.
covers such a broad spectrum of features, it can be difficult to The text prompt is trained using image encodings from the
differentiate benign skin lesions from skin cancers. Melanoma CLIP model, eliminating data labeling. The researchers
has the worst survival rate of all skin cancers. If caught early generated the largest segmentation dataset by adapting mask
enough, surgery is a certain cure. However, survival rates predictions based on prompts, collecting, and retraining the
22 Algorithms in Advanced Artificial Intelligence

Fig. 3.1 Disease segment anything: The GPT-3 of computer vision

model on newly annotated images. The following steps are each input cancer data point into one of two groups. A
shown in Fig. 3.1. supervised learning technique created specifically for these
situations is logistic regression. Logistic regression is built
5. Understanding Cancer on the underlying sigmoid function (or logistic function).
This S-shaped curve may convert any number between 0 and
Classification Using Logistic 1 into a value between those extremes. In mathematics, the
Regression logistic function is represented as follows:
Consider developing a system that can identify benign and 1
S( x ) = (1)
malignant growths based on characteristics such as color, 1 + e­ z
weight, and structure. The binarization approach divides
Segment Anything: GPT-3 and Logistic Regression Approach for Skin Cancer Detection 23

The input cancer traits and their corresponding weights are 7. Conclusion
combined linearly to form x, where e serves as the base for
the natural logarithm. The result of the logistic function may Skin cancer has been one of the most prevalent forms of cancer
be viewed as the likelihood that a certain data point belongs in the recent decade. Since the skin is the biggest organ in the
to a specific class. The model predicts class 0 if S (x) is close human body, it can be considered that skin cancer is the most
to 0, and class 1 if S (x) is close to 1. prevalent kind of cancer in people. As a result, this work
was started with the objective of detecting and classifying
6. Experimental Result whether a person is having skin cancer or not. Although it
looks simple, a machine learning model: logistic regression
Comparisons are made between the F1-score, accuracy, has a considerable impact on classification problems.
sensitivity, specificity, and F1-score of the GPT-3, YcbCr, Despite the complex nature of neural networks and ensemble
and RGB algorithms. Segment Anything GPT-3 accuracy techniques, logistic regression is still a reliable collaborator
measures are compared to the RGB and YcbCr accuracy in attempting to categorize cancer diagnostic data as you
metrics from the research, and the outcomes are as follows: delve deeper into the field of machine learning. Various deep-
Accuracy: Accuracy evaluates model performance across learning techniques also have been applied to computer-based
all classes, with equal weight for each. It is determined by skin cancer detection in prior studies. This paper applied an
dividing forecasts by the proportion of accurate estimations. NLP model named “Segment Anything”, which is used in
image segmentation tasks for detecting skin cancer. It works
Sensitivity: The sensitivity, also known as the genuine by indicating which areas of an image should be separated
positive rate or recall rate, measures an individual’s ability to and can be used for a range of segmentation tasks without
identify positive events. the requirement for additional training. It is concluded that,
Specificity: How effectively a model can forecast actual among different models, the Segment Anything: GPT-3
negatives in each of the categories that are accessible depends model performed well with a 97 percent accuracy rate. In
on its specificity. the future, the YOLO techniques, a deep learning algorithm
Precision: Precision is determined by dividing the total usedfor object detection can also be applied for skin cancer
number of positively recognized samples by the number of detection using image segmentation.
correctly identified positively identified samples.
F1-Score:The F1 score is a metric used to evaluate the References
performance of two classifiers by averaging their accuracy 1. Bishop, C. M. “Pattern Recognition and Machine Learning”,
and recall. Springer, 60, 1, 78–78, 2006.
2. Géron, A.: Hands-On Machine Learning with Scikit-Learn,
Table 3.1 Graph for Likening algorithms RGB, YcbCr and Keras, and Tensor Flow, O’Reilly Media, 2019.
Segment Anything: GPT-3 3. Kinnor Das, Clay J. Cockerell, Anant Patil, PawełPietkiewicz,
Algorithm Accuracy Sensitivity Specificity Precision Fl-score Mario Giulini, Stephan Grabbe, Mohamad Goldust.“Machine
Learning and Its Application in Skin Cancer”, International
RGB 0.75 0.79 0.79 0.80 0.80
Journal Environ Res Public Health, 18(24): 13409, 2021.
YCbCr 0.89 0.88 0.85 0.84 0.85 DOI: 10.3390/ijerph182413409.
GPT-3 0.97 0.96 0.97 0.97 0.98 4. Reza Ahmadi Mehr and Ali Ameri.„Skin Cancer Detection
Based on Deep Learning”, Journal Biomedical Physic
Engineering, 12(6): 559–568, 2022. DOI: 10.31661/jbpe.
v0i0.2207-1517.
5. Ferlay J., Colombet M., Soerjomataram I., Parkin D.M.,
Pineros M., Znaor A., Bray F. “Cancer statistics for the year
2020: An overview”. International journal of cancer. 2021.
doi: 10.1002/ijc.33588.
6. Skin Cancer Facts & Statistics [Internet]. The Skin Cancer
Foundation. [(accessed on 22 June 2023)]. Available online:
https://www.skincancer.org/skin-cancer-information/skin­
cancer-facts.
7. Hastie, T., Tibshirani, R., & Friedman, J.“The Elements of
Statistical Learning: Data Mining, Inference, and Prediction”,
Fig. 3.2 Relating algorithms RGB, YcbCr and segment Springer, 2009.
anything: GPT-3
24 Algorithms in Advanced Artificial Intelligence

8. Apalla Z., Nashan D., Weller R.B., Castellsaque X. Skin 13. Haenssle, H.A.; Fink, C.; Schneiderbauer, R.; Toberer, F.;
cancer: Epidemiology, disease burden, pathophysiology, Buhl, T.; Blum, A.; Kalloo, A.; Hassen, A.B.H.; Thomas,
diagnosis, and therapeutic approaches. Dermatology and L.; Enk, A.; et al. “Man against machine: Diagnostic
therapy, 7((Suppl. 1)): 5–19, 2017. doi: 10.1007/s13555-016­ performance of a deep learning convolutional neural network
0165-y. for Dermoscopic melanoma recognition in comparison to 58
9. Reddy Navya, Ramisetty Upendra,”Predict Early Pneumonitis dermatologists”,Annals of oncology, 29, 1836–1842, 2018.
in Health Care Using Hybrid Model Algorithms”,Journal of 14. Hasan, M., Barman, S. D., Islam, S., Reza, A. W. “Skin cancer
Artificial Intelligence, Machine Learning and Neural Network detection using convolutional neural network”, In Proceedings
(JAIMLNN), Volume 3, 2023. of the 2019 5th international conference on computing and
10. Davis L.E., Shalin S.C., Tackett A.J. Current state of melanoma artificial intelligence, 254–258, 2019.
diagnosis and treatment. Cancer Biol. Ther.; 20: 1366–1379, 15. Ningrum DNA, Yuan SP, Kung WM, Wu CC, Tzeng IS, Huang
2019. doi: 10.1080/15384047.2019.1640032. CY, Li JY, Wang YC. “Deep Learning Classifier with Patient’s
11. Jutzi T.B., Krieghoff-Henning E.I., Holland-Letz T., Utikal Metadata of Dermoscopic Images in Malignant Melanoma
J.S., Hauschild A., Schadendorf D., Sondermann W., Fröhling Detection”,Journal of Multidisciplinary Healthcare,14, 877­
S., Hekler A., Schmitt M., et al. Artificial Intelligence in Skin 885, 2021. doi: 10.2147/JMDH.S306284. PMID: 33907414;
Cancer Diagnostics: The Patients’ Perspective. Frontiers in PMCID: PMC8071207.
medicine, 2020; 7:233. doi: 10.3389/fmed.2020.00233. 16. Malvehy J., Pellacani G. “Dermoscopic, confocal
12. Sengupta S., Mittal N., Modi M. Improved skin lesions microscopy and other non-invasive tools for the diagnosis
detection using color space and artificial intelligence of non-melanoma skin cancers and other skin conditions”,
techniques. Journal of Dermatological Treatment, 31, 511– ActaDermato-Venereologica, 97((Suppl. 218)): 22–30, 2017,
518, 2020. doi: 10.1080/09546634.2019.1708239. DOI: 10.2340/00015555-2720.
Note: All the figures and table in this chapter were designed by the
author.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4

Enhancing Metric Learning Reliability for


Pose-Oriented Face Recognition by Visual
Assessment of Tendency
4

Pinisetty Rajasekhar*
Assistant Professor, Dept of Mathematics, J.N.T.University, Kakinada, A.P., India
V. Ravindranath
Professor, Dept of Mathematics, J.N.T.University, Kakinada, A. P., India

Abstract: A major challenge with face recognition is that it is dependent on pose-oriented considerations (Ziad Hafid, 2001)
[1]. In order to look at the problem with face recognition from many angles, as proposed by Abhishek Sharma in 2012 [2], This
study suggests fuzzy pose-based integral clusters for both clockwise and anticlockwise images. To improve facial recognition,
the authors propose utilizing a technique known as “cluster tendency” to identify groups of people based on their similarities.
Assessing the clustering tendency is a crucial first step in cluster analysis. The Visual Assessment of Tendency (VAT) method
is one useful tool for determining cluster tendency. Using DCT and Metric Learning, the tendency of clustering in relational
or object data may be visually examined by normalising and extracting features from the picture matrix generated by VAT for
person matching. Examining the likelihood of producing a particular set of angle picture data with the same allocation allows
analysts to ascertain the collection’s clustering tendency. When it comes to detecting criminal identities, it also determines how
much spatial unpredictability there is in the posture image data. Also, look at the investigation’s findings to see how trustworthy
face recognition technology is for police work.
Keywords: Discrete cosine transform, Fuzzy cluster analysis, Pose oriented face recognition, Visual assessment tendency,
Metric learning of neighborhood component analysis etc.

1. Introduction to Zhigang Yu, 2022 [14], computer vision is a technology that


processes images to allow computers to comprehend digital
AI-based face identification uses deep learning, computer photos or movies and automate tasks that are comparable to
vision algorithms, and image processing to locate, recognise, those performed by human visual systems. Algorithms for
and validate faces in digital images and videos (KH Teoh et facial recognition take markers from an image, including
al., 2021 [7]). Demand for the technology is growing quickly the nose, eyes, and mouth, to identify facial features jaw and
for a variety of applications, such as security systems, smart cheekbones. Additionally, they can reduce data and normalise
phones, and door unlocking. Applications in medicine also photos for face recognition, according to Ziad Hafid, 2001
make use of it. Algorithms that can discern emotions from a [1]. Geometric and photometric techniques distinguish
person’s facial expressions exist even today. Face detection recognition algorithms; holistic models identify the face as
and face recognition are two methods used in image or video a whole, whereas feature-based models examine the spatial
processing (Jason Brownlee, 2019 [6]. Face detection is a relationship between individual features. In CCTV imagery,
more user-friendly method that can be used for picture tagging face hallucination is utilised to improve low-resolution face
or altering photo perspectives, while face recognition uses images for distant human identification [17–18]. This method
complex processing methods to identify a person. Automated can overcome the drawbacks of super-resolution algorithms
systems and security checks require both processes. According and enhance the performance of high-resolution facial
*Corresponding author: rajasekharpinisetty@gmail.com

DOI: 10.1201/9781003529231-4
26 Algorithms in Advanced Artificial Intelligence

recognition systems. In order to pre-treat photos with faces lighting approaches described by R.N.V. Jagan Mohan et al.
hidden, face hallucination algorithms must first be trained (2012) [9]. The unique angle orientation algorithm proposed
on similar images, both with and without disguise. However, by R.N.V. Jagan Mohan et al. (2014) [10] should be applied
because of fleeting facial emotions, these algorithms may to all face recognition inputs to ensure they are in an upright
not be able to effectively map the whole state of the face and frontal posture before they are compared to the database
(Abhishek Sharma, 2012 [2]).Deep learning-based facial image. In order to compare an angularly oriented input with
recognition technology is the least intrusive and most quick a database image, it must first be rotated to the correct angle
biometric identification method (Xinyi Wang, 2022 [13]). in either a clockwise or anticlockwise direction, as stated by
It is helpful in security applications like criminal detection R.N.V. Jagan Mohan, 2016 [11]. In a similar vein, this will
since it compares acquired photographs to stored face prints put the face in an upright, frontal position. The spinning axis
using image processing and machine learning. According to makes it easy to identify the face in the shot. The face will
Lixiang Li (2020), face recognition is a technology that takes spin anticlockwise if the input image flips from horizontal
images from surveillance footage and classifies them. Face to vertical. The face will also rotate clockwise if the supplied
recognition technology finds applications in face tracking, image goes from vertical to horizontal. If the input picture is
forensic surveillance, criminal detection, and airport security. skewed in any way, we straighten it out using the rotational
Training and database comparisons of previously saved axis before comparing it.
images are required. Unsupervised learning uses unlabeled Rotate the supplied image clockwise or anticlockwise as it is
data to examine a data collection for patterns; clustering oriented at an angle. Use this in a pose-oriented recognition
or cluster analysis is required to find patterns among system. The fuzzy rule can be used to cluster images with an
unlabeled data pieces. Preprocessing images, extracting angle orientation, as shown below.
features, clustering images based on similarity, and selecting
q 90∞
the optimal number of clusters are all necessary. R.N.V.
Jagan Mohan, 2020 [12] states that feature extraction Ú x(q )dq = Ú cost dq (1)
0 0
from mathematical models or trained convolutional neural
networks improves clustering results for a range of photo- q 30∞ 60∞ 90∞

clustering problems. Abhishek Sharma’s study proposes Ú x(q )dq = Ú cost dq + Ú cosq dq + Ú cos q dq (2)
fuzzy pose-based integral clusters for photos taken in both 0 0 30∞ 60∞
0∞
clockwise and anticlockwise directions. Facial recognition is = (sinq )30 60∞ 90∞
0 + (sinq )30∞ + (sin q )60∞
(3)
a hurdle in pose-oriented topics. Cluster tendency is a method
used to determine likely clusters from individual person The outcome, if the ∫x(θ)dθ =∫cosθdθ = Sin θ process is
identifications. When evaluating clustering tendencies in valued, has a range of fuzzy variable values from 0 to 1,
relational or object data, the Visual Assessment of Tendency which is a range of values for the same function. Choose one
(VAT) methodology is employed. This aids in determining of the next three clusters, and then take it. We choose the first
the dependability of angle-oriented image data gathering for cluster.
law enforcement applications and assesses its likelihood to 30∞ 30∞
cluster.The format of the paper is as follows: Part 1 covers Ú x(q ) dq = Ú cos q dq (4)
the introduction. Part 2 of this work provides a brief overview 0 0
of fuzzy cluster analysis for pose-oriented facial photos. When analyzing the second cluster,
Section 3 covers the DCT transformation method for pose- 60∞ 60∞
oriented face images. The use of neighbourhood component Ú x(q ) dq = Ú cos q dq (5)
analysis for face recognition will be covered in Part 4, which 30∞ 30∞
is a machine learning component. Step-by-step procedure for The third cluster is what we take.
pose-oriented face detection in Section 5. Section 6 contains 90∞ 90∞
the outcomes of the experiment. The last one is the conclusion
of Section 7, which is mentioned in Section 8. Ú x(q ) dq = Ú cos q dq (6)
60∞ 60∞
Select the sequentially arranged clusters for identification
2. Fuzzy Pose Oriented Face Image after classifying the angle-oriented clusters.
Cluster Classification
The alignment of the faces in the images should be the 3. Discrete Cosine Transform
determining factor in sorting them. To fix images that don’t The discrete cosine transform (DCT) has been used as a feature
have an angle of 900 degrees, first rotate them to that angle extraction step in various face recognition investigations by
and then use normalisation methods like the geometric and Annadurai et, 2004[3]. Until now, discrete cosine transforms
Enhancing Metric Learning Reliability for Pose-Oriented Face Recognition by Visual Assessment of Tendency 27

have been applied either in a holistic appearance-based the precision of nearest-neighbour classification. The
sense or in a local appearance-based sense while largely approach employs a stochastic variation to directly optimise
discarding the spatial information. During the classification the leave-one-out k-nearest neighbours (KNN) score on the
phase, feed certain neural network types with local DCT training set. Also, it can learn a low-dimensional linear data
coefficients or simulate them statistically. Ahmed, Natarajan, transformation (Goldberger, 2005 [4]), which can be used
and Rao introduced the discrete cosine transform (DCT) by for fast data presentation and classification. They determine
R.N.V.Jagan Mohan, 2012[8]. The DCT have recommended the softmax probability of the Mahalanobis distance by
a number of modifications. decomposing it. M = LTL and define the probability pij that xi
N is the neighbor of yj:
p (2n ­ 1)(k ­ 1)
y(k, 1) = w(k )Â x(n) cos ,
n=1 2N exp(­ || Lxi ­ Lx j ||22 )
pij = , pii = 0 (9)
where k = 1, º, N (7) Â exp(­ || Lxi ­ Lxl ||22 )
l πi
Where
The likelihood that xi the stochastic closest neighbours rule
Ï 1 will correctly classify the data is then:
Ô , k =1
Ô N
w( k ) Ì (8) pi = Â pij (10)
Ô 2 j : j π i , y j = yi
ÔÓ , 2£k£N
N The optimization challenge is the search for matrix L that
The two matrices, x and y, have the same dimensions and maximizes the overall likelihood of being correctly classified.
length, N. The deterministic column transformation (DCT) L = argmax  pi (11)
transforms the x-matrix. Since vectors truly run from 1 to N i
and not 0 to N-1, it is not customary to index the series from
n = 0 and k = 0, but rather from 1 and N. We use a discrete 5. Face Normalization Images with an
cosine transform to extract the feature vectors from the input
sequence in line with the previously reported formulas by
Angle Orientation Cluster Tendency
Aman (2012) [2]. Method
If you have an object vector or numerical dissimilarity values
4. Neighborhood Components for every pair in your object set, you can visually assess the
Analysis (NCA) clustering tendency of the object set. Before importing angle-
based photos into any clustering technique’s face recognition
An approach to distance-metric learning, in contrast to the system, it is imperative to confirm that the angle-oriented data
conventional use of the geometrical mean, seeks to improve sets contain meaningful clusters (i.e., non-random structures).

Fig. 4.1 Pose oriented cluster images


28 Algorithms in Advanced Artificial Intelligence

If so, tell me how many clusters there are. This approach faces that are supposedly the same size, orientation, position,
aims to evaluate the viability of the clustering analysis or and illumination after it has been normalised. Acquiring the
clustering tendencies. The propensity for clustering Abhishek attributes used in this comparison involved a transformation
Sharma et al. (2012) [1] examined an angle-oriented face procedure. The Discrete Cosine Transform (DCT), a widely
photo dataset to determine whether any meaningful, non­ used transformation in this field, is used for feature extraction
random groupings are present in the dataset. The following in several face recognition investigations. We split the input
approaches are examined to see if angle-oriented face images photographs into N × N portions to determine the regions that
can cluster, as suggested by Hui-Fuong Ng et al., 2006 [4]: will be processed locally. The data is transformed into the
• Images using homogeneous n-angles as samples frequency domain using a two-dimensional N × N Discrete
(a1 … an) from Angle Oriented Dataset D. Cosine Transform (DCT). According to Annadurai et al.
(2004), researchers can use statistical operators to compute
• Find the distance using, xi, between each angle oriented
various spatial frequency functions within the blocks
image and its closest neighbor: For each pose ai ∈ D,
and derive a DCT coefficient at the block level [2]. Face
find its nearest neighbor aj; then compute the distance
Recognition: The final stage is face matching. This technique
between ai and aj and denote it as xi = dist(ai, aj).
can recognise a specific input image by matching its feature
Because all of the input images in a clockwise or anti- vector to the feature vectors stored in the database, according
clockwise cluster assessment tendency have the same face to Zhang and Jin (2007) [14]. Several neighbourhood
but various angles, pose-oriented cluster pictures demonstrate component analysis classifiers for metric learning and
that each image is effective. distance learning are used. It is frequently necessary to
compute the averages for each column of the matrix after
6. Process of Face Recognition establishing the distances for the N × N matrix. The input
image and the database image are the same when the overall
System average is negative or zero, as stated by Goldberger et al.
Facial recognition is a challenging pattern recognition issue (2005) [3].
in computer science that aims to identify human faces in three
dimensions from two-dimensional images. The procedure
consists of four steps: face detection, face segmentation,
face alignment, face feature extraction, and face matching
against a database. The first step is to isolate the face from its
background; the second is to correct the image’s alignment
based on aspects including size, stance, and photographic
characteristics. The procedures for facial recognition with
a focus on poses are as follows: Getting a Consistent Face:
The initial stage of any face recognition system is face
normalisation. The face area is located as a first stage in the
face recognition process. The face normalisation technique
measures the difference between the size of the input image (N
x N) and the database picture. In order to fit, the input image
Fig. 4.2 Face recognition process
must be reduced if its size differs from the database image’s.
If the orientation of the selected image needs to be changed
to match the database image, rotate the angled face from 00 7. Experimental Result
to 900 until it aligns with the database image. According to
R.N.V. Jagan Mohan et al. (2012) [8], the rotation might The DCT with Neighborhood Component Analysis
be either clockwise or anti-clockwise, depending on the approaches are evaluated against the MIT and FERET face
chosen posture of the image. A measurement, or collection databases as well as the student population of the University
of measurements, is used in feature extraction to characterise Database in a typical execution setting by making use of the
a feature. Every measurement reveals the measurable quality input photos’ orientations (clockwise). Table 4.1 displays
of an object. Its calculation makes use of several important the percentage level of recognition for the two experimental
object-specific properties. All aspects can be categorised procedures. Neighbour Component Analysis reveals startling
using either high-level or low-level qualities. Extracting developments in DCT dependability. In Fig. 4.3, we can see
low-level traits from the source photographs derives them the results of the two experimental techniques’ recordings at
from high-level qualities. The face can be compared to other various recognition levels. The graph makes it quite evident
Enhancing Metric Learning Reliability for Pose-Oriented Face Recognition by Visual Assessment of Tendency 29

Table 4.1 Performance in DCT of pose oriented images i.e., 6. Hui-Fuang Ng: Pose-Invariant Face Recognition Security
clockwise and anti-clockwise System, Asian Journal of Health and Information Sciences,
2006.
S.No Scale of Pose Performance Performance in
7. Jason Brownlee: How to Perform Face Recognition with
Oriented Images in DCT for DCT for Anti-
VGGFace2 in Keras, Machine Learning Mastery, 2019.
Clockwise Clockwise
8. KH Teoh, RC Ismail, SZM Naziri, R Hussin, MNM Isa and
1 00-900 96.67 93.34 MSSM Basir: Face Recognition and Identification using
Deep Learning Approach, Journal of Physics: Conference
Series 1755(2021) 012006 IOP Publishing doi:10.1088/1742­
100 Pose Oriented Images 6596/1755/1/012006, 2021.
9. Lixiang Li, Xiaohui Mu, Siying Li, Haipeng Peng: A Review
95 of Face Recognition Technology, IEEE Access (Volume-8),
Page(s): 139110 –139120, Electronic ISSN: 2169-3536,
90 INSPEC Accession Number: 19974172, DOI: 10.1109/
Performance in DCT for Performance in DCT for ACCESS.2020. 3011028, 2020.
Clockwise Anti-Clockwise 10. R.N.V. Jagan Mohan, R. Subbarao, K.Raja Sekhara Rao:
Similarity of Inference Face Matching on Angle Oriented Face
Fig. 4.3 Performance in DCT of pose oriented images i.e.,
Recognition, Published in Journal of Computer Engineering
clockwise and anti-clockwise
and Intelligent Systems from www.iiste.org, ISSN 2222-1719
(Paper) ISSN 2222-2863 (Online), Vol 3, No.2, 2012.
that the reliability performance of DCT for neighbouring 11. R.N.V. Jagan Mohan and K. Raja Sekhara Rao: Target
component analysis steadily improves with increasing data Inference on Evaluation of Angle Oriented Cluster, Computer
amounts. Science and Information Technology 2(3): 121-125, DOI:
10.13189/csit.2014.020301, Copyright © 2014 Horizon
Research publishing all rights reserved. http://www.hrpub.org,
8. Conclusion and Future Perspective 2014.
Fuzzy technique applications to nested clusters based on 12. R.N.V. Jagan Mohan: Angle Oriented Based Image Analysis
angles, specifically 0–300, 310–600, and 610–900. With Using L-Axial Semi-Circular Model, Published in Asian
Journal of Mathematics and Computer Research, ISSN No:
the aid of the visual assistance tendency model, an angle-
2395-4205(Print), 2395-4213(Online), Vol-10, ISSUE-4,
oriented approach is presented; the testing results show its
Page No-320-331, 2016.
effective implementation, and it can rotate in both clockwise 13. R.N.V.Jagan Mohan: Cluster Optimization Using Fuzzy
and anticlockwise directions. Based on average classifier Rough Images, International Journal of Multimedia and
values, Neighbourhood Component Analysis of Metric Image Processing,ISSN:2042-4647,DOI:10.20533 /
Learning suggests using this approach. Studies have shown ijmip.2042.4647.2020.0062, Impact Factor (IF):5.48,
that the suggested pose-oriented discrete cosine transform Calculated by Infonomics Society’s Indexing Citation Board
(DCT) increases the accuracy of criminal face detection. (ICB), Pages: 505 -510, Volume-10, Issue-1, March-2020.
14. UP Police launch ‘Trinetra’, its AI-powered face recognition
app to catch criminals, The Financial Express. December 27,
References 2018, Retrieved February 14, 2022.
15. Xinyi Wang, Jianteng Peng, Sufang Zhang, Bihui Chen,Yi Wang,
1. Abhishek Sharma and Murad Al Haj and Jonghyun Choi and
Yandong Guo:A Survey of Face Recognition, Computer Vision
Larry S. Davis and David W. Jacobs: Robust pose invariant
and Pattern Recognition,arXiv:2212.13038,2022,https://doi.
face recognition using coupled latent space discriminate
org /10.48550/arXiv.2212.13038, 2022.
analysis, Computer Vision and Image Understanding 116,
16. Reddy Navya, Ramisetty Upendra,”Predict Early Pneumonitis
1095–1110, 2012.
in Health Care Using Hybrid Model Algorithms”,Journal of
2. Aman R. Chadha, Pallavi P. Vaidya, M. Mani Roja: Face
Artificial Intelligence, Machine Learning and Neural Network
recognition using discrete cosine transform for global and
(JAIMLNN), Volume 3, 2023.
local features, IEEE Xplore, DOI: 10.1109/ICONRAEeCE.
17. Zhigang Yu, Yunyun Dong, Jihong Cheng, Miaomiao Sun,
2011.6129742, 16 January 2012.
Feng Su: Research on Face Recognition Classification
3. Annadurai, S., and Saradha, A.: Discrete Cosine Transform
based on Improved Google Net, Hindawi Publications,
Based Face Recognition Using Linear Discriminate Analysis,
Volume 2022, Article ID 7192306, https://doi.
Proceedings of International Conference on Intelligent
org/10.1155/2022/7192306,2022.
Knowledge Systems, 2004.
18. Zhang, Jin: Visualization for Information Retrieval, Springer,
4. Crime and Criminal Tracking Network & Systems (CCTNS):
ISBN 978-3-540-75148-9, 2007.
National Crime Records Bureau. Archived from the original
19.Ziad M. Hafed: Face Recognition Using DCT, International
on February 18, 2022, Retrieved February 18, 2022.
Journal of Computer Vision, pp. 167-188, 2001.
5. Goldberger et al.: Neighborhood Components Analysis, NIPS,
2005. Note: All the figures and table in this chapter were designed by the
author.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4
30 Algorithms in Advanced Artificial Intelligence

Verifiable Secure Vehicle Connectivity


Using Machine Learning Framework for
Internet of Vehicles
5

Lanka Divya*, Lanka Divya, Priyadarshini Voosala, R. Shiva Shankar, Ch. Ravi Swaroop
Assistant Professor, Dept. of Computer Science and Engineering,
Sagi Ramakrishnam Raju Engineering College

Abstract: Vehicles and other objects are linked via the Internet of Vehicles (IoV), a system of interconnected road networks
that uses data processing, automated control, communication, and sensors. According to the research, segment-monitoring
units (SMU) might be used to manage traffic and vehicle characteristics. Reduced congestion, lowered speed restrictions,
and accident prevention are all possible because to the SMUs’ ability to regulate traffic and vehicle attributes. Better traffic
management in the IoV may be achieved using real-time vehicular traffic data. Simple and easy route segmentation using the
unsupervised Learning K-Means Approach would be the main focus of the proposed study.
Keywords: Internet of vehicle, Segment monitoring units, Transportation based network, Unsupervised learning K-Means
approach etc.

1. Introduction VANETs allow moving cars to communicate with roadside


infrastructure via Wireless Access in Vehicular Environments
The Internet of Vehicles (IoV) is a system of interconnected (WAVE) technology [7]. IoV improves features, decreases
vehicles that can connect to one another and share data in traffic and accident problems, and increases wireless network
real time over the Internet. This system can communicate technology connection.
with infrastructure, pedestrians, and owners of the vehicles Key technologies such as cloud computing, wireless
themselves [1]. The Internet of Vehicles (IoV) is a significant communications in distributed systems, and big data analysis
technical advancement in the smart car sector. It enables are highlighted by the Internet of Things (IoT). Dedicatedly
vehicles to interact with one other, public infrastructure, and working toward smart mobility, smart workforce development,
surroundings [2-4]. Nevertheless, there are still issues with and smart manufacturing [8]. When referring to using Internet
data collection, distribution, and efficient interaction with of Things (IoT) technology to link intelligent vehicles, IoV.
V2X-equipped cars. One branch of AI, machine learning, can To manage a fleet of cars, this article describes a distributed
help with this problem [5]. measurement platform that focuses on performance across
Decisions made by mobile edge computing in the Internet time [9].
of Vehicles (IoV) leverage Deep Reinforcement Learning Efficient storage, processing, analysis, and decision-making
(DRL) and machine learning approaches. In particular, it are made possible by Artificial Intelligence (AI) technology,
draws attention to the need for AI to solve caching and edge which is vital in IoV-layered architecture. Big data analysis
computation problems in IoV networks. Optimizing QoE was and vehicle cloud computing provide computation, analysis,
made possible by the IoV network design, which included a and real-time service management. To address problems
buffer and energy-aware machine learning [6]. with caching and edge computing, AI in IoV networks use

*l.divya44@gmail.com

DOI: 10.1201/9781003529231-5
Verifiable Secure Vehicle Connectivity Using Machine Learning Framework for Internet of Vehicles 31

deep neural networks and Q-learning. The AI layer consists These SMUs [2] will be able to successfully control traffic
of intelligent systems, cloud computing, and large-scale in the near imminent by minimizing congestion on routes,
data analysis as it pertains to the architecture of the IoT. lowering the speed limits by traffic, and preventing accidents.
Onboard Diagnostic Units (OBU), Road Side Units (RSU) To improve traffic management in the IoV, it is assumed that
and edge servers are used in IoV multimedia communication the suggested systems would learn from real-time traffic data
to facilitate data sharing, inter-vehicle connections, and describing how cars manage time delays and speed limits in
quality-of-service monitoring via sensor nodes. The IoV uses crowded areas. The basic, uncomplicated route segmentation
machine learning technology to provide secure vehicle-to­ that designates each segment with an SMU could potentially
vehicle communication [10]. function as the primary foundation for the suggested study.
The five portions of this paper’s response can be understood Internet of Vehicle Architecture: The route is divided into
as follows: section 1’s broad introduction. Proposed work segments linked to a SMU. Vehicles approach SMUs via
such as Internet of Vehicle Architecture and Network Wi-Fi, sending their characteristics and data to the cloud for
Establishment, Route Segmentation Using Unsupervised additional controls.
Learning K-Means Approach, and working with SMU i = Forward Direction (East to West)
covered in Section 2. Section 3 deals with experimental
results. Conclusion included in section 4. References are j = Backward Direction (West to East)
included in Section 5. Sj1 = segment 2 in backward direction
Sj1 = segment1 in backward direction
2. Proposed Work Network Establishment: With each vehicle serving as a
Route segmenting is a colloquial phrase for managing random variable with either continuous or discrete features,
traffic on pathways with caution. In the suggested study, IoV Bayesian Belief Networks (BBN) are constructed with
segment-monitoring units (SMU) are implemented. SMU’s cars as nodes and linked to SMUs and cloud edges. Using
responsibility is to control traffic and a range of vehicle a conditional probability to indicate the influence of each
characteristics. vehicle, an edge in an IoV Bayesian network depicts the link
between vehicle characteristics and the SMU.
Vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I)
are the most common IoV communications. The absolute Let IoV includes N1, N2, N3, …. Nn, then the vehicle
minimum condition for attaining inventive ability is high- relationship is determined as P(N1, N2, N3, …. Nn) =
level route segmentation. In any early approach, exhaustive Pb(N1|N1, N2, N3, …, Nn) Pb(N1, N2, N3, …. Nn) [3]
segmentation is relatively infrequent. The present study that = P(N1|N2, N3, … Nn)P(N2|N3, …, Nn) … P(Nn–1|Nn) P(Nn)
demands primary vehicle routing to specified destinations SMU in IoV Bayesian belief network can be depicted as
mentions segmentation in the crowd sources.
given by Divya et al.

Fig. 5.1 The route is divided into segments connected to SMUs, where vehicle characteristics are transmitted via Wi-Fi, and data
is sent to the cloud for further computation
32 Algorithms in Advanced Artificial Intelligence

the vehicles into the appropriate category and then using


directional, speed, velocity, and driving pattern factors
to discover underlying patterns, such as traffic patterns
and speeds. Grouping vehicles in the appropriate segment
according on vehicle-to-vehicle and point-to-point distances
between the cars and SMU is called road segmentation in
the context of the Internet of Vehicles. In IoV, each vehicle
correlates with the nearest segment using unsupervised
K-Means, which uses K segments as centroid. Generally
speaking, the segments are maintained relatively near each
other for efficient traffic control and connection.
SMU is an association of both hardware and software
Fig. 5.2 Illustrates the creation of an SMU using BBN. products in IoV networks. A significant area of innovation in
the Internet of Vehicles (IoV) is the best traffic management
Minkowski distance is referred from the Minkowski distance
function. Existing RSUs may enhance utility and privacy
measure, used to find the distance in a grid path between
by including capabilities such as answering confidential
couples of data points. Manhattan distance is applicable for
requests.
applications involving high dimensions like IoV.
n Working of K-Means: The procedure allocates N vehicles
d = Â Xi ­ Yi (1) to K cluster segments randomly; minimizing the within-
i=1 segment sum of squares, and assigns the closest segment to
each vehicle joining the network. The K-Means data learning
The distance between vehicles and vehicle to SMU is
process starts with random segment selection, focusing on
calculated with (1). Speed is how fastly the vehicle moves
centroid segments. Iterative stabilization is then used to find
from one segment to other segment. It can treat as distance
local optimums, avoiding movement from clusters. Repeated
travelled by vehicle divided by time.
iterations with different starting configurations are used
Distance (2) to select the optimal segment solutions, considering the
Speed of vehicle =
Time heterogeneous and large number of vehicles in IoV.
Speed represented in meters per second, Distance is meters The K-Means procedure is a popular method for dividing
covered, and Time is measured in seconds. large data sets, such as Vehicular networks, into segments,
Let us do in advance that a vehicle is moving at a speed of 30 assigning vehicles to clusters, and repeating this process.
m/s for 5 minutes, the distance covered by vehicle is From Since N observations may be partitioned into K segments,
(2) distance = Speed * Time, Hence, distance = 30 * 5 * 60 it follows that your observations can likewise be partitioned
= 900 meters. into N segments. Every segment consists of S inquiries, with
A vehicle initiates a request with the SMU; the following the kth segment including nk requests. A missing value is
steps are as follows denoted by δijk for the ith query in the jth row of the kth group.
It is possible to determine whether the data is consistent by
• The request is transmitted to the SMU.
dividing the standard deviation by the mean. The constant
• The SMU validates the vehicle or the vehicle is kicked segments are called zij. The end segmentation solution is
out. affected by this method of segment initialization. Each
• The request from vehicle and vehicle parameters like vehicle is randomly assigned to a section during each request
speed, VehicleID are added to the current block of procedure. By optimizing this arrangement using K-Means
requests at the corresponding SMU. techniques, the likelihood of obtaining the global optimal
• The block of requests are then Lock up to the older solution for a given number of clusters is significantly
blocks of requests. increased.
• The request is confirmed at each SMU. Goodness-of-fit criterion: When comparing different cluster
configurations, the within-cluster sum of squares, or WSSk,
3. Unsupervised Learning K-Means is the basis for the goodness-of-fit criterion.
Approach for Route Segmentation k B k n
Ê NB ˆ
The K-means algorithm is a powerful unsupervised learning
WSSk = Á ˜ ÂÂÂ
Ë NB ­ m ¯ k =1 i =1 j=1
(1­ d ijk )(zij ­ cik )2 (3)
tool. Focusing on K-means for the IoV, this entails classifying
Verifiable Secure Vehicle Connectivity Using Machine Learning Framework for Internet of Vehicles 33

Where is the mean centre value of the Query ‘i’ is blocked in The Process of Expectation and Maximization Procedure
the cluster k. steps can be defined:
Input: Vehicle parameters
4. Working of SMU xj, j = 1, 2, …, n and i ε{1, 2, …, k} label set.
A Gaussian mixture model with expectation and maximization Make ready:

( )
map reduction is used by each SMUto regulate traffic in the
Internet of Vehicles effectively. q (0 ) = p1(0 ) , º, pk(0 ) , m1(0 ) , º, m k(0 ) , s 1(0 ) , º, s k(0 ) (6)

Gaussian Mixture Model: Data classification is a challenge E-Step:


at each SMU. Information on the vehicle’s speed, route
direction, velocity, and travel time spent in the relevant
pij(r +1) =p (r +1)
(i| x j ) =
(
pi(r ) N x j | mi(r ) , s i2(r ) ) (7)
segment is displayed numerically at SMU. Assume that f (x j )
these are the potential values of X, a random variable. The
M-Step:
likelihood model may be found by taking into account a
n
blend of the following Gaussian distributions: (r +1) 1 (r )
p̂ij = Â pij (8)
c n j=1
f ( x) = Âpi N (x|m i ,s i2 ) (4)
 j=1pij( ) x j
n r
i=1
(r +1)
M is total RSUsegments or regions and ry > 0 defines weight m̂i = (9)
(r +1)
np̂i
m
Âry = 1,
( x ­ m̂( ) )
(5) 2
 j=1pij(
n r +1) r +1
i=1
2 (r +1) j i
Ê ­ ( x ­ m )2 ˆ p̂i = (10)
1 (r +1)
N( mi , s i2 ) = exp Á
i np̂i
˜
s 2p ÁË 2s i2 ˜¯
Repeat Steps 2 and 3 until, ∑iei2 < ε.
where μi, σi2, these two are mean and standard deviation of Find pij = ArgMaxipij(final) j = 1, 2… n.
are class i. In our support vector machine (SMU) model, the
lattice data represent the values of time, direction, velocity, Create traffic models that take into account the characteristics
and speed for a hypothetical vehicle. Having said that, the of the vehicles. An approach for labeling route segmentation
constraints However, the parameters are θ = (p1, …, pk, data that exhibits distinct labels for each fragment or object is
μ1, …, μk, σ12, …, σk2) and we can deduce the number of the IoV-based Expectation Maximization Procedure.
segments in MoG by histogram of lattice data.
Expectation Maximization Map ReduceProcedure: 5. Experimental Result
Expectation maximization is used in pathways for the The experiment term IoT, which is less popular than
route segment Gaussian mixture model. The Expectation IoV, describes how intelligent cars are connected via
Maximization technique is the new name for this strategy. IoT technology. A multiple regression analysis was done

Table 5.1 Real time traffic data


Time Node AppId Seq. Type Delay (ms) ReTX Count Hop Count
1.47008 5 256 226 Last Delay 0.066264 1 4
1.47008 5 256 226 Full Delay 344.076 2 4
1.47508 5 256 227 Last Delay 0.066264 1 4
1.47508 5 256 227 Full Delay 344.076 2 4
1.48008 5 256 228 Last Delay 0.066264 1 4
1.48008 5 256 228 Full Delay 344.076 2 4
1.48508 5 256 229 Last Delay 0.075 264 1 4
1.48508 5 256 229 Full Delay 344.076 2 4
1.49008 5 256 268 Last Delay 0.076116 1 4
1.50008 5 256 268 Full Delay 0.076116 1 4
34 Algorithms in Advanced Artificial Intelligence

Observation Predicted Residuals Standard


Time Residuals
1 1.474912078 –0.004832078 –0.877067537
2 1.47758 –0.0075 –1.361320473
3 1.474912078 0.000167922 0.030479445
4 1.47758 –0.0025 –0.453773491
5 1.474912078 0.005167922 0.938026427
6 1.47758 0.0025 0.453773491
7 1.490905242 –0.005825242 –1.0573 36234
8 1.47758 0.0075 1.361320473
9 1.492419262 –0.002339262 –0.424598032
10 1.492419262 0.007660738 1.390495932

Regression Statistics
Multiple (R) 0.809161445
2
R 0.654742243
2
Adjusted R 0.482113365
Error 0.006742324
Samples 10

Fig. 5.3 Results


Verifiable Secure Vehicle Connectivity Using Machine Learning Framework for Internet of Vehicles 35

In the Internet of Vehicles, multiple regression shows that the timely assignment (Y) of intelligent automobiles depends not on a single factor but on several, such as the type of delay, Full or Last (x1), the ReTX Count (x2), and the Hop Count (x3). In multiple regression the dependent variable is a function of several independent factors, Y = f(x1, x2, x3, …, xk). ReTX Count and Hop Count are used to assess the delay time in milliseconds when an IoV transmission is completed within a given time frame.
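As an illustration, the least-squares fit Y = f(x1, x2, x3) can be reproduced with NumPy. The sketch below is illustrative only; the arrays are transcribed from Table 5.1 (delay type coded 0 for Last and 1 for Full), and the variable names are not from the chapter's code:

```python
import numpy as np

# Columns transcribed from Table 5.1
x1 = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])      # delay type (x1): Last=0, Full=1
x2 = np.array([1, 2, 1, 2, 1, 2, 1, 2, 1, 1])      # ReTX Count (x2)
x3 = np.array([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])      # Hop Count (x3), constant here
y  = np.array([1.47008, 1.47008, 1.47508, 1.47508, 1.48008,
               1.48008, 1.48508, 1.48508, 1.49008, 1.50008])  # time (Y)

# Design matrix with an intercept column: Y = b0 + b1*x1 + b2*x2 + b3*x3.
# Hop Count is constant in this sample, so its coefficient is not identifiable;
# lstsq still returns the minimum-norm least-squares solution.
X = np.column_stack([np.ones_like(y), x1, x2, x3])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ beta
residuals = y - pred
r2 = 1 - residuals.var() / y.var()
print("coefficients:", beta, "R^2:", r2)
```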
6. Conclusion and Future Perspective
In this research, multiple regression analysis revealed that the timely assignment of intelligent cars to an RSU depends on factors like ReTX Count and Hop Count, which assess the delay time in milliseconds when an IoV transmission is completed within a given time frame. IoV technology enables real-time communication between vehicles and roadside infrastructure through infotainment systems, sensors, and GPS.

References
1. Ahmadian, Amir Shayan; Peldszus, Sven; Ramadan, Qusai; Jürjens: Model-based privacy and security analysis with CARiSMA, Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pp. 989–993, doi: 10.1145/3106237.3122823, ISBN 9781450351058, S2CID: 28115555, 2017.
2. Abbas, S., Ahmed, A., Khan, F., Ahmad, S., & Do-Hyeun, K. (2021). Blockchain-Based Authentication in Internet of Vehicles: A Survey. Sensors, 21(23), 7927.
3. Lanka, D., Kandasamy, S.: An Unsupervised Traffic Modeling Framework in IoV Using Orchestration of Road Slicing. In Revolutionizing Industrial Automation through the Convergence of Artificial Intelligence and the Internet of Things, 2023 (pp. 201–212). IGI Global.
4. Gerla, M.; Lee, E.; Pau, G.; Lee, U.: Internet of vehicles: From intelligent grid to autonomous cars and vehicular clouds, 2014 IEEE World Forum on Internet of Things (WF-IoT), pp. 241–246, doi: 10.1109/WF-IoT.2014.6803166, ISBN 978-1-4799-3459-1, S2CID 206866025, 2014.
5. Hamid, Umar Zakir Abdul; et al.: Internet of Vehicle (IoV) Applications in Expediting the Implementation of Smart Highway of Autonomous Vehicle: A Survey, Performability in Internet of Things, EAI/Springer Innovations in Communication and Computing: 137–157, doi: 10.1007/978-3-319-93557-7_9, ISBN 978-3-319-93556-0, S2CID 69362954, Retrieved 14 January 2022.
6. Khelifi, Adel; Abu Talib, Manar; Nouichi, Douae; Eltawil, Mohamed Salah: Toward an Efficient Deployment of Open Source Software in the Internet of Vehicles Field, Arabian Journal for Science and Engineering, 44 (2019): 8939–8961, doi: 10.1007/s13369-019-03870-2, S2CID 164632020, Retrieved 27 December 2020.
7. Lee, Eun-Kyu; Gerla, Mario; Pau, Giovanni; Lee, Uichin; Lim, Jae-Han: Internet of Vehicles: From intelligent grid to autonomous cars and vehicular fogs, International Journal of Distributed Sensor Networks, 12 (9): 155014771666550, doi: 10.1177/1550147716665500, 2016.
8. Maglaras, Leandros; Al-Bayatti, Ali; He, Ying; Wagner, Isabel; Janicke, Helge: Social Internet of Vehicles for Smart Cities, Journal of Sensor and Actuator Networks, 5 (1): 3, doi: 10.3390/jsan5010003, 2016.
9. Reddy Navya, Ramisetty Upendra: "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms", Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3, 2023.
10. Nahri, Mohamed; Boulmakoul, Azedine; Karim, Lamia; Lbath, Ahmed: IoV distributed architecture for real-time traffic data analytics, Procedia Computer Science, 130: 480–487, doi: 10.1016/j.procs.2018.04.055, 2018.
11. Sakiz, Fatih; Sen, Sevil: A survey of attacks and detection mechanisms on intelligent transportation systems: VANETs and IoV, Ad Hoc Networks, 61: 33–50, doi: 10.1016/j.adhoc.2017.03.006, 2017.

Note: All the figures and table in this chapter were designed by the author.

6. Disease Detection in Dental Patients Using Machine Learning Algorithms Through Image Analysis

Khadar Alisha Sheik1
Associate Professor, Department of MCA, B V Raju College, Vishnupur, Bhimavaram, W. G. Dt., Andhra Pradesh, India
V. Kiran Kumar2
Professor, Department of CST, Dravidian University, Kuppam, Chittoor Dt., Andhra Pradesh, India

Abstract: The teeth are the hardest substance inside the human body. Current approaches for diagnosing dental issues suffer from intricate operating procedures, low efficiency, and a high level of user involvement. Older methods of oral disease detection were laborious, manual, and required a dentist to examine and assess the illness. To allay these worries, we suggest a unique method for identifying and categorizing dental caries, the most prevalent issue with teeth. Cavities come in three varieties: root cavities form on the surface over the roots of your teeth, smooth surface cavities form on the smooth sides of your teeth, and pit and fissure cavities form on the chewing surface of your teeth. In order to identify these disorders, we gathered information from the Vishnu Dental Hospital in Bhimavaram and subsequently created a dataset of Dental Intra Oral Periapical Radiograph (IOPA) pictures. We employed the YOLO (You Only Look Once) version 3 deep learning model to develop an automatic system that can recognize and classify dental abnormalities, such as various cavity problems, from IOPA images. Last but not least, the technology for automatically detecting and classifying dental problems will help with early illness detection and may stop tooth loss.
Keywords: Dental caries, Deep learning, IOPA, Tooth, YOLOV3, Dentistry, Annotation, Augmentation

1. Introduction
A relatively new area of dentistry called dental informatics supports and enhances the diagnostic processes used in dental practices and decreases time and stress in people's daily lives [1]. Restorative dentistry, endodontics, orthodontics, dental surgery, and periodontology are the primary branches of dentistry. Any dental operation that restores or replaces a tooth is referred to as restorative dentistry; one example of a restorative procedure is dental work including root canals. The area of dentistry known as endodontics deals with the pulp of the teeth and the tissues that surround their roots. The area of dentistry known as orthodontics deals with dental irregularities and methods to treat them. Dental surgery includes various broad medical procedures that incorporate deliberate alteration of the dentition, for example, operations on gums, jawbones, and teeth.
Dentistry's branch of periodontology treats conditions affecting the alveolar bone, gums, and other supporting and enclosing tissues of the teeth, including cementum and periodontal ligaments [2]. Cavities, which result in places with irreversible damage to the tooth's hard surface that manifest as tiny holes or gaps, are the most prevalent dental disease.

1khadar6@gmail.com, 2kirankumar.v@rediffmail.com

DOI: 10.1201/9781003529231-6

Diagnosis of dental issues is now the responsibility of


dentists. They examine the teeth and gently move them to
look for probable dental issues. The automatic detection
of dental issues has not made much progress. Manual
study of tooth issues is necessary for disease classification
and identification, and it takes time and expertise. Human
mistakes can cause manual analysis to produce inaccurate
predictions. The computerized approach for identifying and
categorizing dental issues will help in the early diagnosis of
diseases and could stop tooth loss. It will help to do away with
labor-intensive, time-consuming manual clinical evaluation.
Medical imaging techniques like CT scans and X-rays have
historically been very helpful in the treatment and diagnosis
of a wide range of disorders [2].
A radiographic X-ray generator creates X-rays that pass through the mouth while tissues absorb radiation. The projective-radiography technique creates 2D images of the internal anatomy of the human body [3]. It has become worthwhile to pair computer specialists with dentists because the introduction of sensor pictures from high-resolution biosensors has produced enormous data that can be analyzed using software programs to help dentists in making diagnosis decisions [4].
In our proposed methodology, we present a deep learning-based approach to assist dentists in accurately recognizing dental abnormalities in patients utilizing IOPA images.
The suggested strategy for oral health care can be implemented in the clinic to help find dental issues. It is a reliable, effective, and cost-effective solution that will considerably improve oral healthcare. For the classification and detection of diseases, manual investigation of dental issues demands time and competence. Additionally, manual analysis runs the risk of making incorrect predictions as a result of human error or misunderstanding. However, the computerized approach for identifying and categorizing dental issues will facilitate early detection and could stop serious issues like tooth loss. Additionally, it will help with the elimination of laborious, time-consuming manual examinations. Due to these factors, we suggest the YOLOv3 deep learning model, which we train and test using the gathered data set.

Fig. 6.1 Proposed architecture work flow for the teeth disease

Our proposed work's primary contributions are as follows:
• The dataset has to be cleaned up and improved as a first step. Vishnu Dental Hospital provided a small dataset that was used to identify dental diseases. We produce a special dataset for the investigation of dental diseases. For the processing of this domain's dataset, data labeling is done by a qualified dental surgeon. Different kinds of classes are present in the databases. A dataset of Intra Oral Periapical Radiograph (IOPA) images of various patients was gathered from secondary and tertiary care facilities throughout this phase.
• The augmentation process, which comprises numerous image variants. The last step in image annotation is to utilize the LabelImg application, which generates a .txt file with annotations for each image. Using the dataset from the second phase, the third iteration of the YOLO deep learning model was trained.

2. YOLO V3
This research presents the YOLO V3 deep neural network model for the classification of dental issues. The suggested method uses IOPA images to identify various oral issues. YOLO V3 is an object identification algorithm that recognizes particular objects in real time in movies, live streams, or pictures. It is capable of predicting several objects in a single image. The deep convolutional neural network characteristics learned by the YOLO machine learning system are used to recognize objects. It will predict each object only once, without repeating it. Versions 1-3 of YOLO were created by Ali Farhadi and Joseph Redmon.
The architecture of YOLO V3 is shown in Fig. 6.2. An image is fed into the model as a matrix of pixel values [5].

Fig. 6.2 The foundation for feature extraction in YOLO V3 is the Darknet-53 architecture

In the image, the convolution network looks for patterns. Based on similarities to previously learnt data, this model "scores" a region. High-scoring regions receive positive detections for the class they most closely resemble. This technique works by creating grids out of the image. Given a high score value, the grid cells predict the bounding boxes that will encircle the object. Each bounding box is assigned a confidence value that represents the accuracy of the forecast [6]. Only one object can be detected by each bounding box. In order to identify which shapes and sizes are the most comparable, the dimensions of a number of ground truth boxes obtained from the training data are combined to create the bounding box.
This method's foundation for feature extraction is what makes it viewed as quick. The backbone is known as Darknet-53. Two completely linked layers and 24 convolutional layers make up this structure. The twenty convolutional layers are joined by a pooling layer and a fully linked layer. This base was pre-trained on an ImageNet dataset. Three convolutional layers and one reduction layer make up the layers. The model is trained using four convolutional layers and two fully linked layers. Forecasting the likelihood of each class and the bounding box is done using the final layer. Each layer is activated using ReLU, and the top layer uses a linear activation.

3. The Proposed Methodology
A methodology has been put forth to address the issue of identifying dental caries. The dataset was made up of about 100 photos, of which about 80 served as training data and 20 served as testing data.
It is an efficient technique for object detection and classification that uses an object detection model. In accordance with this approach, a single neural network forecasts the bounding boxes and class probabilities for the image in a single evaluation. The complete detection pipeline is contained within a single network, enabling end-to-end adjustment of the detection performance metric. In contrast, methods that employ a pipeline execution architecture force the network to interact with each component independently; training takes longer as a result, and optimization is more difficult. The YOLO V3 approach uses a neural network to create an output vector containing bounding box coordinates and class probabilities from an input image. This method uses Darknet-53, which was trained using ImageNet, as depicted in Fig. 6.2. After 53 additional layers are added to the framework to do detection, a 106-layer network supports the design. The input image is divided by the YOLO V3 algorithm into M x N grid cells. Every grid

has the ability to find and recognize objects in the image. Using class labels and class probabilities, all grid cells may then predict the object's bounding box coordinates. Implementing a 1x1 kernel on a feature map made up of variously sized features at various points across the structure allows for detection. The dimensions of the detection kernel are 1x1x(B x (5 + C)). The letter B stands for the bounding box prediction capability of the feature map cells. Five bounding box features and one confidence item make up the number "5". Last but not least, "C" denotes the quantity of classes. Binary cross-entropy is utilized to quantify classification loss, while estimations of object probability and class probability are obtained using logistic regression. YOLO V3 converts an input picture into an output vector, as was already said. The following parameters make up the output vector:
1. Class probabilities: This shows the possibility that an object will be found inside the bounding box and that it belongs to a single class.
2. Values of bounding boxes: The Cartesian position, width, and height of the bounding boxes are all given.
3. Prediction probability: A probability is used to depict the various bounding boxes that contain a detectable object.
The following procedures were used to detect objects using the YOLO V3 model:
(a) Data collection: 100 dental pictures in all were collected, of which 25% were utilized for testing and 75% were used for training.
(b) Data labeling: Each and every image is labeled using the LabelImg tool. This also generates a ground truth box for every channel of the image. The whole dataset is produced as a text file at the conclusion of the process. The file contains data about the image id and bounding box coordinates.
(c) Feature extraction: The Darknet-53 framework was used to identify the key components of the photos and train the model. The training was expected to take 7 hours, and almost 2000 iterations were completed. Two files with the names "yolov3_training.weights" and "yolov3_test.cfg" were generated at the conclusion of the training. These files are the main element for carrying out real-time detection.
(d) Testing the object detector: Real-time object detection is carried out on the test images using the OpenCV library and the files produced during the feature extraction procedure.
(e) Anchor boxes: When multiple objects' midpoints land on the same grid cell, it can be difficult to locate some of them. As a remedy for this problem, each object in the same grid is connected to an anchor box. Two forecasts would appear in the same grid, for instance, if the two items are each linked to two anchor boxes. For the item, the IoU ratio is determined. The object will not be taken into consideration for detection if the result is less than a certain threshold, let's say 0.2.
(f) Non-maximum suppression: The final stage of YOLO v3 is to resolve the issue that occurs when many bounding boxes are found for the same object. The goal of non-maximum suppression is to select the best bounding box from those that frequently overlap. Each bounding box's IoU is calculated, and the outcome is then evaluated against the threshold. A bounding box with a lower-than-acceptable IoU is rejected. If all the bounding boxes have IoU ratios higher than the threshold, the bounding box with the highest IoU ratio is taken into account.
As done by the author Zhao et al. [7], performance measures have been used in order to evaluate how well the object detection model is doing.

1. Precision
The percentage of correctly identified objects to all detected objects is how precision is determined. It is given by

Precision = TP / (TP + FP)   (1)

FP stands for False Positive, whereas TP stands for True Positive.

2. Specificity
It displays the percentage of accurately determined true negatives, i.e., how many actual negatives are recognized as negative rather than being misclassified as false positives. A model with high specificity will successfully identify the undesirable consequences.

Specificity = TN / (TN + FP)   (2)

TN stands for True Negative.

3. Sensitivity
The sensitivity of the model refers to how effectively it can predict the positive test cases. By giving an idea of how many cases were correctly classified as positive, it evaluates the model's performance. Sensitivity boosts the accuracy of positive event forecasting.

Sensitivity = TP / (TP + FN)   (3)

In this case, FN stands for False Negative.

4. Accuracy
This is the percentage of items out of the complete collection of items that have precise labels.

Accuracy = (TP + TN) / (TP + TN + FP + FN)   (4)

5. IoU
It is defined as the overlap between the actual ground truth box and the anticipated bounding box. Fig. 6.3 illustrates the performance metric IoU from Equation (5). The bounding boxes of the prediction and the ground truth are denoted here by the letters A and B, respectively.

IoU = (A ∩ B) / (A ∪ B)   (5)

Fig. 6.3 Intersection over union ratio
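Equations (1)-(5) translate directly into a few lines of Python. The following is a small illustrative sketch; the function names and the corner-based box format are assumptions, not code from the chapter:

```python
def precision(tp, fp):            # Eq. (1)
    return tp / (tp + fp)

def specificity(tn, fp):          # Eq. (2)
    return tn / (tn + fp)

def sensitivity(tp, fn):          # Eq. (3)
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):     # Eq. (4)
    return (tp + tn) / (tp + tn + fp + fn)

def iou(box_a, box_b):            # Eq. (5), boxes given as (x1, y1, x2, y2)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # intersection area A ∩ B
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)        # union area A ∪ B in denominator
```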

Table 6.1 Outcomes of object detection using the YOLOV3 algorithm
Model    Precision  Specificity  Sensitivity  Accuracy
YOLOV3   74%        72%          76%          75%

Fig. 6.4 Displaying the model as a bar graph in relation to the four performance metrics

4. Implementation
Steps for YOLOv3 custom object detection using IOPA images:

Step 1: Prepare dataset
(a) The dataset is created with IOPA images on which you want to perform detection.
(b) The collection is cleaned by deleting undesirable or pointless pictures. Additionally, ensure that all of the photographs are in the .jpg format.

Fig. 6.5 Sample dataset

Step 2: Data Annotation
Each and every image is labelled using the LabelImg tool, which also creates an annotated text file.

Fig. 6.6 LabelImg tool

The list of all the classes that we have annotated in our dataset is generated in a file called classes.txt. Each annotated image file has a corresponding .txt file that contains the metadata.

Fig. 6.7 Classes.txt
Fig. 6.8 3.txt (metadata)

The metadata has the following form: [obj_id cen_x cen_y width height]
obj_id is the identifier for the object category that was previously listed in "classes.txt".
The center of the bounding box is represented by cen_x and cen_y. They are normalized by dividing by the width and height of the image so that they fall between 0 and 1.
The bounding box's width and height are represented by the variables width and height. They are likewise divided by the image's original width and height so that they fall in the range of 0 to 1.
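This normalization can be captured in a short helper. The sketch below assumes pixel-space corner coordinates as input, and the function name is illustrative, not from the chapter:

```python
def to_yolo_label(obj_id, x1, y1, x2, y2, img_w, img_h):
    """Convert pixel box corners to a [obj_id cen_x cen_y width height] line."""
    cen_x = ((x1 + x2) / 2) / img_w      # box centre, normalised to 0..1
    cen_y = ((y1 + y2) / 2) / img_h
    width = (x2 - x1) / img_w            # box size, normalised to 0..1
    height = (y2 - y1) / img_h
    return f"{obj_id} {cen_x:.6f} {cen_y:.6f} {width:.6f} {height:.6f}"
```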
Step 3: Training phase
After labeling the entire dataset, we proceed to the model's actual training phase.
(a) Datasets were uploaded to Google Drive.
(i) We created a ZIP file called pictures.zip; it includes our dataset's whole collection of *.jpg images, *.txt annotations, and *.txt classes files.
(ii) We launched Google Drive after logging into our Google accounts. Within it, we uploaded the pictures.zip archive to a new folder named "yolov3".
(b) Establishing Google Colab. We use Google Colab for our model training because it offers free GPU access and an environment that makes it simple to install all the necessary requirements.
(i) Clone the model on our local machine by using the GitHub repository.
(ii) The required Python object detection file should be added after starting Google Colab.
(iii) Mount Google Drive on Google Colab.
(iv) Clone, configure and compile Darknet.
Fig. 6.9 Clone Darknet, set it up, and put it together
(v) Set up the yolov3.cfg file. This cell copies yolov3.cfg under the name yolov3_training.cfg. If there are 'N' classes in our custom object recognition model, then max_batches is equal to 2000 * 'N' and filters is equal to (N + 5) * 3.
Fig. 6.10 Configure yolov3.cfg file
(vi) The obj.names and obj.data files were produced. We created the obj.names and obj.data files; these files provide metadata, such as the titles of the classes and the number of classes needed for training.
Fig. 6.11 Creating obj.names and obj.data
(vii) Save the obj.names and yolov3_training.cfg files to our Google Drive.
(viii) Decompress the image dataset.
(ix) Create the file train.txt. The code produces a train.txt file that lists all of the *.jpg files stored inside the darknet/data/obj directory, along with their locations.
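A file of this form can be generated with a few lines of Python. The following sketch assumes the darknet/data/obj layout mentioned above and is illustrative rather than the chapter's exact cell:

```python
import glob

# List every *.jpg under darknet/data/obj and write one path per line
paths = sorted(glob.glob("darknet/data/obj/*.jpg"))
with open("darknet/data/train.txt", "w") as f:
    f.write("\n".join(paths))
```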

Fig. 6.12 Create train.txt file

In other words, during training, photos will be retrieved from the location indicated in this file.
(x) Download the pre-trained weights for the convolutional layers. Transfer learning is the process of adding our own layers to a model that has already been trained. The pre-trained weights are obtained from darknet53.conv.74. Therefore, rather than using weights that were randomly initialized, our own model will be trained using these previously taught weights, which saves a significant amount of time and computation.
Fig. 6.13 Download pre-trained weights
(xi) We are now prepared to begin training our model.
Fig. 6.14 Training our model

Depending on the size of the dataset and the number of classes, the model will take some time to train. While the model is working, go grab a coffee or go for a stroll. Depending on the size of your dataset and the number of classes, you may estimate the approximate time required for training your own custom model; for example, training 2 classes at a time with a training sample size of 100 images should take about 6 hours.

Step 4: Model testing
Once the model is fully trained, depending on the size of the model, at least three files will be downloaded into the yolov3 folder on our Google Drive. The illustration below demonstrates this.

Fig. 6.15 Showing files in the YOLOv3 folder

(i) Download the YOLOv3_Custom_Object_Detection repository and save the files yolov3_training_last.weights, classes.txt, and yolov3_testing.cfg there.
(ii) In the YOLOv3_Custom_Object_Detection repository, make a new folder called test_images and add some images inside that you want to test the model on.
(iii) The following algorithm tests an image given by the user and creates an output image with the detected tooth and cavity regions.
The algorithm can be broken down into several steps:
1. Import libraries such as "numpy" (as "np") for numerical operations and the "cv2" (OpenCV) library for computer vision operations.
2. Open the picture and obtain its dimensions:
• Use the 'cv2.imread()' function to load an image.
• Using the image's '.shape' property, determine the loaded image's height and width.
3. Prepare the image for the neural network:
• Use the function 'cv2.dnn.blobFromImage()' to create a blob (binary large object) from the image. By resizing, scaling, and subtracting the mean values from the image, this function gets the image ready for input to the neural network.
4. Give the neural network its input and get the output:
• Using 'net.setInput()', set the preprocessed blob as the neural network's input.
• Use 'net.getUnconnectedOutLayersNames()' to get the names of the output layers.
• Use 'net.forward()' to forward propagate the input across the network and obtain predictions in 'layerOutputs'.
5. Analyze the predictions. For each output detection:
• Extract the class scores and locate the top-scoring class ('class_id').
• Verify whether the discovered class's confidence level exceeds a predetermined cutoff point (0.2 in this example).
• Calculate the bounding box coordinates and dimensions based on the network output if the confidence is higher than the threshold.
• Save the class ID, confidence, and bounding box coordinates.
6. Use non-maximum suppression: Use 'cv2.dnn.NMSBoxes()' to apply non-maximum suppression (NMS), which removes duplicate and overlapping boxes.
• After NMS, obtain the selected boxes' indices.
7. Draw bounding boxes and labels by iterating through the chosen NMS indexes:
• Get the color, class label, confidence, and bounding box coordinates.
• Using 'cv2.rectangle()', draw a rectangle around the object, and using 'cv2.putText()', add the class label and confidence.
8. Display the image:
• Display the image with bounding boxes and labels using the function 'cv2_imshow()', which is not part of OpenCV by default and may need to be changed depending on your environment.
9. Do some cleanup:
• Use the cv2.destroyAllWindows() function to close the image display window.
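Steps 1-9 correspond to a short OpenCV script. The following condensed sketch is illustrative rather than the chapter's exact code; it assumes the file names produced during training (yolov3_training_last.weights, yolov3_testing.cfg, classes.txt), a hypothetical test image path, and the 0.2 confidence cutoff mentioned in the text:

```python
import numpy as np
import cv2

# Load the trained network and class names (files produced during training)
net = cv2.dnn.readNet("yolov3_training_last.weights", "yolov3_testing.cfg")
classes = open("classes.txt").read().splitlines()

img = cv2.imread("test_images/sample.jpg")           # step 2: load the test image
h, w = img.shape[:2]

# Steps 3-4: build the input blob and run a forward pass
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
layer_outputs = net.forward(net.getUnconnectedOutLayersNames())

boxes, confidences, class_ids = [], [], []
for output in layer_outputs:                          # step 5: parse detections
    for det in output:
        scores = det[5:]
        class_id = int(np.argmax(scores))
        conf = float(scores[class_id])
        if conf > 0.2:                                # confidence cutoff from the text
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(conf)
            class_ids.append(class_id)

idxs = cv2.dnn.NMSBoxes(boxes, confidences, 0.2, 0.4)  # step 6: NMS
for i in np.array(idxs).flatten():                     # step 7: draw boxes and labels
    x, y, bw, bh = boxes[i]
    label = f"{classes[class_ids[i]]} {confidences[i]:.2f}"
    cv2.rectangle(img, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
    cv2.putText(img, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

cv2.imwrite("result.jpg", img)   # step 8 alternative: write the annotated image to disk
```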
4. Result

Fig. 6.16 Dental cavities detection using YOLOV3

5. Conclusion
By looking at the teeth and gently moving them, a dentist can identify potential dental issues. IOPA photos can be used to automatically classify dental diseases, which can help doctors make precise diagnoses. Such tooth issues are found using panoramic dental radiography. We propose a novel method based on the deep learning model YOLOv3 for detecting and classifying the most typical tooth problems, namely cavities, in order to address the low efficiency, the complexity of the experiential operation, and the high level of user intervention in existing methods of tooth problem detection. Due to the dearth of easily accessible annotated medical information, many automated systems for identifying and categorizing dental problems face significant obstacles. Deep learning is employed in this work to create an automated method that can identify and classify dental abnormalities on IOPA pictures. The collection comprises panoramic dental pictures from several clinics that have dental issues, like cavities. The proposed method has numerous applications for computer-aided dental treatment and diagnostics and performs better in terms of accuracy than current state-of-the-art methods. After training, the YOLOv3 model was evaluated using test images, where it performed with the highest degree of accuracy [8]. A real-time methodology for finding dental anomalies is thus proposed.

References
1. Oprea, S.; Marinescu, C.; Lita, I.; Jurianu, M.; Visan, D.A.; Cioc, I.B.: Image processing techniques used for dental X-ray image analysis. In Proceedings of the 2008 31st International Spring Seminar on Electronics Technology, Budapest, Hungary, 7–11 May 2008, pp. 125–129. Ossowska, A.; Kusiak, A.; Świetlik, D.: Artificial Intelligence in Dentistry—Narrative Review. Int. J. Environ. Res. Public Health, 2022, 19, 3449.
2. Yu, Y.J.: Machine learning for dental image analysis. arXiv 2016, arXiv:1611.09958.
3. Tuzoff, D.V.; Tuzova, L.N.; Bornstein, M.M.; Krasnov, A.S.; Kharchenko, M.A.; Nikolenko, S.I.; Sveshnikov, M.M.; Bednenko, G.B.: Tooth detection and numbering in panoramic radiographs using convolutional neural networks. Dentomaxillofac. Radiol. 2019, 48, 20180051.
4. Gavrilescu, R.; Zet, C.; Foșalău, C.; Skoczylas, M.; Cotovanu, D.: Faster R-CNN: An approach to real-time object detection. In 2018 International Conference and Exposition on Electrical and Power Engineering (EPE), pp. 0165–0168. IEEE, October 2018.
5. Reddy Navya, Ramisetty Upendra: "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms", Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3, 2023.
6. Liu, C.; Tao, Y.; Liang, J.; Li, K.; Chen, Y.: Object detection based on YOLO network. In 2018 IEEE 4th Information Technology and Mechatronics Engineering Conference (ITOEC), pp. 799–803. IEEE, December 2018.
7. Zhao, L.; Li, S.: Object detection algorithm based on improved YOLOv3. Electronics, 9(3), 537, 2020.
8. Thanh, M. T. G.; Van Toan, N.; Ngoc, V. T. N.; Tra, N. T.; Giap, C. N.; Nguyen, D. M.: Deep Learning Application in Dental Caries Detection Using Intraoral Photos Taken by Smartphones. Applied Sciences, 12(11), 5504, 2022.

Note: All the figures and tables in this chapter were designed by the author.

7. Early Disease Diagnosis in Tomato Crops Using AI-Based Deep CNN

T. V. K. P. Prasad1, V Dilip Kumar2, T. Srinivasa Rao3


Department of CSE, S R K R Engineering College,
Bhimavaram, India
Gude Sujatha
Department of CSE, Shri Vishnu Engineering College for Women,
Bhimavaram, India
T. K. Priyanka
Department of CSE, S R K R Engineering College,
Bhimavaram, India.

Abstract: India's main business is agriculture. Agriculture influences the style of life in rural areas by about 60%. One of the popular food crops in India is the tomato. Because tomato plants are susceptible to disease, disease detection becomes crucial: if proper maintenance is not given, the plant's productivity declines. AI systems utilize effective image-processing algorithms, but they encounter challenges such as noise, occlusion, articulation, and scene interpretation. This paper suggests that AI-based computer vision and machine vision are emerging technologies that can effectively address these issues using various algorithms and methods. Since the diseases first harm the leaves, the majority of tomato plant illnesses can be found in their early stages, and there is always a chance that a leaf disease can be detected early enough to prevent impending loss. In order to identify diseases, this work uses AI-based computer vision and machine vision algorithms. The separation of damaged areas on leaves is done using disease image-processing technology, which is used to precisely diagnose illnesses. Computer vision information helps with sickness symptoms and cures in the experimental result.
Keywords: Artificial Intelligence, Computer Vision and Machine Vision, Early disease detection, etc.

1. Introduction
Fungi in the soil can be a major problem for tomatoes (Zhang, S.W. et al., 2015 [14]). Three stages are required to comprehend and manage tomato infections in a home garden (Sagar Vetal, 2017 [8]). The first step is to comprehend the normal fungal disease cycle; the second is knowing how to spot serious fungal infections in tomatoes; and the third is employing sensible cultural practices to limit the damage that these diseases can cause. To put it simply, fungi eat and grow on polluted host tissue. Spores, which are small, microscopic objects that fungi use to spread, are transported to new hosts by wind, water, or other mechanical means. According to Rangarajan et al., 2018 [12], spores that reach healthy plant tissue on the host germinate and produce diseases such as leaf spots, rots, and wilts, which result in early defoliation and decreased tomato yields. Temperature, relative humidity, free moisture, and rainfall all have an impact on the growth and spread of fungi in a home garden. The main fungal diseases that affect tomatoes grown in backyard gardens include Buckeye rot, Anthracnose fruit rot, early blight, Septoria

Corresponding authors: 1tvkpprasad@gmail.com, 2dilipv510@gmail.com, 3srinu.tottempudi@gmail.com

DOI: 10.1201/9781003529231-7

leaf spot, and Late blight. Amateur gardeners can recognize each of these ailments right away because each exhibits a unique set of symptoms. AI is already able to create high-quality images in a matter of seconds; it is possible that one day it may be able to create hour-long videos in the same manner. If you don't look too closely, artificial intelligence is capable of creating some quite realistic images, although the technological and anatomical accuracy of AI seems to be a persistent problem. With computer vision, machines can now comprehend images more effectively. The technology of employing sensors to understand and interpret what they observe while interacting with objects digitally is the focus of this work. It covers a wide range of topics and has uses in agricultural crops, such as machine translation and pattern recognition for spotting tomato diseases early on. Machine learning (ML) is one of the most widely used AI techniques for many organizations and individuals interested in automation (Sengan et al., 2021 [10]). This is due to practitioners' ability to achieve noteworthy results in a range of domains because of major improvements in data access and processing power. ML systems are now able to assess photos in a manner similar to how our brains process visual information. They are utilized almost everywhere, including smart technologies, MRI sickness diagnosis, and everything in between. The fundamentals of machine learning for image processing are discussed, along with some of the tools we might employ to develop cutting-edge algorithms for image data. ML algorithms need a lot of high-quality data in order to learn and make extremely accurate predictions. Because of this, we need to make sure the pictures are appropriately cropped, labeled, and ready for ML image processing. This is where Computer Vision (CV), a discipline focusing on how well computers can comprehend image data, comes into play. To produce the ideal dataset for the machine-learning algorithm, we can use CV to process, load, transform, and perform other operations on pictures. In this paper, we propose that a computer's perception of an input image as an array of pixels depends on the image resolution. The image resolution determines how the height, width, and depth are represented: for instance, a 6 × 6 × 3 matrix array for an RGB picture and a 4 × 4 × 1 matrix for a grayscale image. In order to find and develop a machine-learning approach to categorize fresh feature vectors, features (the processed image data) are first compared against a huge library of feature vectors with existing classifications.

Fig. 7.1 Image data processing workflow using machine learning

The metric learning problem can be characterized as an optimization problem: the purpose is to identify the parameters of a distance function that maximizes a specific objective function assessing agreement with the training data. We routinely utilize general-purpose metrics, but they typically fall short of accurately explaining the behavior of image data, which directly impacts the effectiveness of the learning mechanism. The answer to this issue is to modify the measure in light of the situation and the data. However, doing so manually is very unworkable; therefore, metric learning is used to satisfy the data geometry. The database elements that are semantically connected to a query element can be found using the learnt metric. In a supervised situation, metric learning can be thought of as a means to reduce the data dimension. More generally, the created picture data can be embedded into a new space using the learnt transformation and then provided to another machine-learning algorithm (Sengan et al., 2021 [10]). The following sections make up the remaining text of this paper. Section 1 covers the condition's primary symptom, which is represented by a disease image. Sections 2 and 3 deal with the Naive Bayes and CNN classification models for trained metric learning. Section 4 provides the process for DeepCNN-based disease detection in tomato and leaf for the early identification of the disease. Section 5 presents experiments and findings. In Section 6, the paper's conclusion and outlook are offered.

Early Prediction Analysis of Tomato Crop: Tomato fruit and foliage can become infected by early blight. Early blight on tomato foliage first manifests as round, erratic, black, or brown spots on the plant's older leaves (Naresh, 2020 [7]). The centers of these lesions grow into a cluster of dark concentric rings as they spread, giving them an identifiable target pattern. Early blight lesions may gradually develop yellow tissue, which might eventually cause the leaves to die. When this disease has badly damaged a plant, it may lose all of its leaves. When fruit is in the juvenile green or red stage, early blight can infect it through the calyx or stem attachment and cause distinct target-like lesions that resemble foliar infections. Early blight defoliation can lower fruit output and increase the danger of sunscald damage to the fruit.

Tomato Crop Naive Bayes Classification: One of the most basic supervised machine learning algorithms is Naive Bayes.

It is a Bayes Theorem-based classification approach. It has been utilized in text classification for high-dimensional training datasets (Aravind et al., 2018 [1]). All of the input features in the training dataset are assumed to be independent of each other; i.e., Naive Bayes does not correlate them. The presence of one feature has no bearing on the presence of another. Features like shape and color, for example, all contribute to identifying a leaf and tomato vegetable in plant agriculture (Basavaiah et al., 2020 [2]). Because this assumption is false in most real-life situations, it is referred to as naïve.
Let us first go over the concept of conditional probability and the Bayes theorem before going on to Naive Bayes. P(A given B) or P(A|B) are examples of conditional probabilities, which are the probabilities of one event based on the existence of another. Bayes' formula is:

P(C|X) = P(X|C) P(C) / P(X)

Bernoulli Naïve Bayes: When input features are only available in binary form, this method is employed. It takes into account the Bernoulli-distributed random variable X:

P(X) = p if X = 1, and q if X = 0, where q = 1 − p and 0 < p < 1.
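Under these assumptions the model reduces to a product of per-feature Bernoulli probabilities. The following is a minimal scikit-learn sketch; the binary features and labels are hypothetical illustrations, not from the paper's dataset:

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Hypothetical binary features per leaf image: [dark_spot, yellow_tissue, ring_pattern]
X = np.array([[1, 1, 1], [1, 0, 1], [0, 0, 0], [0, 1, 0], [1, 1, 0]])
y = np.array([1, 1, 0, 0, 1])          # 1 = early blight, 0 = healthy

clf = BernoulliNB().fit(X, y)
print(clf.predict([[1, 0, 1]]))        # predicted class for a new binary feature vector
print(clf.predict_proba([[1, 0, 1]]))  # posterior P(C|X) from Bayes' rule
```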
Using CNN, the Tomato and Leaf Image Classification Algorithm:
Input: Array of pixel values [height, width, and channel]
Feature Extraction:
1. To obtain a feature map, use a convolution neural network.
(a) Convolution (ReLU).
(i) Choose a kernel of size 5x5 with the same depth as the input array.
(ii) Use convolution to obtain the tomato and leaf picture features.
(b) Pooling (Max Pooling).
(i) Using this dimensionality reduction technique, reduce the spatial size of the feature map with a 2x2 window and extract the dominating feature.
2. Continue the method described above until the fourth layer, changing the channel size to 16, 32, 64, or 128 to extract low-level features from the tomato and leaf image.
Classification:
A feed-forward neural network with back propagation is provided the flattened output during each training iteration. By identifying the main features in the photos and classifying them using the SoftMax classification approach, a trained model is utilized to categorize images (Bedi, 2021 [3]).

Table 7.1 Comparison of the RGB, YCbCr, CNN and Deep CNN algorithms
Algorithms  Accuracy  Sensitivity  Specificity  Precision
RGB         0.75      0.79         0.79         0.80
YCbCr       0.89      0.88         0.85         0.84
CNN         0.90      0.90         0.87         0.85
Deep CNN    0.98      0.98         0.96         0.96

Method for Detecting Disease in Tomato and Leaf Using DeepCNN: There are four basic steps to the classifier model's structure. Getting the dataset from the rural Andhra Pradesh villages is the first stage. The disease images from the entire dataset are resized in the second stage before being split and classified in the third and fourth stages, respectively, using a deep learning convolution neural network with many layers, including the input layer, convolution layer, batch normalization layer, activation function layer, max pooling layer, fully connected layer, softmax layer, and classification layer.
One of the well-known uses of computer vision is early disease detection on tomato crops. Finding the leaf and tomato disease in an image by comparing it to an existing database is the task. To learn the characteristics of disease photos and identify them (Zhang X., 2018 [15]), we can utilize deep learning techniques. The first part of the multiple-stage process involves finding one or more items in the input image, which comes after the early disease diagnosis of the crop (Gnanavel, 2020 [4]). What follows, the practice of standardizing an input object so that it is geometrically compatible with the database, is known as object alignment. Feature extraction is the name of the final strategy: characteristics that can be used in subsequent tasks are extracted. The process of feature detection is completed by a database comparison of the provided features.

Fig. 7.2 Process of leaf and tomato disease detection
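The classification algorithm above maps onto a compact convolutional network. The following Keras sketch mirrors the described layout (5x5 kernels, 2x2 max pooling, channel sizes 16/32/64/128, and a SoftMax head); the input size and the number of classes are assumptions, not values from the paper:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(input_shape=(128, 128, 3), num_classes=2):
    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    # Four convolution + max-pooling blocks with growing channel sizes
    for channels in (16, 32, 64, 128):
        model.add(layers.Conv2D(channels, kernel_size=5, padding="same",
                                activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=2))
    model.add(layers.Flatten())
    # Feed-forward classifier head with a SoftMax output
    model.add(layers.Dense(64, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model()
model.summary()
```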

A deep learning model that offers unified embeddings for the purposes of object recognition, verification, and classification is used to diagnose disease on tomato and leaf objects (Salih T.A., 2020 [9]). The network reduces the distance between images by mapping each input image in Euclidean space. The disease detection system is developed using the already-trained, installed sickness detection model (Rahul, 2023 [11]), and the illness object identification technology can then be tested. The creation of a Deep CNN-based network (Mishra, 2020 [6]) represents an important watershed in the use of deep learning for object detection in agriculture.

2. Experimental Results
The study evaluates the accuracy, sensitivity, specificity, precision, and F1-score of the RGB, YCbCr, CNN, and DeepCNN methods using accuracy metrics from Khan et al.'s 2020 research.
Accuracy: Accuracy is a measure of a model's performance across all classes. When every class is equally significant, it is beneficial. It is calculated by dividing the number of accurate predictions by the total number of predictions.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Fig. 7.3 Accuracy of different algorithms

Sensitivity: A model's sensitivity determines how well it can recognize positive instances. It is frequently referred to as the true positive rate or the recall rate. The ability of a model to predict true negatives in each attainable category is known as specificity; it is the percentage of correctly identified negative samples out of all negative samples. The precision and recall of a classifier are averaged to create the F1-score, a single metric. It is widely employed to assess how well two classifiers perform.
The aforementioned graphs display several accuracy measures for the methods RGB, YCbCr, CNN, and Deep CNN, comprising F1-score, sensitivity, specificity, accuracy, and precision. When compared to these algorithms, DeepCNN performed the best.

Fig. 7.4 Graph comparing the RGB, YCbCr, CNN and DeepCNN algorithms (bar chart titled "Comparative Methods of Leaf and Tomato Early Disease Detection"; legend: RGB, YCbCr, CNN, Deep CNN)

The model's 91% accuracy in classification, based on 91 correct predictions out of 100 samples, is insufficient for class-imbalanced data sets with significant differences between positive and negative labels:

Accuracy = (1 + 90) / (1 + 90 + 1 + 8) = 0.91

3. Conclusion
A variety of deep-learning approaches are employed for the diagnosis and characterization of tomato leaf disease. The proposed technique was more effective than the RGB, YCbCr, and CNN approaches at identifying diseases in tomato crops. Test accuracies for RGB, YCbCr, CNN, and the proposed Deep CNN model are respectively 75%, 89%, 90%, and 98%. Farmers do not actually need to consult plant scientists in order to solve their identification issues with plants. They may raise the quantity, quality, and profitability of their tomato crops by using this system to successfully treat ailments that affect tomato plants. In the future, we will update the model with a new crop. To further increase test accuracy, we will attempt to optimize the same prototype using the same data.

Acknowledgement
The authors gratefully acknowledge the students, staff, and authority of the Physics department for their cooperation in the research.

References
1. Aravind Krishnaswamy Rangarajan, Raja Purushothaman, Aniirudh Ramesh: Tomato crop disease classification using pre-trained deep learning algorithm, Procedia Computer Science, Volume 133, pp. 1040–1047, https://doi.org/10.1016/j.procs.2018.07.070, 2018.
2. Basavaiah J., Anthony A.A.: Tomato Leaf Disease Classification Using Multiple Feature Extraction Techniques, Wireless Personal Communication, 115: 633–651, DOI: 10.1007/s11277-020-07590-x, 2020.
3. Bedi, P.; Gole, P.: Plant disease detection using hybrid model based on convolutional autoencoder and convolutional neural network, Artificial Intelligence in Agriculture, 5, 90–101, 2021.
4. Gnanavel Sakkarvarthi, Godfrey Winster Sathianesan, Vetri Selvan Murugan, Avulapalli Jayaram Reddy, Prabhu Jayagopal, Mahmoud Elsisi: Detection and Classification of Tomato Crop Disease Using Convolutional Neural Network, Electronics 2022, 11(21), 3618, Received: 8 October 2022, Revised: 31 October 2022, Accepted: 3 November 2022, Published: 6 November 2022, https://doi.org/10.3390/electronics11213618.
5. Khan S., Narvekar M.: Novel fusion of color balancing and superpixel based approach for detection of tomato plant diseases in natural complex environment, Journal of King Saud University Computer and Information Sciences, DOI: 10.1016/j.jksuci.2020.09.006, 2020.
6. Mishra, S.; Sachan, R.; Rajpal, D.: Deep Convolutional Neural Network based Detection System for Real-time Corn Plant Disease Recognition, Procedia Computer Science, 167, 2003–2010, 2020.
7. Naresh K. Trivedi, Vinay Gautam, Abhineet Anand, Hani Moaiteq Aljahdali, Santos Gracia Villar, Divya Anand, Nitin Goyal, and Seifedine Kadry: Early Detection and Classification of Tomato Leaf Disease Using High-Performance Deep Neural Network, Sensors (Basel), 21(23): 7987, December 2021, PMCID: PMC8659659, PMID: 34883991, Published online: 30 November 2021, DOI: 10.3390/s21237987.
8. Sagar Vetal and Rupali Khule: Tomato Plant Disease Detection using Image Processing, IJARCCE 6(6): 293–297, DOI: 10.17148/IJARCCE.2017.6651, June 2017.
9. Salih T.A.: Deep Learning Convolution Neural Network to Detect and Classify Tomato Plant Leaf Diseases, Open Access Libr. Journal, 7: 12, DOI: 10.4236/oalib.1106296, 2020.
10. Sengan S., Sagar R.V., Ramesh R., Khalaf O.I., Dhanapal R.: The optimization of reconfigured real-time datasets for improving classification performance of machine learning algorithms, Math. Eng. Sci. Aerosp. (MESA) 12: 43–54, 2021.
11. Rahul Subhash Gaikwad and Sharanabasappa C. Gandage: Image Sentiment Classification Using Deep Convolutional Neural Network Models, Journal of Data Acquisition and Processing, ISSN: 1004-9037, DOI: 10.5281/zenodo.7923136, Vol. 38(3), pp. 1279–1300, 2023.
12. Rangarajan A.K., Purushothaman R., Ramesh A.: Tomato crop disease classification using pre-trained deep learning algorithm, Procedia Computer Science, 133: 1040–1047, DOI: 10.1016/j.procs.2018.07.070, 2018.
13. Yang Wu, Lihong Xu, Erik D. Goodman: Tomato Leaf Disease Identification and Detection Based on Deep Convolutional Neural Network, Intelligent Automation & Soft Computing, Received: 01 January 2021; Accepted: 26 February 2021, DOI: 10.32604/iasc.2021.016415.
14. Zhang S.W., Shang Y.J., Wang L.: Plant disease recognition based on plant leaf image, Journal Anim. Plant Science, 25: 42–45, 2015.
15. Zhang, X.; Qiao, Y.; Meng, F.; Fan, C.; Zhang, M.: Identification of Maize Leaf Diseases Using Improved Deep Convolutional Neural Networks, IEEE Access, 6, 30370–30377, 2018.

Note: All the figures and table in this chapter were designed by the author.
Moaiteq Aljahdali,Santos Gracia Villar,Divya Anand,Nitin author.

8. Improvement Over K-Means Algorithm Over Complex Data

D. D. D. Suribabu1
Research Scholar, Department of CSE, JNTUA College of Engineering,
Ananthapuram, Andhra Pradesh, India
T. Hitendra Sarma2
Department of Information Technology, VASAVI College of Engineering,
Hyderabad, Telangana, India
B. Eswara Reddy3
Professor, Department of CSE, JNTUA College of Engineering,
Ananthapuram, Andhra Pradesh, India

Abstract: The modern era has seen a significant increase in data, making it increasingly challenging for humans to comprehend and process information. That information might hold a lot of valuable and potential insights that remain hidden. Clustering is recognized as an essential element in data mining, particularly for the analysis of enormous amounts of data. There are several clustering techniques in the data mining literature, but among the many algorithms, the k-means approach and its adjusted variants have received a lot of interest in the field of large-scale data analysis. These days, neural network-based clustering, K-means variants, fuzzy C-means and probabilistic C-means clustering, collaborative filtering clustering, and their developments are all well-known approaches for clustering vast amounts of data. This survey study's main objective is to offer a venue for discussing the various clustering techniques applied to effective large-data clustering. This review examines over a dozen research publications on effective big data clustering methods, showcasing results using the WEKA and KNIME data mining programs. The paper critically reviews previous studies and explores the advantages of K-Means clustering for big data analytics in various research areas. This paper provides a comprehensive overview of the shortcomings and issues of several big data clustering strategies, aiming to assist students in their journey towards improved big data clustering.
Keywords: Big data clustering, WEKA, Data mining, Fuzzy C-means, K-means clustering, Collaborative filtering

1. Introduction
Until a few years ago, a single data collection had only hundreds of entries in it. Recent technological advancements have enabled the storage and processing of a billion objects in vast data sets. This category of data is known as "big data." Big data are complex data collections that are difficult to handle with traditional data processing techniques. Big data, commonly referred to as V3, is further divided into three categories:
1. Volume (a big amount of data);
2. Variety (a variety of data types); and
3. Velocity (a continuous accumulation of new data).
Big data refers to data whose volume, speed, or diversity surpasses the capacity of IT systems to store, analyze, and handle it. Two additional V's are introduced for big data analysis in the most recent surveys: together with veracity and value, these make up the five V's of big data. Big data is a
1suribabu.ddd@gmail.com; 2t.hitendrasarma@gmail.com; 3eswar.cse@jntua.ac.in, eswarcsejntu@gmail.com

DOI: 10.1201/9781003529231-8

Big data is a novel idea that offers the chance to view currently available data from an alternative perspective; it is not only plentiful but also abundant in knowledge [1].

"Big data" also refers to a total amount of data that is too large for a particular framework to process within acceptable time and memory limits. A wide range of industries that handle massive amounts of raw data, including retail, finance, e-commerce, healthcare, and other sectors, are interested in big data analytics. Even so, the process of creating and interpreting knowledge from such enormous data remains a major challenge for all advanced data mining strategies [2].

Clustering is the most effective way to extract information from enormous amounts of data and present it in a useful way. The main goal of the clustering approach is to gather the provided data into discrete groups of items based on their respective measurements, so that each group is homogeneous. The new challenges that come with working with large amounts of data make applying clustering techniques more complicated. The purpose of the present paper is to give an overview of effective clustering techniques for large-scale information management. This survey was conducted based on the framework, datasets, and various implementation technologies used for big data clustering, and an additional survey was conducted to address research gaps and concerns. On this basis, a more advanced and effective big data clustering technique was created. Section 1 gives a quick introduction to the study; Section 2 covers the literature on current big data clustering algorithms; Section 3 provides a brief overview of the analysis of various tools and frameworks; Section 4 discusses the experimental reports; and Section 5 concludes.

2. Related Work

The clustering problem is defined as follows: a data collection Y = {Y1, Y2, ..., Ym} is given, together with an integer value p. The clustering problem is to define a mapping

f: Y → {1, ..., p},

where each item Yl, l ∈ {1, ..., m}, is assigned to exactly one cluster Cj, j = 1, ..., p. A cluster Cj contains the items mapped to it:

Cj = {Yl | f(Yl) = j, l = 1, ..., m, Yl ∈ Y}.

A cluster's members are more similar to one another than they are to objects outside of it [6]. Cluster similarity is often measured using one of the Euclidean distances. Figure 8.1 presents the categorization of multiple big data clustering techniques: Leader K-means clustering, Fuzzy C-means clustering, Possibilistic C-means clustering, Progressive clustering, Self-Organizing K-means clustering, Collaborative filtering and optimization clustering, and many more are among the various clustering techniques that fall under this category.

Fig. 8.1 Displays the various big data clustering strategies

Let us now go over each clustering algorithm in depth using the following framework:

2.1 K-Means Clustering Algorithm

In the world of data mining, this is regarded as one of the best clustering techniques. To extract the clusters, we first fix the value of "K"; then K centres are designated. Each item is allocated to the cluster of its closest centre, and the centres are recomputed. These stages are repeated until the halting or convergence criterion is satisfied. The squared error, i.e., the mean difference between the cluster centers and the items assigned to the clusters, is the key factor in deciding this stopping criterion. The stopping criterion can also be determined by the number of iteration steps; depending on the anticipated "K" value, this number may occasionally be high. In preparing this work, a number of research publications that examined the effectiveness and significance of the K-Means clustering algorithm were reviewed; a few of these are summarized below.

Sreedhar C. et al. [3] presented a K-Means Hadoop MapReduce (KM-HMR) approach to cluster gigantic information successfully. The study presents two methods, such as KM-HMR, that concentrate on the MapReduce (MR) framework by employing K-Means. A third tactic, emphasizing the importance of cluster quality, was to try to decrease intra-cluster distances while increasing inter-cluster distances. Regarding execution time, the results of the suggested KM-HMR approaches have surpassed the effectiveness of existing clustering techniques.
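To make the procedure above concrete, the following is a minimal sketch of the K-Means loop in Python with NumPy; the synthetic data, the fixed K = 3, and the tolerance value are illustrative assumptions rather than settings taken from the surveyed papers.

```python
import numpy as np

def k_means(X, k, max_iter=100, tol=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    # Designate initial centres by sampling k distinct items.
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Allocate each item to its closest centre (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centre as the mean of the items assigned to it.
        new_centres = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centres[j] for j in range(k)])
        # Halt when the centres stop moving (convergence criterion).
        if np.linalg.norm(new_centres - centres) < tol:
            centres = new_centres
            break
        centres = new_centres
    return centres, labels

# Illustrative run on synthetic 2-D data with three well-separated groups.
X = np.vstack([np.random.randn(100, 2) + offset for offset in ([0, 0], [5, 5], [0, 5])])
centres, labels = k_means(X, k=3)
print(centres)
```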
2.2 Joint Filtering Technique

Here, we discuss a few authors who primarily addressed collaborative filtering.

Rong Hu et al. [4] developed the Clustering-based Collaborative Filtering (ClubCF) technique to encourage collaborative services. The true goal of this strategy is to place comparable services inside the same clusters. It is broken down into two primary stages: in the first, all of the data sets are broken down into discrete cluster pieces and made suitable for further processing; in the second, collaborative filtering is used to identify the resulting clusters according to the "k" value. The global complexity of CF decreases distinctly, since there are significantly fewer services in a cluster than there are total online services accessible.

Another well-known author, Subramaniya Swamy V. et al. [5], created a prediction mechanism based on the Collaborative Filtering approach for efficient parallel processing of enormous amounts of data. An MR framework was developed that is employed in the maintenance, aggregation, and filtering of effective storage. To refine the data, the suggested Collaborative Filtering is applied.

2.3 Different K-Means Clustering

Here, we address certain authors who dealt with variations of K-Means clustering for massive information; these are discussed in more detail below.

Mohamed Aymen Ben Haj Kacem et al., prominent researchers, created the Accelerated MapReduce-based K-Prototypes (AMRKP) clustering algorithm [6] to cluster gigantic volumes of information. It attempts to dramatically reduce the number of operations by limiting the number of reads and writes that the input and output operations can make on the provided data. The authors also propose a pruning strategy that reduces the distance computations between the data points and the cluster centres, accelerating the clustering process. The proposed AMRKP is compared with previous clustering schemes and shows substantial improvement over several other approaches in terms of efficiency and scalability.

2.4 Algorithm for Self-Organizing K-Means Clustering

Here, we discuss a few authors who primarily addressed the self-organizing k-means clustering technique.

To overcome huge data challenges, many researchers combine modified k-means algorithms with self-organizing maps (SOM) [7]. The main protocol they follow is as follows: first, a genetic algorithm is utilized to diminish the instability within the data and distinguish the primary cluster centers; second, the SOM is mostly used to check the number of clusters and reduce the dimensionality of the data; and third, the k-means algorithm is used to produce the final clusters. SOM is therefore employed in this strategy for both dimensionality reduction and visualisation. In addition, the SOM can be regarded as a spider graph, in which each graph has a significant number of concepts that have been examined. The next kind of SOM is called Growing Hierarchical Self-Organizing Maps, or GHSOMs, and they are also used in high-dimensional data processing. The GHSOM technique [8] is mostly used for clustering textual, numerical, web page, and other sorts of information.

2.5 Method of Hierarchical Clustering

One technique for creating a cluster hierarchy is hierarchical clustering. Generally speaking, hierarchical clustering can be classified into two types:

1. Agglomerative Clustering: In this method, each data point starts as its own cluster, and the two closest clusters are then repeatedly combined.
2. Divisive Clustering: In this method, all data items start in a single cluster, which is then recursively divided into smaller groups.

The task is to determine which clusters should be separated in the divisive case and which should be merged in the agglomerative case. An appropriate degree of dissimilarity is applied to the data items in both situations [9]. Ultimately, a dendrogram graph is used to visualize the results for a clear and tangible understanding [10].

2.6 C-Means (FCM) Fuzzy Clustering

Fuzzy C-Means (FCM) clustering is a method used in machine learning to analyze and classify data in a variety of ways. Here, we explore a few authors who focused primarily on fuzzy C-Means clustering for large data; their discussions are covered in detail below.

Simone A. Ludwig [11] made numerous attempts to determine the FCM clustering algorithm's scalability and parallelization. Initially, the map and reduce functions for this FCM clustering algorithm were generated by combining the MR framework. To demonstrate the efficacy of the suggested approach in terms of the purity function, MR-FCM clustering algorithm validation was carried out.

In order to cluster large amounts of data, Minyar Sassi Hidri et al. [12] presented an expanded version of the FCM clustering algorithm that integrated split and merge strategies. The huge data is split first, using the split method, to create distinct subsets, which are then randomly sampled to create distinct subsamples. Their method performed well, with optimized time and space complexities using the available resources.
2.7 K-Means Clustering for Leaders

T. Hitendra Sarma, P. Viswanath, B. Eswara Reddy, and others created the leader k-means clustering strategy as a hybrid way to speed up the k-means clustering process. They first considered k-means as a partition-based, iterative approach that converges to a solution in a finite period of time for finite data sets [13].

The authors asserted that the clustering time of the algorithm is independent of the data set's size. Various strategies exist to enhance the traditional k-means clustering approach, but none of them has been found to be the most effective. In this work, they present a prototype-based hybrid strategy for speeding up the k-means clustering computation. In the recommended technique, they divided the data set into minor groupings, or grouplets, of varying sizes, with one prototype representing each grouplet. The set of prototypes is then divided into k clusters using the modified k-means approach. The k-means clustering method, like the conventional method, eliminates empty clusters in the iterative phase. Each prototype in each new cluster of prototypes is swapped out for the corresponding collection of patterns (which produced the grouplets) in order to generate a partition of the data set. Since this partition might not match the partition that the conventional k-means approach would obtain over the whole data set, a corrective action is advised.
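The following is a minimal sketch of the prototype-based idea described above, assuming a simple leaders pass with a user-chosen distance threshold; the threshold value and the use of scikit-learn's KMeans on the prototypes are illustrative choices, not details taken from [13].

```python
import numpy as np
from sklearn.cluster import KMeans

def leaders(X, threshold):
    """Single scan: each point joins the first leader within `threshold`,
    otherwise it becomes a new leader (a grouplet prototype)."""
    leaders_idx, members = [], []
    for i, x in enumerate(X):
        for j, l in enumerate(leaders_idx):
            if np.linalg.norm(x - X[l]) <= threshold:
                members[j].append(i)
                break
        else:
            leaders_idx.append(i)
            members.append([i])
    return X[leaders_idx], members

def leader_k_means(X, k, threshold):
    prototypes, members = leaders(X, threshold)
    # Cluster only the prototypes, which is much cheaper than clustering X.
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(prototypes)
    labels = np.empty(len(X), dtype=int)
    # Expand each prototype's label to the patterns of its grouplet.
    for proto_label, grouplet in zip(km.labels_, members):
        labels[grouplet] = proto_label
    return labels

X = np.random.randn(10000, 2)
labels = leader_k_means(X, k=5, threshold=0.5)
```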

3. Technologies Needed for Big Data Analysis

In this section, we address the many technologies that are utilized to analyse large amounts of data. Let us now go over these in more detail.

The WEKA framework is a significant tool for extensive information investigation (http://www.cs.waikato.ac.nz/ml/weka). This tool supports various clustering techniques such as X-means, DBSCAN, OPTICS, progressive clustering, and essential k-means. The next system is known as KNIME (http://knime.com). The following clustering methods can be used with it: hierarchical clustering, fuzzy c-means, K-means, and the self-organizing tree algorithm (SOTA). The next system is called RapidMiner, and it allows us to perform the following operations: k-means and both of its modifications, X-means and k-medoids. Several more techniques are supported, including DBSCAN, EM, and SOM (http://rapidminer.com). Biolab.si's Orange system is used to construct K-means, SOM, and hierarchical clustering.

The three available data sets are Letter Image Recognition (LIR), Optical Character Recognition (OCR), and Pendigits. The UCI Machine Learning Repository has the Pendigits and LIR data sets, which can be used to construct the Hybrid K-Means and Leader K-Means algorithms.

4. Results of Experiments

In this section, we obtain a huge data set with a high number of items and use the WEKA tool to compare several algorithms by calculating the data processing time. We draw the conclusion that, among all the different algorithms, some require less time than others.

Figure 8.2 shows a graph that compares different clustering algorithms over large data sets in terms of both item count and processing time.

Fig. 8.2 Graph that compares different clustering algorithms

Table 8.1 Clustering computational time (in sec.) in WEKA tool

Clustering algorithm | Data set items | Processing time in WEKA (sec.)
K-Means | 260538 | 53
X-Means | 260538 | 69
EM | 260538 | 989
DBScan | 260538 | 6347
Leader K-Means | 260538 | 42

According to the experimental findings, we conclude that Leader K-Means processes the data set items more quickly than several older algorithms. Compared to the other algorithms, it is more accurate and efficient in terms of time.
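A comparison along the same lines can be reproduced outside WEKA; the sketch below times a few scikit-learn clusterers on one synthetic data set (the data set, its size, and the chosen parameters are assumptions for illustration, so the absolute times will not match Table 8.1).

```python
import time
import numpy as np
from sklearn.cluster import KMeans, DBSCAN
from sklearn.mixture import GaussianMixture  # EM-style clustering

X = np.random.randn(50000, 5)  # illustrative stand-in for a large data set

algorithms = {
    "K-Means": KMeans(n_clusters=5, n_init=10, random_state=0),
    "EM (Gaussian mixture)": GaussianMixture(n_components=5, random_state=0),
    "DBSCAN": DBSCAN(eps=0.8, min_samples=10),
}

for name, algo in algorithms.items():
    start = time.perf_counter()
    algo.fit(X)  # processing time = time taken to cluster all items
    print(f"{name}: {time.perf_counter() - start:.2f} s")
```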
5. Conclusion

This article discusses a number of algorithms in an effort to determine which one is best for clustering large amounts of data. The report also analyses a number of big data clustering concerns and issues. We went into great detail regarding each and every clustering algorithm, as well as the tools and techniques used to assess how well those algorithms function. The amount of data studied and the amount of time needed to convert the data set into usable information determine which algorithm is optimal. Big data, as we all know, spurs the creation of new technologies. In this suggested work, we attempt to include all the many kinds of algorithms that can quickly and effectively cluster data into meaningful information and provide reports.

Note: All the figures and table in this chapter were designed by the author.
9. Visual Representation of Lung Cancer Image Classification Using Artificial Neural Network

B. Nandana Kumar1, K. Surya Ram Prasad2, G. V. Satya Sriram3
Assistant Professor, Dept of Computer Science and Engineering, D. N. R. College of Engineering & Technology

Abstract: The visual representation of an image tries to draw emphasis to the lung image's deficient feature vector, based on the interest aroused by the display of a flaw in the accompanying words and images. To build a successful Google Images-style search engine, thorough model co-training on the spatial features of text and images is necessary for illness diagnosis. In this paper, customized image ranking using machine learning improves user experiences. The study makes the argument that optimizing picture ranking for individual users can improve user experiences, while also requiring specific model co-training for text and image data in a robust system like the Google Images Search Engine. Artificial neural networks (ANN) enable experts in computer science to complete challenging tasks including pre-processing, feature prediction, and pattern recognition. The experimental result is on lung cancer detection from feature vectors of different image types (X-ray, PET scan, WSI, and others) with the help of a distance method.

Keywords: Artificial neural network, Image search engine, Lung cancer detection, Machine learning, Visual representation, etc.

1. Introduction

Saturn Cloud, 2023, categorizes data types like text, images, and numbers. Vectors are commonly used for systematic and efficient data representation in machine learning applications. This section delves into the concept of vectors in the context of machine learning, their significance, and their application. In mathematics, a vector is an entity with magnitude and direction; in machine learning, a vector is a mathematical representation of a collection of numerical values (Jun Xie, 2022 [4]). Each number in the list or array representing a vector often reflects a particular characteristic or aspect of the data. Forecasting house values based on bedroom count, property size, and location can be represented as vectors, with each component representing a distinct aspect of the home (Andrea D'Agostino, 2022 [1]).

A big dataset containing millions of data points may be manipulated with vector operations such as addition, subtraction, and multiplication, which reduces computing complexity. Vectors are crucial in machine learning as they enable comparison and measurement of similarity between data points (Hay Mar Su Aung, 2020 [3]); for instance, in a dataset of images, each image can be represented as a vector and compared using distance metrics like the Euclidean distance. Regression, classification, clustering, and dimensionality reduction are a few examples of machine learning techniques that use vectors (K. Grzegorczyk, 2019 [5]). For simpler display and analysis, vectors represent input and output variables, class labels, related data points, and high-dimensional data in a lower-dimensional space.

Machines scrutinize images in a very precise way. To provide the best analytical performance, the various techniques make


an effort to mimic how the human brain and eyes function. The algorithms use pixel patterns that the machine has previously seen many times. As Nithyashree V., 2022 [6], notes, one must follow a thorough process while creating an image classifier. A big image object records information about a large image file and the image information it contains. Big images are used to represent images as small data units that may be imported and examined independently. A big image object is used to process and view images that require more processing time than is available or that are too enormous to fit in memory. Additionally, the object has the ability to read, examine, and display photographs of various resolutions: choose a portion of the image to read, and read, set, and write blocks of data. The level where each pixel covers the maximum surface area is the lowest or coarsest resolution level for large pictures with several resolution levels; the level where each pixel covers the smallest area is the one with the finest detail.

There are 7 sections to this essay. The first section outlines the research's motivation. The related tactic of using tendencies in ANN is examined in Section 2. The suggested work utilizes the Euclidean classification, as outlined in Section 3. Section 4 details the image classification task. Section 5 presents the vector representation image classification model based on a machine search engine. The experimental findings from the proposed investigation are presented in Section 6. Section 7 provides a summary of the results and a conclusion.

2. Tendencies in Artificial Neural Network (ANN)

This section explores the prevalent patterns in artificial neural networks. A key component of ML is ANNs, which give computer specialists the ability to carry out difficult operations like pattern recognition, planning, and prediction (R. N. V. Jagan Mohan, 2022 [10]). Similar to other machine learning algorithms, artificial neural networks crunch numbers and arrange lung cancer images or text data, but they learn from user experience and repetitive activities. Artificial Neural Networks, or ANNs, have applications in chatbots, which are frequently used for image or text classification. Neural networks normally consist of an input layer, an output layer, and a hidden layer, which are made up of components that translate the input into something that the output layer can use. Because of their complexity or sheer quantity, they are useful for seeing patterns that a human programmer could never extract and teach the computer to recognize.

Fig. 9.1 Layers of artificial neural networks

3. Euclidean Distance Classifier

The system uses a Euclidean distance nearest-neighbor classifier to identify a specific input image by comparing the probe's feature vector to the database images' feature vectors, using the Euclidean distance between x and y:

$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ (1)

A Euclidean vector represents a point's location in Euclidean n-space, with the tips x and y denoting two points. Its length is measured by the Euclidean norm, or magnitude (R N V Jagan Mohan, 2012 [8, 9]):

$\|X\| = \sqrt{X_1^2 + X_2^2 + \cdots + X_n^2} = \sqrt{X \cdot X}$ (2)

If there is a direction from x to y, the displacement between points x and y can be expressed using the formula y − x:

$Y - X = (y_1 - x_1, y_2 - x_2, \ldots, y_n - x_n)$ (3)
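A minimal sketch of this nearest-neighbor rule, using equation (1) directly over a small in-memory gallery of feature vectors, might look as follows; the gallery, labels, and probe vector are made-up stand-ins for real image features.

```python
import numpy as np

def euclidean_distance(x, y):
    # Equation (1): d(x, y) = sqrt(sum_i (x_i - y_i)^2)
    return np.sqrt(np.sum((x - y) ** 2))

def nearest_neighbor_label(probe, gallery, labels):
    # Compare the probe's feature vector to every database feature vector.
    distances = [euclidean_distance(probe, g) for g in gallery]
    return labels[int(np.argmin(distances))]

# Illustrative gallery of feature vectors and their class labels.
gallery = np.array([[0.1, 0.9, 0.2], [0.8, 0.1, 0.7], [0.2, 0.8, 0.3]])
labels = ["benign", "malignant", "benign"]
probe = np.array([0.15, 0.85, 0.25])
print(nearest_neighbor_label(probe, gallery, labels))  # -> "benign"
```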

4. Lung Cancer Image Classification Using Machine Learning

Digital photography has generated enormous amounts of data, which has fueled the development of computer vision, an area of artificial intelligence that uses data to identify, recognize, and categorize images. Machine learning techniques are used to analyze pixel patterns, or vectors, in order to categorize objects and assign labels based on predetermined criteria. Classifiers extract attributes from images to predict classifications. There are ways of categorizing images that are binary or multiclass. Contrary to binary classification, which labels just one class of items across all pictures, multiclass classification requires creating several labels for various things (Qing Lv, 2022 [7]). Both solutions require that the reference photographs be labeled. In-depth photo classification is covered in this section.

5. Vector Representation Lung Cancer Detection Image Classification Model Using Machine Search Engine

A machine image search engine displays a ranked list of related images based on word or image input, displaying appropriate text and the most similar images. This may be conceptualized as a ranking issue: the model needs to input two photos and generate a similarity score, which can be used to order the photos. Utilizing models that can learn a vectorial representation (embedding) of the pictures and compute a similarity metric on those vectors is a common modeling strategy. For learning a vector representation of pictures, we require a model that can extract image features, and for learning a vector representation of text inputs, we need a model that can extract text features. In order to semantically align the vector representations, the picture and text models must be trained simultaneously.

To ensure swift retrieval, we want a mechanism to quickly find related images while saving the currently stored photos. As we are primarily focusing on vectorizing photos, it is logical to index them into a vector database. The vector representations of the original photos are created by the indexing pipeline, which then indexes them into a vector database.

The task requires us to generate a list of images when a user inputs a text or image query. The embedding generation service generates an embedding encoding of the input query. The embedding query is sent to the vector database, which returns the nearest neighbors of the query. The re-ranking service is mainly used to re-rank the nearest neighbors using a better model than the embedding generation model; it could also be used to personalize the ranking to the specific user by using user-specific data. The resulting list is a list of image IDs, which is then sent to the image store to retrieve the actual images to return to the user.

Fig. 9.2 How to use disease discovery to build a machine image search engine
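The indexing-and-query flow just described can be sketched with an in-memory stand-in for the vector database; the embedding function here is a placeholder (a random projection), and the store, IDs, and re-ranking step are hypothetical simplifications of the services named above.

```python
import numpy as np

rng = np.random.default_rng(0)
PROJ = rng.standard_normal((4096, 64))   # placeholder "embedding model"

def embed(raw_features):
    v = raw_features @ PROJ              # project raw image features to 64-d
    return v / np.linalg.norm(v)         # normalize so dot product = cosine

# Indexing pipeline: embed every stored photo and keep (id -> vector).
image_store = {i: rng.random(4096) for i in range(1000)}   # fake raw features
index_ids = np.array(list(image_store))
index_vecs = np.stack([embed(image_store[i]) for i in index_ids])

def query(raw_query, top_k=5):
    q = embed(raw_query)
    sims = index_vecs @ q                        # nearest neighbors by cosine
    candidates = index_ids[np.argsort(-sims)[:top_k * 4]]
    # Re-ranking service: a better model would re-order these candidates;
    # here we simply keep the embedding order.
    return list(candidates[:top_k])              # list of image IDs

print(query(rng.random(4096)))
```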

6. Experimental Result

Lung cancer originates in the lungs, which absorb oxygen and release carbon dioxide. Smokers have the highest risk, but quitting smoking can significantly reduce the risk, even after years of smoking. The difficulty in classifying photographs lies in predicting the categories for a certain group of test pictures, and in evaluating the accuracy of the predictions given a collection of pictures that have all been placed in the same category. Rank, viewpoint, dimension variation, intra-class variance, picture distortion, image obstruction, lighting concerns, backdrop clutter, etc. are some of the challenges the subject presents.

First, the difference between the starting-point and segmented images is measured using the Root Mean Square Error (RMSE), which is used to assess the segmentation performance; in the mathematical formulation of RMSE, i and j stand for the image's pixel positions and M and N for its size, computed using Python (Raunak Goswami, 2023 [12]). Since there is less distance between the two lines, residuals can be seen along the x-axis using Seaborn's residual plot function. With the assumption that it belongs to class "1", the model successfully predicts the number of "person" lung cancer photos from an unseen image.

Fig. 9.3 Same person, different kinds of lung cancer image data set

Fig. 9.4 Lung cancer images error rate for normalization

Measurement is crucial for understanding the external world, but it also introduces uncertainty, known as error. When taking measures, accuracy and precision are important considerations because they show how closely a measurement resembles a recognized value. Accuracy and precision are two measures of observational error, indicating respectively how close a set of measurements is to the true value and how close the measurements are to each other.

Accuracy: A binary classification test's accuracy, also known as "Rand accuracy" or "Rand index," is a statistical indicator of its ability to correctly identify or rule out a condition. It is a test parameter that contrasts the probability estimates from the pre- and post-test.

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ (4)

where FN = false negative, TN = true negative, FP = false positive, and TP = true positive.

Based on 91 accurate predictions made out of 100 cases, the model has a 91% accuracy rate in identifying 100 tumors as benign or malignant. However, the model only correctly identifies 1 malignant tumor out of the 9 malignant ones, resulting in 8 out of 9 malignancies going undiagnosed. This suggests that the model is not much more effective than a model that always predicts benign. Accuracy alone is insufficient when dealing with a class-imbalanced data set with a significant difference between positive and negative labels.

$\text{Accuracy} = \frac{1 + 90}{1 + 90 + 1 + 8} = 0.91$ (5)

Precision: Precision is the percentage of positive predictions that are correctly classified, as determined by the formula

$\text{Precision} = \frac{TP}{TP + FP}$ (6)

where FP = false positive and TP = true positive.

Out of the 160 samples in Dataset-1, 105 of the predictions made by a lung cancer image model are accurate, whereas the remaining 55 are not. To determine this model's precision value: true positives (TP) = 105 and false positives (FP) = 55 are the model's results. Precision is calculated as TP/(TP + FP) = 105/(105 + 55) = 105/160 = 0.65625. As a result, the model's precision is 0.65625.
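The two worked examples can be checked with a few lines of Python; the counts below are the ones quoted in the text, and everything else is straightforward arithmetic.

```python
# Accuracy from equation (5): 1 TP, 90 TN, 1 FP, 8 FN.
tp, tn, fp, fn = 1, 90, 1, 8
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)   # 0.91

# Precision from equation (6): 105 correct out of 160 positive predictions.
tp, fp = 105, 55
precision = tp / (tp + fp)
print(precision)  # 0.65625
```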
7. Conclusion and Future Perspective

Vectors are essential to machine learning for structured data representation and efficient operations like regression, classification, clustering, and dimensionality reduction; therefore, data specialists must be able to understand and manipulate them. With 13.2% of new cases and 25.9% of fatalities due to cancer, lung cancer is the most prevalent cancer-related cause of death. The prognosis varies depending on the pathology classification, disease stage, and patient features.

References

1. Akshat Gaurav, B. B. Gupta, Ching-Hsien Hsu, Arcangelo Castiglione & Kwok Tai Chui: Machine Learning Technique for Fake News Detection Using Text-Based Word Vector Representation, Computational Data and Social Networks, Springer, pp. 340–348, 2021.
2. Andrea D'Agostino: Vector Representation for Machine Learning, Towards Data Science, 2022.
3. Hay Mar Su Aung, Win Pa: Analysis of Word Vector Representation Techniques with Machine-Learning Classifiers for Sentiment Analysis of Public Facebook Page's Comments in Myanmar Text, DOI: 10.1109/ICCA49400.2020.9022842, IEEE Xplore, 05 March 2020.
4. Jun Xie: Vector in machine learning, Medium, 2022.
5. K. Grzegorczyk: Vector representations of text data in deep learning, arXiv, 2019.
6. Reddy Navya, Ramisetty Upendra: Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms, Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3, 2023.
7. Nithyashree V: Image Classification using Machine Learning, Analytics Vidhya, 2022.
8. Qing Lv, Suzhen Zhang: Deep Learning Model of Image Classification Using Machine Learning, Advanced Pattern Recognition Systems for Multimedia Data, Hindawi, Volume 2022, Article ID 3351256, https://doi.org/10.1155/2022/3351256, 2022.
9. R. N. V. Jagan Mohan, R. Subbarao and K. Raja Sekhara Rao: Similarity of Inference Face Matching on Angle Oriented Face Recognition, Journal of Computer Engineering and Intelligent Systems, www.iiste.org, ISSN: 2222-1719 (Paper), ISSN: 2222-2863 (Online), Vol. 3, No. 2, 2012.
10. R. N. V. Jagan Mohan, R. Subbarao and Kurra Raja Sekhara Rao: Efficient K-Means Cluster Reliability on Ternary Face Recognition using Angle Oriented Approach, International Journal of Informatics and Communication Technology (IJ-ICT), Vol. 2, No. 1, January 2013, pp. 180–187, ISSN: 2252-8776, http://dx.doi.org/10.11591/ij-ict.v2i1.1779.
11. R. N. V. Jagan Mohan: Machine Learning approach for corona virus disease extrapolation: A case study, International Journal of Knowledge-based and Intelligent Engineering Systems, Vol. 26, pp. 219–227, ISSN: 1327-2314 (print), 1875-8827 (online), DOI: 10.3233/KES-220015, 2022.
12. Raunak Goswami: RMSE: Root-Mean-Square Error in Machine Learning, Includehelp.com, April 16, 2023.

Note: All the figures in this chapter were designed by the author.

10. Machine Learning Improve Predictive Analysis of Diabetes Disease

K. Durga Bhavani1, CH. Vinod Varma2, B. Mounika3
Assistant Professor, Department of Computer Science and Engineering, Sagi Rama Krishnam Raju Engineering College

Abstract: Diabetes is a severe illness that can cause blindness, kidney disease, heart problems, and other issues. Deep learning has improved information processing, which can identify polygenic illness early on and provide patients with access to critical information. This method retrieves diabetes-related data from databases by means of information extraction. This research aims to create a system that can accurately predict a patient's risk of acquiring diabetes by utilising decision trees, artificial neural networks, naive Bayes, random forests, and support vector machines.

Keywords: Artificial neural networks, Decision tree, Naive Bayes, SVM, etc.

1. Introduction

An insulin deficiency causes diabetes, which, if left untreated, leads to decreased activity and elevated blood sugar levels and can bring serious complications. Problems with the heart, foot ulcers, or vision could be signs of serious problems. According to Kononenko, 2001 [6], a history of elevated blood sugar levels is indicative of prior diabetes. In the past, the average person's impact from diabetes was lower. Exogenous hypoglycemic medications that are either not produced properly or not absorbed well can lead to diabetes (Yifan Qin and colleagues, 2022 [11]). Medical professionals might benefit from various information-mining strategies; the correctness of the chosen support system determines patient survival time, so in order to study and predict a given illness effectively, it is critical to have a carefully selected support system. According to Humar Kahramanli (2008), computers can learn to solve real-world issues with the help of deep learning, a subfield of artificial intelligence [5]. Listed below are a number of authors who have made significant contributions to this field. Computerised information systems were employed by Veena Vijayan and Anjali to anticipate and identify diabetic problems using decision trees, SVM, Naive Bayes, and ANN algorithms. In 2017, researchers P. Suresh Kumar and V. Umatejaswi used data mining techniques including Decision Tree, SVM, and Naive Bayes to diagnose diabetes [9]. Diabetic retinopathy (DR) is a leading cause of blindness in people with diabetes; Nentwich et al. assess the effectiveness of different machine-learning algorithms in detecting and treating DR. In 2015, Dr. M. Renuka Devi and Dr. J. Maria Shyla [8] discussed various diabetes prediction algorithms, such as J48, Decision Tree, Random Forest, and Naive Bayes. In order to save patients, Rahul Joshi and Minyechil Alehegn recommend applying machine learning techniques like Naive Bayes and KNN. The quality of analysis has increased since Zhilbert Tafa and Nerxhivane Pervetica introduced their computational results. In 2015, researchers Prof. Dhomse Kanchan B. and Mr. Mahale Kishor M. investigated the use of component analysis in disease prediction utilising machine learning algorithms such as SVM, Naive Bayes, and PCA [7].


2. Related Work

Marius et al. have developed a method for rapidly finding approximate nearest neighbours from data, as opposed to determining the exact nearest neighbour. Linear search offers the fastest exact resolution for high-dimensional computer vision issues, which is why this method is employed; approximate techniques impose only small losses in precision and allow for quick work.

2.1 Naive Bayes Classifier

Based on prior knowledge of the most likely relevant circumstances, Naive Bayes determines the chance that an event will happen. Naive Bayes is the quickest and simplest method for classifying large amounts of data. Sentiment analysis, text sorting, spam filtering, and recommender systems are just a few uses for NB classifiers. Bayes' theorem from probability theory can be used to predict unknown classes. Naive Bayes is an efficient computation that can be finished in a matter of seconds; as a result, it can perform better than more sophisticated models when data is scarce (Quan Zou, 2018 [10]).

2.2 Support Vector Machine

A support vector is a training instance that lies closest to the decision boundary. The ideal (maximum-margin) hyperplane still provides the solution to our problem of diabetic illness diagnosis even if all training samples other than the support vectors are removed from the analysis; as a result, they were given the name "support vectors" (Yu W., 2010 [12]).

2.3 Decision Tree

A decision tree is a supervised learning algorithm that can handle both classification and regression issues, in contrast to some other methods (Zhang Z.Q. [13]). It consists of a root node, branches, and leaf nodes in a geometric configuration.

2.4 Random Forest Tree

A random forest categorization method consists of many decision trees. It builds each individual tree using bagging and feature randomization in an effort to create an uncorrelated forest of trees whose committee forecast is more accurate than that of any one tree (Aishwarya Mujumdar, 2019 [1]).

3. Proposed Work

Our methodology requires a larger dataset for efficiency and productivity, without which clinical research would be limited. System administrators will choose an algorithm to detect diabetes, provide patients with wise counsel, and produce a printed report, potentially improving healthcare. The algorithmic combinations that will be applied in the suggested system are shown in the accompanying block diagram. According to Baliunas, D.O., 2009 [4], the classification computations that are most frequently used to verify accuracy are artificial neural networks (ANNs), support vector machines, decision trees, and Naive Bayes. XGBoost is an open-source software tool that implements the distributed gradient-boosted tree computations provided by the Gradient Boosting technique to improve the performance and computational speed of AI models. Because of how well XGBoost performed in structured data competitions on Kaggle, its popularity has soared in recent years. In these competitions, data miners and information analysts compete to create the most precise models for interpreting and predicting the data they gather. A number of programming languages, including Java, Scala, Julia, Python, and R, have embraced XGBoost due to its extensive usage and improved developer advantages. Many tools and libraries, including distributed processing frameworks like Apache Spark and Dask, Caret, and Scikit-Learn, are compatible with XGBoost. Its exceptional processing speed and prediction performance attract data scientists.

3.1 Artificial Neural Network (ANN)

Machine learning techniques are essential for prediction, pattern recognition, and planning. Particularly in chatbots for text classification, their capacity to learn from previous experiences and repetitive user actions is causing them to gain popularity (Alicic, R. Z., 2017 [3]). The three main components of a neural network (input, output, and a hidden layer) are described by Aman Preet Gulati (2022) [2] as having the ability to learn and detect complex patterns that would be challenging for humans to extract manually.

Fig. 10.1 Artificial neural networks

4. Implementation of Diabetes Disease

Diabetes is treated with the Python programming language by exploiting its constituent parts.
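As a sketch of what such a Python implementation could look like, the snippet below loads a diabetes CSV with pandas and fits the classifiers compared later in Table 10.2 using scikit-learn; the file name, the Outcome column, and the hyperparameters are assumptions for illustration, not the authors' exact setup.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

df = pd.read_csv("diabetes.csv")             # hypothetical Pima-style data set
X, y = df.drop(columns="Outcome"), df["Outcome"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Standardize features; distance- and gradient-based models benefit from this.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "K-NN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: accuracy = {model.score(X_test, y_test):.2f}")
```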
4.1 NumPy

NumPy is a Python package that provides multi-dimensional array objects, with bindings for C, C++, and other languages. It has applications in linear algebra and random number generation, and serves as an efficient multi-dimensional container for generic data. To install, run "pip install numpy" and import it with "import numpy as np".

4.2 Pandas

Pandas is a Python library that provides robust data structures for data manipulation and analysis; its name is inspired by Panel Data, an econometric method using multi-dimensional data. Wes McKinney created pandas as a high-performance, versatile data analysis tool, addressing Python's limitations in data manipulation and preparation and enabling five standard data processing steps: load, prepare, manipulate, model, and analyze. Python and Pandas are widely used in academic and business settings. NumPy and Pandas are not included in the standard Python distribution and can be installed using the Python package installer pip.

5. Effectiveness Metrics' Results

The different algorithms are compared with respect to F1-score, specificity, accuracy, sensitivity, and precision. The research work's accuracy metrics are applied to the ANN's accuracy measures, and the following results are obtained.

5.1 Confusion Matrix for Predicted Diabetes Values

The confusion matrix, also known as the error matrix, is a crucial concept in classification evaluation, providing a clear comparison between model predictions and ground truth labels, with rows representing actual class occurrences and columns representing predicted class occurrences. An example of a binary classification system distinguishing positive and negative prediction pictures is presented, using a confusion matrix to understand the behaviour of a test set of 1100 samples.

Table 10.1 Displays a confusion matrix for predicted diabetes values

N = 400 | Predicted: No Diabetes | Predicted: Yes Diabetes
Actual: No Diabetes | 40 | 20
Actual: Yes Diabetes | 120 | 220

Fig. 10.2 Confusion matrix for predicted diabetes values

Of the 100 actual positive prediction pictures, the model correctly classified 90, with only 10 false negatives, indicating a partial misinterpretation of the "positive prediction" class. The model correctly identified 940 out of 1000 non-positive prediction pictures but incorrectly classified 60, yielding the true-negative and false-positive samples. In an ideal matrix only the diagonal entries are populated, with each class predicted accurately. With the confusion matrix understood, we can proceed to the derived measures.

5.2 Accuracy

The number of accurate forecasts divided by the total number of predictions yields the accuracy metric; here, 1030 out of 1100 samples are accurately predicted in the prior model.

5.3 Precision

Due to the unequal class distribution, accuracy is not necessarily a trustworthy indicator of model performance: a high accuracy rate can be achieved even if the model predicts all samples as the most common class, reaching a 90.9 percent accuracy rate. Our aim is to analyze class-specific measurements, such as precision, which is defined as:

Precision = True positive/(True positive + False positive)

The precision of the positive prediction class is the share of samples accurately predicted positive: 90/(90 + 60), or 60 percent. The non-positive prediction precision is 940/950 = 98.9%. The model's precision in predicting negative test results is significantly higher due to its better categorization of non-positive prediction pictures during training.

5.4 Recall

The model's accuracy in correctly predicting a specific class of samples is a crucial metric:

Recall = True Positive/(True Positive + False Negative)
a misinterpretation of the “positive prediction” class. The
The recall rate can be calculated for both the positive and non-positive prediction classes:

Recall positive prediction = 90/100 = 90%
Recall non-positive prediction = 940/1000 = 94 percent.

5.5 F1 Score

The F1-score, a combined measure of precision and recall, is often used in applications that need both high recall and high precision:

F1-score = 2 * Precision * Recall/(Precision + Recall)

The F1-score can be determined by examining the confusion matrix above:

F1 positive prediction = 2 * 0.6 * 0.9/(0.6 + 0.9) = 72%

The trade-off between model precision and recall is evident, with higher precision leading to lower recall rates and vice versa.

5.6 ROC Curve

The ROC curve is a crucial tool for understanding and analyzing the performance of a classifier. The diagram illustrates how a binary classifier's true positive rate (TPR) compares against its false positive rate (FPR) for different threshold values. Probabilistic classification models estimate the likelihood of a positive prediction, which is compared to a cut-off threshold. Suppose the model predicts the probabilities 0.45, 0.6, 0.7, and 0.3:

Cut-off = 0.5: predicted labels = [0, 1, 1, 0] (default threshold)
Cut-off = 0.2: predicted labels = [1, 1, 1, 1]
Cut-off = 0.8: predicted labels = [0, 0, 0, 0]

Adjusting threshold values yields different labels and hence varying precision and recall rates. TPR and FPR are calculated for each threshold, and the ROC curve plots TPR against FPR, as shown in the example. The plot demonstrates that higher true positive rates come with higher false positive rates as the cut-off for positive classification is lowered (Kononenko, 2001 [6]), indicating a compromise between recall quality and FPR. The ROC curve assesses model performance.

Fig. 10.3 ROC curve for TPR and FPR

5.7 Area Under the Curve

For statistical analysis, the area under the ROC curve is an essential measure. Because it takes into account all possible threshold values, the AUC, a performance metric for binary classifiers, is unaffected by any single threshold value. It measures the likelihood that a randomly chosen positive example scores higher than a randomly chosen negative one. Not all key performance indicators benefit equally from threshold-free metrics, even though the area under the curve (AUC) is a crucial model performance parameter: by tweaking its threshold, a model can meet minimal demands without compromising its good AUC. When you analyse a classification model, consider factors such as business requirements and the ramifications of low recall or precision. Clarity and interpretability are two advantages of employing probabilities over a single label output; nevertheless, support vector machines (SVMs) are less interpretable because they do not provide an intrinsic likelihood.

6. Experimental Results

In order to effectively battle diabetes, the algorithms' accuracy, efficiency, and development of new capabilities may all be improved, making the system ideal for hospitals as a full-service healthcare diagnostic system.

Table 10.2 Accuracy of the classifiers

Classifiers | K-NN | SVM | Naive Bayes | Decision Tree | Random Forest | ANN
Accuracy (%) | 76 | 75 | 74 | 71 | 71 | 85

Fig. 10.4 Accuracy arc for TPR and FPR
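The cut-off examples above can be reproduced directly; the snippet below applies the three thresholds to the quoted probabilities and, assuming some ground-truth labels (made up here), computes the ROC curve and AUC with scikit-learn.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

probs = np.array([0.45, 0.6, 0.7, 0.3])   # probabilities quoted in Section 5.6
for cutoff in (0.5, 0.2, 0.8):
    print(cutoff, (probs >= cutoff).astype(int).tolist())
# 0.5 -> [0, 1, 1, 0]; 0.2 -> [1, 1, 1, 1]; 0.8 -> [0, 0, 0, 0]

y_true = np.array([0, 1, 1, 0])            # hypothetical ground-truth labels
fpr, tpr, thresholds = roc_curve(y_true, probs)
print("AUC =", roc_auc_score(y_true, probs))
```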

7. Conclusion and Future Perspective

A larger dataset is required for more precise predictions than the one used for the last forecast. For less serious
diabetic symptoms, the app does not include a guidance system. A computer programme will analyse data from 2000 diabetes patients and provide personalised treatment recommendations based on their individual levels. We will compare the accuracy of Decision Tree, Random Forest, Naive Bayes, and K-Nearest Neighbor. Text, pictures, and trees are all good training data for SVMs; however, sensitive adjustments to basic parameters are required. Various other options include decision trees, Naive Bayes, and ANN. In contrast to the easy-to-understand but unstable decision trees, the robust Naive Bayes method focuses on the structure of the data sources. Even though ANN is simple and accurate, it might be challenging to deal with very large data sets. The proposed method is suitable for hospitals as a comprehensive healthcare diagnostic system because it can be enhanced by increasing efficiency, creating new skills, and improving the accuracy of algorithms to successfully fight diabetes.

References

1. Aishwarya Mujumdar, V. Vaidehi: Diabetes Prediction using Machine Learning Algorithms, Science Direct, https://doi.org/10.1016/j.procs.2020.01.047, Procedia Computer Science, Vol. 165, pp. 292–299, 2019.
2. Aman Preet Gulati: Diabetes Prediction Using Machine Learning, Analytics Vidhya, January 4, 2022.
3. Alicic R. Z., Rooney M. T., Tuttle K. R.: Diabetic Kidney Disease: Challenges, Progress, and Possibilities, Clin. J. Am. Soc. Nephrol., 12: 2032–2045, doi: 10.2215/CJN.11491116, 2017.
4. Baliunas D. O., Taylor B. J., Irving H., Roerecke M., Patra J., Mohapatra S., Rehm J.: Alcohol as a risk factor for type 2 diabetes: A systematic review and meta-analysis, Diabetes Care, 32: 2123–2132, doi: 10.2337/dc09-0227, 2009.
5. Humar Kahramanli, Novruz Allahverdi: Design of a Hybrid System for the Diabetes and Heart Disease, Expert Systems with Applications: An International Journal, 35, 1–2, July 2008.
6. Kononenko I.: Machine learning for medical diagnosis: History, state of the art and perspective, Artif. Intell. Med., 23: 89–109, doi: 10.1016/S0933-3657(01)00077-X, 2001.
7. Kumar Dewangan A., Agrawal P.: Classification of diabetes mellitus using machine learning techniques, Int. J. Eng. Appl. Sci., 2: 257905, 2015.
8. Reddy Navya, Ramisetty Upendra: Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms, Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3, 2023.
9. Nentwich M. M., Ulbig M. W.: Diabetic retinopathy—Ocular complications of diabetes mellitus, World J. Diabetes, 6: 489–499, doi: 10.4239/wjd.v6.i3.489, 2015.
10. P. Suresh Kumar and S. Pranavi: Performance Analysis of Machine Learning Algorithms on Diabetes Dataset using Big Data Analytics, International Conference on Infocom Technologies and Unmanned Systems, 978-1-5386-0514-1, Dec. 18–20, 2017.
11. Quan Zou, Kaiyang Qu, Yamei Luo, Dehui Yin, Ying Ju, Hua Tang: Predicting Diabetes Mellitus With Machine Learning Techniques, Frontiers in Genetics, Volume 9, https://doi.org/10.3389/fgene.2018.00515, 06 November 2018.
12. Yifan Qin, Jinlong Wu, Wen Xiao, Kun Wang, Anbing Huang, Bowen Liu, Jingxuan Yu, Chuhao Li, Fengyu Yu, and Zhanbing Ren: Machine Learning Models for Data-Driven Prediction of Diabetes by Lifestyle Type, Int. J. Environ. Res. Public Health, 19(22): 15027, published online 2022 Nov 15, doi: 10.3390/ijerph192215027, 2022.
13. Yu W., Liu T., Valdez R., Gwinn M., Khoury M. J.: Application of support vector machine modeling for prediction of common diseases: The case of diabetes and pre-diabetes, BMC Med. Inform. Decis. Mak., 10: 16, doi: 10.1186/1472-6947-10-16, 2010.
14. Zhang Z. Q., Yang L. Q., Han W. T., Wu Y. Y., Zhang L. H., Gao C., Jiang K., Liu Y., Wu H. Q.: Machine Learning Prediction Models for Gestational Diabetes Mellitus: Meta-analysis, J. Med. Internet Res., 24: e26634, doi: 10.2196/26634, 2022.
with Applications: An International Journal, 35, 1–2, July 2022.
2008.
Note: All the figures and table in this chapter were designed by the author.
11. Tackle Comorbid Obesity in T2DM by Applying New Strategies to Optimize Glycaemic Control and Weight Management

Yugandhar Bokka1
Research Scholar, Gandhi Institute of Engineering and Technology (GIET) University,
Gunupur, Odisha
R. N. V. Jagan Mohan2
Associate Professor, Sagi Rama Krishnam Raju Engineering College, Bhimavaram
M. Chandra Naik3
Professor, Gandhi Institute of Engineering and Technology (GIET) University,
Gunupur, Odisha

Abstract: Obesity management is an important therapeutic goal for patients with diabetes and obesity. The majority of patients, sadly, do not receive care that is in line with the most recent findings. The recommendation is to prioritize weight loss promotion in at-risk patients, utilizing newer medications as needed. To address comorbid obesity in type 2 diabetes, new strategies are being used to improve weight management and glycaemic control. The paper aims to improve clinicians' capacity to manage individuals with these common disorders. This activity aims to review the relationship between obesity and type 2 diabetes mellitus, discuss the benefits of early treatment intensification and weight loss, discuss the latest clinical trial results for dual GIP/GLP-1 RAs, and optimize care for patients with obesity and T2DM. The study explores new strategies for optimizing glycaemic control and weight management in patients with comorbid obesity using virtual patient simulation with the K-Nearest Neighbour method.

Keywords: Clinical trial, Diabetes, GIP/GLP-1, K-Nearest neighbour, Obesity, T2DM, Therapeutic, etc.

1. Introduction

According to Cui, Shiyue, et al. (2020), obesity is a substantial risk factor for cardiovascular (CV) illness, with a number of routes raising the likelihood. An educational exercise that lowers obesity-related CV risk can be achieved through the use of a quiz with supporting data. Cardiovascular disease (CVD) and obesity are associated, and research suggests that treating obesity can lower the risk of CVD (Vicky Jocelyn Ama Moor et al., 2017). For different CV risk profiles, multicomponent, customised solutions can yield better results (A. Maxwell et al., 2017). In addition to minimising difficulties and informing women about their rights, prenatal care is essential for a successful pregnancy. Prenatal care for women includes physical examinations, weight assessments, blood tests, urine samples, ultrasounds, discussions regarding the health of the mother and foetus, and pregnancy-related inquiries. Proper prenatal care, including maintaining a balanced diet, exercising frequently, maintaining a healthy weight, and avoiding dangerous substances like radiation and lead, can reduce pregnancy difficulties (A. Mohammadbeigi et al., 2015). By abstaining from alcohol and tobacco smoking, women can lower their risk of developing foetal and infant problems such as foetal alcohol spectrum disorders and sudden infant death syndrome. To reduce the incidence of neural tube abnormalities, preventive services recommend 400 micrograms of folic acid in daily prenatal vitamins. One in every 33 newborns is born with a birth defect

during the first three months of pregnancy; these include congenital heart abnormalities and spina bifida, and such abnormalities are a primary reason infants die. Most birth malformations are caused by a combination of environmental factors, behaviours, and genes (H. Tada et al., 2017); however, the precise aetiology of many disorders is yet unknown. Taking certain pharmaceuticals, having specific medical conditions, taking known birth defect-causing medications, having a family history of birth abnormalities, and giving birth after the age of 35 are all factors that increase the chance of having a child with a birth defect (Ng, Marie, et al., 2014). For the best course of therapy, speak with a doctor. Medical disorders like diabetes, high blood pressure, and infections can cause complications during pregnancy; according to N. Demirel et al. (2018), women who have diabetes, gestational diabetes, chronic hypertension, or infections should be under closer medical observation. The right care ensures a good pregnancy. According to P. Bramlage et al. (2014), concentrating on a good pregnancy assures a healthy pregnancy, a smooth transition to a positive labour and delivery, and an optimistic view of parenthood. Researchers conduct clinical trials to address health challenges and discover pertinent therapies with a varied population, including older persons (S. Khan and T. Yairi, et al., 2018). These studies go through Phases 1, 2, and 3 in order to assess safety, efficacy, and side effects; a medicine or device cannot be licensed by the FDA until Phases 1, 2, and 3 have been completed. A participant may only take part in one trial at a time and must meet the inclusion and exclusion criteria. A class of drugs called GLP-1 RAs reduces blood sugar levels and treats type 2 diabetes. Following the approval of a daily oral version of semaglutide, patients can inject GLP-1 RAs twice daily, once daily, or once weekly. These drugs work in similar ways to control blood sugar levels by raising insulin secretion in response to hyperglycemia, blocking glucagon secretion, delaying stomach emptying, and reducing caloric intake and body weight. Compared to short-acting drugs, long-acting therapies have a greater effect on HbA1c and on overnight and fasting plasma glucose levels. For those with type 2 diabetes, GLP-1 RAs are recommended as the initial injectable glucose-lowering drug, even prior to initiating insulin therapy. Patients can use GLP-1 RAs with basal insulin in formulations that have a fixed or free dosage. GLP-1 RAs may also help prevent renal issues. This essay's remaining sections are organised as follows: Section 1 discusses the introduction; Section 2 provides an explanation of the desired work; Section 3 presents an overview of the dataset and experimental results; and Section 4 provides a concluding analysis and perspective.

2. Proposed Work

The study explores the use of virtual patient simulations to improve glycaemic control and weight management strategies for comorbid obesity in type 2 diabetes. The objectives are as follows:

The study explores the relationship between obesity and type 2 diabetes mellitus (T2DM), the benefits of earlier treatment intensification and weight loss for achieving glucose control, and the long-term impact of these interventions on overall health outcomes in patients with T2DM.

The second objective is to present the latest clinical trial results on dual GIP/GLP-1 RAs' impact on A1C, obesity, and dyslipidemia and their implications for future practice.

The ultimate goal is to optimize T2DM and obesity care through a comprehensive approach combining lifestyle modifications and personalized pharmacologic strategies for weight loss and glycemic control.

Fig. 11.1 Block diagram for proposed work

K-Nearest Neighbour Classification: K-Nearest Neighbours (KNN) can be pictured as a tool that assists in selecting the most suitable neighbourhood for a new friend by involving three close friends with similar interests. KNN uses a simple rule: close friends likely share similar interests, which guides its decision-making process.

1. Measure distances: to find friends with similar interests, compare the interests of the new friend to your own.
2. Choose the closest friends: choose the K close, similar-minded friends as the nearest neighbours who share interests with the new friend.
3. Ask the friends' opinions: ask these friends for advice on which neighbourhood the new friend should join, as their opinions are valuable due to shared interests and proximity.

KNN is akin to seeking advice from friends who share similar interests and live nearby for decision-making. KNN is a machine learning algorithm that can sort objects based on similarity, recommend content on healthcare platforms, and identify unusual or unique items.

Pseudo code for KNN (a sketch of this procedure in Python follows the list below):
• Calculate the Euclidean distance between the points, d(x, xi), where i = 1, 2, ..., n.
• Sort the n Euclidean distances in non-decreasing order.
• Sort the n Euclidean distances in non-decreasing order.

• Take a value of k and consider the first k distances from the sorted list.
• Find the k points corresponding to these k distances.
• Let ki denote the number of points belonging to the ith class among the k points.
• Assign the new point x to the class i for which ki is largest.
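A minimal NumPy sketch of this pseudo code, written for illustration (the tiny training set and k = 3 are made-up values):

```python
import numpy as np
from collections import Counter

def knn_predict(x, X_train, y_train, k=3):
    # Step 1: Euclidean distance d(x, xi) for i = 1..n.
    dists = np.linalg.norm(X_train - x, axis=1)
    # Steps 2-4: sort the distances and take the k nearest points.
    nearest = np.argsort(dists)[:k]
    # Steps 5-6: count class membership ki and pick the largest.
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[5.1, 120], [3.0, 90], [6.2, 150], [2.5, 85]])  # toy features
y_train = np.array([1, 0, 1, 0])                                    # toy labels
print(knn_predict(np.array([5.0, 118]), X_train, y_train))          # -> 1
```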
4. Conclusion
3. Experimental Result

The diabetes dataset is collected from Kaggle. This dataset contains different features such as pregnancies, glucose, blood pressure, skin thickness, insulin, BMI, and age. From Fig. 11.2, it can be seen that among the features, diabetic pregnant women are exposed to age and obesity. Training is performed on the model to achieve early prediction.
References
1. Ama Moor, Vicky Jocelyn, et al. “Dyslipidemia in patients
with cardiovascular risk and disease at the University Teaching
Hospitalof Yaoundé, Cameroon, International journal of
Fig. 11.2 Features of dataset vascular medicine, 2017.
The experimental result is on obesity and type 2 diabetes, 2. A. Maxwell, R. Li, B. Yang, H. Weng, A. Ou, H. Hong, Z.
Zhou, P. Gong, and C. Zhang,: Deep learning architectures for
the benefits of early treatment intensification, and the long-
multi-label classification of intelligent health risk prediction,
term effects of these interventions on T2DM patients. It also
BMC Bioinf., vol. 18, no. S14, pp. 523–525, Dec. 2017.
presents clinical trial results on dual GIP/GLP-1 RAs. 3. A.Mohammadbeigi, E. Moshiri, N. Mohammadsalehi, H.
For training and prediction, use the KNN model. Figure 11.3 Ansari, and A. Ahmadi: “Dyslipidemia prevalence in Iranian
shows the KNNmodel varying the number of neighbours. The adult men: The impact of population-based screening on the
below graph shows that varying number neighbours, it is also detection of undiagnosed patients, “World J. Men’s Health,
show the comparison of training dataset and testing dataset. vol. 33, no. 3, p. 167, 2015.
4. Cui, Shiyue, et al: Research on risk prediction of dyslipidemia
in steel workers based on recurrent neural network and LSTM
neural network. IEEE Access 8 (2020): 34153-34161.
5. D. Wang, J. Fan, H. Fu, and B. Zhang: Research on
optimization of big data construction engineering quality
management based on RNNLSTM, Complexity, vol. 2018,
pp. 1–16, Jul. 2018.
6. G. Jain, M. Sharma, and B. Agarwal: Optimizing semantic
LSTM for spam detection, Int. J. Inf. Technol., vol. 11, no. 2,
pp. 239–250, Apr. 2018.
7. H. Tada, M.-A.Kawashiri, and M. Yamagishi: Comprehensive
genotyping in dyslipidemia: Mendelian dyslipidemia caused
by rare variants and mendelian randomization studies using
common variants,”J. Hum. Genet. vol. 62, no. 4, pp. 453–458,
Jan. 2017.
8. Ng, Marie, et al.: Global, regional and national prevalence of
overweight and obesity in children and adults during 1980–
2013: a systematic analysis for the Global Burden of Disease
Study2013.” The lancet, 384.9945, 766–781, 2014.
Fig. 11.3 Varying number of neighbours 9. N. Demirel, S. Özbay, and F. Kaya: The effects of aerobic and
anaerobic training programs applied to elite wrestlers on body
mass index (BMI) and blood lipids, J. Edu. Training Stud., vol. 6, no. 4, p. 58, Mar. 2018.
10. P. Bramlage, S. T. Azar, O. Okkeh, P. Brudi, B. M. Ambegaonkar, H. A. Hantash, S. Jambart, M. El Zaheri, R. Rachoin, A. Chafoun, and L. LaHood: Factors influencing dyslipidemia in statin-treated patients in Lebanon and Jordan: Results of the Dyslipidemia International Study, Vascular Health Risk Manage., vol. 10, p. 225, Apr. 2014.
11. S. Khan and T. Yairi: A review on the application of deep learning in system health management, Mech. Syst. Signal Process., vol. 107, pp. 241–265, Jul. 2018.
12. S. Hussain, J. Keung, A. A. Khan, A. Ahmad, S. Cuomo, F. Piccialli, G. Jeon, and A. Akhunzada: Implications of deep learning for the automation of design patterns organization, J. Parallel Distributed Comput., vol. 117, pp. 256–266, Jul. 2018.
Note: All the figures in this chapter were designed by the author.
12. A Literature Survey on Deep Learning Approach Used for Audio-to-Sign Conversion with Gesture Recognition for the Deaf and Dumb

B. Veerendra1, Research Scholar, Department of Computer Science and Engineering, GITAM University, Visakhapatnam, Andhra Pradesh, India
D. Ramakrishna2, Assistant Professor, Department of Computer Science and Engineering, GITAM University, Visakhapatnam, Andhra Pradesh, India

Abstract: Syntax is the process of arranging words and phrases in a language to form coherent sentences. Deaf and dumb people use a variety of signs in Sign Language to communicate with one another. Mastering sign language is a challenging endeavor, yet it is necessary in order to converse with the deaf and dumb: a hearing-impaired person may or may not comprehend the speaker, and the speaker may or may not comprehend the hearing-impaired person's sign language. Therefore, anyone who wishes to have an understandable conversation with people who are deaf and dumb must learn Sign Language. The technology utilized in this proposed model is deep learning, and it is a desktop application designed and developed using the Python programming language. A Convolutional Neural Network (CNN) is the deep learning technique used for analyzing the captured images and recognizing the signs. The model can translate hand gestures into text and audio into sign language. We use Python programming and deep learning techniques to analyze the web camera input. The proposed approach uses a Convolutional Neural Network (CNN), which makes it possible to categorize pictures of different hand and sign movements for the alphabets.
In this proposed model, there are two main features that help to reduce communication difficulties with deaf and dumb people: Hand Gesture Recognition and Audio to Sign Language Conversion. In Audio to Sign Language Conversion, the speaker's audio is recognized by the model, followed by the conversion of speech to text and then of text into sign language. In Hand Gesture Recognition, the user's hand gestures are captured from the web camera, and using the Convolutional Neural Network model, the gestures are recognized and displayed in text form. The proposed paper's primary goal is to facilitate communication with people who are deaf and dumb: it fills the communication gap between the deaf and dumb and the general public. Text and images representing the model's output are displayed on the desktop computer's screen.
Keywords: Audio to sign conversion, Sign to text, Hand gesture recognition, Convolutional neural network (CNN), Deep learning

1. Introduction
Deep learning technology is used in making this model. Deep learning is also known as deep structured learning. A subfield of artificial intelligence (AI) and machine learning (ML), deep learning mimics how people acquire particular kinds of expertise; it is a method of automated predictive analytics. Deep learning is used because it helps to simplify and speed up processes like gathering, analyzing, and interpreting massive volumes of data. Deep learning methods can be applied to various fields, including computer vision, speech
1vbethine@gitam.in, 2rdamodar@gitam.edu

DOI: 10.1201/9781003529231-12
recognition, natural language processing, machine translation, etc., where they have produced results comparable to human expert performance.
In the proposed article, a Convolutional Neural Network (CNN) is used for analyzing the input, that is, the camera feed. This class of neural network, used in deep learning, is often referred to as a ConvNet. It is primarily used for the analysis of pictures and videos, but it can also be applied to other classification and data analysis problems. It has the ability to recognize and comprehend patterns. Convolutional layers are an additional set of hidden layers added to the regular hidden layers, which sets it apart from other deep learning models. This article aims to ease the communication between the hearing-impaired and common people. Deep learning and machine learning are commonly used nowadays in many domains: machine learning techniques are used for data analysis, while deep learning techniques are used for speech recognition software, natural language processing (NLP), image recognition tools, etc. In the existing systems, machine learning and deep learning techniques are used for identifying hand gestures, recognizing letters written in the air, and translating audio into sign language. Some of these systems can only recognize the user's hand gestures, or can only convert speech to sign language, or can only identify the characters written in the air, by applying different deep learning and machine learning techniques. Various algorithms are used in those existing systems, such as Artificial Neural Networks (ANN), Recurrent Neural Networks (RNN), and Long Short-Term Memory (LSTM), which are deep learning techniques used for recognizing gestures made by the user.
2. Literature Survey
A model with the goal of improving human-device interaction uses a recurrent neural network (RNN) to identify user hand gestures. To gather information from the five hand signals, a sine antenna is needed; an RNN is then used to condition the data. Audio conversion into sign language is not supported by this model. [1]
A CNN-based model for human-computer interaction was proposed. The features proposed by this model are primarily intended for differently-abled people: Hand Gesture Recognition and Audio to Sign Conversion. [2]
A device-free hand gesture recognition system built on deep learning and Channel State Information (CSI) models. This system's capabilities are restricted to reading handwritten numbers from 0 to 9 in the air; it is also unable to translate speech into sign language. [3]
A model for a web-based tool that can translate speech to signs. If the associated terms cannot be located in the database, the system searches for synonyms of the term in question and substitutes them. It lacks the conversion of hand gestures into text. [4,5]
A model that makes use of a CNN to identify the user's hand gestures from a webcam. The text appears on the output screen if the gesture is recognized, and it is able to convert that text to audio by using the gTTS library. [6,7]
An approach for creating a system that enables paralyzed individuals to communicate with each other using hand gestures. It can convert signs to text, and in the event of an emergency, individuals can be informed by this system. [8]
A model that incorporates the ability to recognize air-written text. Using a deep CNN architecture, it is capable of tracking written digits from 0 to 9 at the fingertips while in the air. The audio-to-sign conversion and sign detection features are absent from this model. [9,10]
3. Methodology
The suggested model was created with CNN, a deep learning technique, and the Python programming language. It fills the communication gap between the deaf and dumb and the general public. The proposed model contains two features, which help to make a mutually understandable conversation with hearing-impaired people. These features are implemented using deep learning and machine learning techniques, which help to analyze the input from the camera feed and provide appropriate results with good accuracy. The following are the two features included in this article:
• Hand Gesture Recognition
• Audio to Sign Conversion
3.1 Audio to Sign Conversion
The SpeechRecognition Python library is used by the Audio to Sign feature to translate speech to text. The speech is broken into words, and those words are broken into letters. After that, the signs that correspond to the letters in the sentence are shown on the screen.
3.2 Hand Gesture Recognition
This feature recognizes the user's hand gestures using deep learning. That is, the hand gestures made by the second person within the frame of the camera are identified using a 2D CNN model, and the respective text related to the sign performed in front of the camera is displayed.
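A minimal sketch of the Section 3.1 pipeline follows, assuming a folder signs/ holding one image per letter (for example signs/a.png); the folder layout and the choice of the Google recognizer backend are assumptions, not details given by the chapter.

```python
# Speech -> text -> letters -> sign images, per Section 3.1.
import speech_recognition as sr
from PIL import Image

recognizer = sr.Recognizer()
with sr.Microphone() as source:          # capture the speaker's audio
    audio = recognizer.listen(source)

text = recognizer.recognize_google(audio)  # speech to text
for word in text.lower().split():          # sentence -> words
    for letter in word:                    # words -> letters
        if letter.isalpha():
            # show the sign that corresponds to this letter
            Image.open(f"signs/{letter}.png").show()
```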
Fig. 12.1 Automated object recognition with IoT for visually impaired users

3.3 Architecture
Conversing with deaf and dumb people is a difficult endeavor. Sign Language is used for communicating with them, but it is very difficult to learn.
Fig. 12.2 Architecture diagram for CNN model
Therefore, the goal is to facilitate clear and effective communication between the general public and those who are hard of hearing. The objectives of the paper are:
• To develop an algorithm that can identify pictures of different hand gestures and characters written by hand that are spoken aloud.
• To improve the accuracy and efficiency of the existing models.
• To make mutual communication with deaf and dumb people easier.
• To assist the dumb and deaf in communicating with others in a comprehensible manner.
3.4 Datasets Used in this Article
In this paper, to recognize the different types of hand gestures for English or Hindi alphabets, three different types of datasets are used in the training phase: one for Hand Gesture Recognition, one for Audio to Sign Conversion (both downloadable from Kaggle), and one for the Air Board feature.
Fig. 12.3 Sample dataset of hand gesture recognition
In Hand Gesture Recognition, two datasets are used, one for training and one for testing. The two datasets are .CSV files. To recognize the English alphabets for the hand gestures made by the user, the CNN model is trained with a dataset in which the data is stored in the form of binary numbers.
Fig. 12.4 Trained dataset
Fig. 12.5 Tested dataset
Figure 12.4 shows the training dataset with which the Hand Gesture Recognition model is trained, and Fig. 12.5 shows the tested dataset of the Hand Gesture Recognition.
In Audio to Sign Language Conversion, a dataset is used for recognizing the sign language for the given input speech. This dataset consists of images of English alphabetical signs, in which each sign image is saved under its respective alphabet. All these images are saved in a folder, and while executing the model, this folder is used for identifying the appropriate sign.
The following (Fig. 12.6) is the dataset used in identifying the appropriate sign language for the given input speech/audio:
Fig. 12.6 Sample dataset of audio to sign language conversion
In the Air Board, the dataset used for training and testing the model is the Handwritten Hindi Character dataset, which is downloaded from the UCI Machine Learning Repository. The dataset consists of 32*32 PNG-format images, so a CSV file conversion is required: all the images are fetched, and the binary-formatted value of each image is stored in a .csv file.
Over 300 sample images are taken for every Hindi character. The dataset used in the Air Board consists of images of every handwritten Hindi character, stored in folders. The screenshot below (Fig. 12.7) shows the folders that contain over 300 images of handwritten Hindi characters:
Fig. 12.7 Image dataset of Hindi handwritten characters
Figure 12.8 shows the sample dataset of the handwritten letter 'ka'. Likewise, the model is trained with 36 different handwritten Hindi characters, each with up to 300 images. Figure 12.9 shows the sample .csv file of handwritten Hindi characters that is used in the training process.

Fig. 12.8 Sample dataset of the Hindi Handwritten Character ‘ka’

Fig. 12.9 CSV file of Hindi handwritten characters
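The PNG-to-CSV step behind Fig. 12.9 can be sketched as follows; the directory name hindi_chars/ with one sub-folder per character, and the binary threshold of 127, are assumptions rather than the authors' exact settings.

```python
# Convert 32x32 character PNGs into one labelled row per image in a CSV file.
import csv
import os
import numpy as np
from PIL import Image

with open("hindi_chars.csv", "w", newline="") as out:
    writer = csv.writer(out)
    for label in sorted(os.listdir("hindi_chars")):      # one folder per character
        folder = os.path.join("hindi_chars", label)
        for name in os.listdir(folder):
            img = Image.open(os.path.join(folder, name)).convert("L").resize((32, 32))
            pixels = (np.array(img) > 127).astype(int).flatten()  # binary values
            writer.writerow([label, *pixels.tolist()])   # label + 1024 pixel bits
```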

4. Implementation
In this paper, a module is a collection of source files and build settings that divide the article's functionality into discrete units. There are two modules in the proposed model: audio-to-sign-language conversion and hand gesture recognition (sign-to-text conversion). These two features differ from one another in terms of functionality; each performs a different task.
• Hand Gesture Recognition converts sign to text.
• Audio to Sign Language Conversion converts speech to sign language.
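A minimal sketch of a sequential 2D CNN of the kind the chapter describes for sign classification is shown below; the layer sizes, the 28x28 input shape, and the 26-letter output are illustrative assumptions.

```python
# Sequential 2D CNN for classifying hand-gesture frames into letters.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),            # one grayscale gesture frame
    layers.Conv2D(32, 3, activation="relu"),    # convolutional feature extractors
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(26, activation="softmax"),     # one class per English alphabet
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
# model.save("SLT.h5")                          # later loaded by stt.py (Table 12.2)
```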
In this proposed article, there are two main modules. These two modules facilitate clear and effective communication between the general public and those with hearing impairments. They are:
4.1 Audio to Sign Conversion
The task of translating audio into sign language falls to this module. The input for this is speech or audio. The speech is split into words, and then these words are split into letters. Next, the folder containing the signs is searched through the images to find those letters. Subsequently, the corresponding signs for each letter in the sentence appear on the screen.
4.2 Sign to Text Conversion
This module is in charge of translating signs into text; more accurately, it recognizes the sign that is in front of the camera. Here, the model is trained and tested with a dataset using a 2D CNN model. When the user shows signs in front of the camera, it loads the camera feed and then displays the respective text related to the sign performed by the user.
Some Python modules are used to implement these modules, among them pandas, tkinter, OpenCV, NumPy, and SpeechRecognition. The Python modules used in designing the model are listed below:
Table 12.1 Modules/APIs used
numpy : Library in Python used for accessing and storing image pixels in array form.
pandas : Library used in Python for accessing CSV files.
keras : Used for loading the Keras model (machine learning library).
SpeechRecognition : Google-based API/module to detect speech and convert it to text.
tkinter : Graphical module of Python used to design and develop the GUI.
cv2 : Computer vision module used to access the camera feed of the PC.
PIL : Python Image Library, used to handle image files in Python.
ctypes : Used for getting the resolution of the screen.
os : Used to access operating system functionality in code.
pylab : Used to plot pylab graphs (display sign images while converting speech to sign).
deque : Used to enqueue and dequeue image pixels while recognizing air text written by the user.
easyocr : API used for doing OCR of Hindi characters from an image.
IPython.display : Used to handle image files.
threading : An OS concept used to perform various processes at a time.
In this model, there are some programmer-defined modules that are used for converting audio/speech into its appropriate sign language and for identifying the signs and letters written in the air. The following are the programmer-defined modules, which are implemented using the Python programming language.
Table 12.2 Programmer-defined modules
main.py : The foremost file that the user runs to operate and access the features included in the proposed model. When the user runs it, a page is shown with buttons, where each button has a different functionality.
stt.py : Used to convert sign language to text. It displays the camera and gives the camera feed to the app.py file.
SLT.h5 : Loaded in stt.py to convert sign language to text. It consists of the saved training and testing datasets used in the recognition of sign language.
app.py : Contains the code to analyse the camera feed and then process it using machine learning.
char_recognition.py : Contains the code that can identify the Hindi characters drawn in the air.
savedModel.h5 : Used in char_recognition.py to identify Hindi characters drawn in the air. It contains the saved training and testing dataset for written character recognition in the air.
The proposed article "A Literature Survey on Deep Learning Approach Used for Audio-to-Sign Conversion with Gesture Recognition for the Deaf and Dumb" has two main features and two workflows: Hand Gesture Recognition, which recognizes the hand gestures of English alphabets, and Audio to Sign Language Conversion, which converts speech/audio to sign language. Every feature in this model is designed using a sequential 2D CNN model built with Python. Each feature is designed with distinct workflows and functions for each task it performs. They are as follows:
4.3 Audio to Sign Conversion Flow
Every time an individual speaks, e.g. "Hi everybody", the speech is identified and translated into text using a speech
recognition library. Then, using Python programming, the text is divided into words and then into characters. The input is classified, and the respective hand gesture for each character is identified. Next, as seen in Fig. 12.10, the output is shown in a window on the user's desktop as a sequence of images in sign language. The output will be NONE, and only in sign language, if the user's speech is not clear enough to be understood. The user can tell whether the model is showing the correct output from the output images, which contain the related English alphabet in the upper right corner of every image, as shown in Fig. 12.10.

Fig. 12.10 Flow of audio-to-sign conversion
4.4 Hand Gesture Recognition Flow
When a user presents certain signs to the camera as depicted in Fig. 12.11, the system records the user's hand movements from the frame of the camera and then performs pre-processing. At this stage, OpenCV is used to capture hand gestures efficiently by eliminating background disturbances, and using feature extraction and classification, the 2D CNN model recognizes the hand gestures. Following the identification of the hand gesture, the user receives a text message with the output displayed on the desktop screen; the text message appears in the upper left corner of the camera frame. Figure 12.11 displays the internal steps performed by the model when the user makes hand gestures in front of the camera.
Fig. 12.11 Flow of hand gesture recognition
5. Experimental Results
Fig. 12.12 Hand gesture recognition of the letter 'o'
Fig. 12.13 Hand gesture recognition of the letter 'x'
Fig. 12.14 Voice output of hand gesture
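The capture-classify-overlay loop behind Figs 12.12 and 12.13 can be sketched as below, assuming the trained model was saved as SLT.h5 and that frames are rescaled to the model's input size; the preprocessing details are assumptions, not the authors' exact code.

```python
# Webcam loop: capture a frame, classify it, overlay the predicted letter.
import cv2
import numpy as np
from tensorflow import keras

LETTERS = "abcdefghijklmnopqrstuvwxyz"
model = keras.models.load_model("SLT.h5")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)      # drop background colour
    roi = cv2.resize(gray, (28, 28)).astype("float32") / 255.0
    probs = model.predict(roi.reshape(1, 28, 28, 1), verbose=0)[0]
    letter = LETTERS[int(np.argmax(probs))]
    cv2.putText(frame, letter, (10, 30),                # text in the upper left corner
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("Hand Gesture Recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```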
Fig. 12.15 Conversion of audio to HandSign
6. Conclusion
It is challenging to communicate with deaf and dumb people if you don't know sign language: common people cannot easily understand it, and those who are hard of hearing cannot comprehend what the common people are saying. As a result, there is a communication issue between them. The goal of this article is to improve communication within this community as well as with the outside world. The suggested model primarily includes two features, hand gesture recognition and audio-to-sign-language conversion, which make it easier to speak with people who are hard of hearing. The technologies used in the proposed article demonstrate the potential of deep learning and computer vision, and they improve communication accessibility for people with disabilities.
This system provides a high gesture recognition rate when the gestures are made against a clear background, with accuracy of more than 95% and a short response time. Compared to other models, the article's overall average accuracy is higher. To improve usability and effectiveness for users, this article expands the vocabulary of recognized signs and improves accuracy. The proposed model can offer a simple, effective method for communicating in sign language: it recognizes and interprets hand gestures and what we write in the air. The inputs for this article are audio, hand gestures made by the user, and letters that the user writes in the air. It enhances interaction, and it allows letters to be recognized and predicted with high accuracy. The importance of this article is that it is well designed and trained with a dataset of hand gestures. Hence, this paper is able to reduce the communication problem between the deaf and dumb and others.
References
1. G. Park, V. K. Chandrasegar and J. Koh, "Hand Gesture Recognition using Deep learning Method," 2021 IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting (APS/URSI), 2021, pp. 1347–1348, doi: 10.1109/APS/URSI47566.2021.9703901.
2. S. Pariselvam, D. N., D. S. and S. B., "An Interaction System Using Speech and Gesture Based on CNN," 2020 International Conference on System, Computation, Automation and Networking (ICSCAN), 2020, pp. 1–5, doi: 10.1109/ICSCAN49426.2020.9262343.
3. Z. Wang et al., "WiDG: An Air Hand Gesture Recognition System Based on CSI and Deep Learning," 2021 33rd Chinese Control and Decision Conference (CCDC), 2021, pp. 1243–1248, doi: 10.1109/CCDC52312.2021.9602438.
4. Q. M. Areeb and M. Nadeem, "Deep Learning Based Hand Gesture Recognition for Emergency Situation: A Study on Indian Sign Language," 2021 International Conference on Data Analytics for Business and Industry (ICDABI), 2021, pp. 33–36, doi: 10.1109/ICDABI53623.2021.9655842.
5. K. Tiku, J. Maloo, A. Ramesh and I. R., "Real-time Conversion of Sign Language to Text and Speech," 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), 2020, pp. 346–351, doi: 10.1109/ICIRCA48905.2020.9182877.
6. A. Yadav, R. Saxena, B. Saini, V. K. Verma and V. Srivastava, "Audio to Sign Language Translator Web Application," 2021 International Conference on Computational Performance Evaluation (ComPE), 2021, pp. 321–326, doi: 10.1109/ComPE53109.2021.9751857.
7. A. Dixit et al., "Audio to Indian and American Sign Language Converter using Machine Translation and NLP Technique," 2022 Third International Conference on Intelligent Computing Instrumentation and Control Technologies (ICICICT), 2022, pp. 874–879, doi: 10.1109/ICICICT54557.2022.9917614.
8. T. A. Siby, S. Pal, J. Arlina and S. Nagaraju, "Gesture based Real-Time Sign Language Recognition System," 2022 International Conference on Connected Systems & Intelligence (CSI), 2022, pp. 1–6, doi: 10.1109/CSI54720.2022.9924024.
9. Reddy Navya and Ramisetty Upendra, "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms," Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 3, 2023.
10. S. Gupta, R. Thakur, V. Maheshwari and N. Pulgam, "Sign Language Converter Using Hand Gestures," 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS), 2020, pp. 251–256, doi: 10.1109/ICISS49785.2020.9315964.
11. G. Bastas, K. Kritsis and V. Katsouros, "Air-Writing Recognition using Deep Convolutional and Recurrent Neural Network Architectures," 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR), Dortmund, Germany, 2020, pp. 7–12, doi: 10.1109/ICFHR2020.2020.00013.
Note: All the figures and the table in this chapter were designed by the author.

13. Federated Learning Approach Based on the MFCC for Speech Emotion Recognition

Banda SNV Ramana Murthy1


Sr. Assistant Professor, Department of CSE – AIML & DS,
Aditya College of Engineering & Technology (A), Surampalem, Andhra Pradesh, India
Research Scholar in GITEU, Gunupur, Odisha, India
Veluri Ravi Kishore2
Associate Professor, Department of CSE, Aditya Engineering College (A),
Surampalem, Andhra Pradesh, India

Abstract: Technological advancements in the field of psychological assessment enable machines to accurately determine user emotions through the Recognition of Emotions in Speech method, which accurately predicts human emotions through speech and thereby improves psychological assessment. Recognition of Emotions in Speech can identify emotions such as impartial, at ease, joyful, depressed, scared, furious, disgusted, and shocked. The paper presents a federated learning method for emotion recognition in speech based on the Mel Frequency Cepstral Coefficient (MFCC). The study employs the RAVDESS dataset and the Federated Learning System for Cognitive Radio to develop speech-emotion identification classifiers. In the source-filter model of speech, the vocal tract is represented by MFCCs, which are important speech features extracted for recognition tasks. When attempting to extract spectral information from expressive speech, the most widely used method is the Fourier transform of the signal processed via a Mel-spaced filter bank. The Federated Learning Architecture is used in an experimental speech-emotion recognition scenario to accurately extract spectral information from expressive speech using the RAVDESS dataset and the Federated Learning System for Cognitive Radio.
Keywords: Federated learning, Recognition of emotions in speech, Mel frequency cepstral coefficients, Ravdess dataset

1. Introduction
Federated learning is a distributed method for training machine learning models. Client devices do not have to exchange data with remote servers in order to function; rather, by using the raw data to build the model locally on edge devices, data privacy is enhanced. A new strategy called federated learning (FL) seeks to train models without moving data to a central repository. It puts teamwork and experience building first, in contrast to conventional machine learning techniques. FL is utilized in mobile apps, IoT, transportation, healthcare, and defense. In terms of technical elements like platforms, hardware, software, and data privacy, it is less well known. Conventional machine learning techniques are unable to centrally collect and exchange customer data due to privacy constraints. These algorithms are trained on a server via a pipeline or, in less ideal cases, by sending models to devices, which are unable to adapt quickly enough. It takes a lot of data to create training instances. FL returns the models it trains at the device level to the main server by aggregating and redistributing them. This approach is a low-cost option, since it performs well with low-cost machine learning models on devices such as smartphones and sensors. Figure 13.1 illustrates the general design of FL.
1ramanamurthy.banda@gmail.com, 2ravikishore1985@aec.edu.in

DOI: 10.1201/9781003529231-13
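The aggregate-and-redistribute loop described above can be sketched as federated averaging on a simple model; the model shape, the client count, and the synthetic per-device data below are illustrative assumptions, not the chapter's exact protocol.

```python
# Minimal federated averaging sketch: local training, server-side averaging.
import numpy as np

rng = np.random.default_rng(0)
# Five devices, each with its own private (features, labels) data.
clients = [(rng.normal(size=(50, 13)), rng.integers(0, 2, 50)) for _ in range(5)]

def client_update(w, data, lr=0.1, epochs=5):
    X, y = data
    w = w.copy()
    for _ in range(epochs):                   # local training; raw data never leaves
        p = 1 / (1 + np.exp(-X @ w))          # logistic prediction
        w -= lr * X.T @ (p - y) / len(y)      # one gradient step per epoch
    return w

w_global = np.zeros(13)                       # e.g. one weight per MFCC coefficient
for round_ in range(10):
    local = [client_update(w_global, d) for d in clients]
    w_global = np.mean(local, axis=0)         # server aggregates and redistributes
```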
FL is very helpful in the psychological evaluation of speech emotion assessment; nonetheless, there are still issues, especially in handling speech emotional data. Large data sets are necessary for accurate models, but organizing them can be challenging because of the variety of forms, structures, and content. Privacy concerns can be minimized, without sacrificing usefulness, by avoiding the centralization of sensitive data.
Fig. 13.1 Federated learning architecture
Speech emotion identification is gaining popularity, enhancing speech recognition systems' functionality in criminal investigations, intelligent assistance, surveillance, and healthcare. Researchers have been conducting investigations to extract effective speech features for speech emotion recognition, which requires converting raw speech data into suitable forms for processing [1]. In speech emotion identification, spectral features such as linear predictor coefficients (LPC), MFCC, and linear predictor correlation coefficients (LPCC) perform better than linear features [2]. One tool for determining stress in speech involves the Teager energy operator (TEO); these standards, however, fall short of characterizing complicated emotional states [3]. Cowie et al.'s study [4] finds the sound relationships between voice quality and phrase boundaries, pitch, voice level, and temporal structures. Two methods are used to assess speech quality: the first involves a vocal tract filter, and the second relies on the glottal signal's properties. With a baseline accuracy of 65.5%, the speaker-independent speech emotion recognition system classifies speaking tenors using an ongoing hidden Markov model (HMM). Combining MFCC with jitter, shimmer, or both improved classification accuracy [5,6,7,8].
This study explores Federated Learning (FL) architecture in speech-emotion platforms, protocols, and technologies, aiming to understand its impact on various applications. The format of the paper is as follows: Section 1 contains the introduction. Section 2 presents the literature survey: deep learning for the recognition of emotions in speech in Section 2.1, and the use of extreme learning machines and deep neural networks to identify emotions in speech in Sections 2.2 and 2.3. Section 3 describes the proposed system, including the MFCC procedure and the methodology, Section 4 presents the experimental results, Section 5 concludes, and Section 6 lists the references.
Fig. 13.2 The application of federated learning architecture in a speech emotion setting
2. Literature Survey
The literature review is a crucial step in the software development process for federated learning. The time factor, economy, and corporate strength must be worked out before designing the application model. We may begin creating the application as soon as all of these elements have been verified and approved [3]. The main focus of a literature review is all the prior research that has been conducted by various users, as well as the benefits and drawbacks of those earlier models. The primary purpose of this literature review is to compile a list of resources for the proposed application.
2.1 Deep Learning for the Recognition of Emotions in Speech
The deep learning methods for speech recognition of emotions have been thoroughly studied in this work. Natural emotion classification, including happiness, grief, surprise, boredom, disdain, anxiety, and rage, has been the focus of recent research on deep learning techniques such as DBM, RNN, DBN, CNN, and AE. These techniques offer the efficacy of shared weights in addition to basic model training. Limitations of deep learning systems include a large internal layer-wise architecture, over-learning during layer-wise information memorization, and reduced effectiveness for temporally variable input data. This work serves as a foundation for assessing the strengths and weaknesses of existing deep learning methods. Additionally, it points out a few potential directions for enhancing speech systems' ability to recognize emotions [9].
2.2 Employing Extreme Learning Machine and Deep Neural Network to Identify Emotions in Speech
The study suggests using a deep neural network (DNN) to estimate emotional states in speech segments, identifying these emotions using an ELM, and creating an utterance-level feature. This paper evaluates the advantages and disadvantages of current deep learning techniques and suggests potential ways to enhance emotional recognition in speech systems. The results of the experiment demonstrate that this approach greatly enhances the ability to identify emotions from speech signals, and using neural networks to extract emotional information from low-level sound characteristics is very promising [10].
2.3 Recognize Emotions in Speech with Deep Learning
Every audio clip had its own category. As such, the DNN was not aware of the actor's actual context, beat, or other aspects of the performance. This has advantages on the one hand, but we think that a context-dependent approach that makes use of recurrent networks could significantly improve the results. Even though the results show a high degree of accuracy, we intend to keep improving the strategy by adding recurrent neural networks, employing over-sampling, or using larger data sets. The model will achieve satisfactory results across multiple classes and data sets, enhancing reliability, accuracy, and prediction confidence [11]. The suggested systems integrate HNR with MFCC, ZCR, and TEO characteristics and use SVM to identify emotions. They implement an auto-encoder dimension to condense features from the RML dataset. Using a Natural Language Processing algorithm, the system assesses, gauges, and stores resumes, translating them into the format and language of the candidate [12].
3. Proposed System
Our approach uses artificial neural networks to classify speech input into multiple emotion groups using the MFCC feature on federated learning. The advantage of using neural networks is that we can classify a wide range of emotions in a variable-length audio clip in a real-time setting [12]. This method makes it feasible to strike a respectable balance between the processing load and the accuracy of the real-time processes' performance. The proposed system offers several advantages: it overcomes the limitations of the traditional system, and its accuracy rate is higher than that of the traditional system. This section discusses the proposed algorithm for recognizing emotions, MFCC, and its outline [13].
3.1 Procedure of MFCC
After windowing the speech signal into frames, the Fast Fourier Transform (FFT) is used to calculate the power spectrum of each frame in the MFCC computation. A Mel-scale filter bank is then used to process the power spectrum of the signal. The DCT is applied to the resulting Mel log energies to convert them to cepstral coefficients. The components of MFCCs are the first few DCT coefficients, which describe the broad spectral contour. The first DCT coefficient represents the average power in the spectrum [14]. The second coefficient, which is connected to the spectral centroid, approximates the broad shape of the spectrum. MFCCs are thus the amplitudes of the resulting spectrum, derived from a signal's Fourier transform, the logs of the powers on the Mel scale, and the discrete cosine transform of the list of Mel log powers [15,16].
3.2 Methodology of Speech Emotion Recognition Using MFCC
Like any other machine learning task, the vocal emotion detection system uses a model that needs to be fine-tuned in order to increase its performance; the flowchart offers a visual summary of the process. Collecting data is the first and most important step: data is the basis for all decisions a created model will make, and the model continuously learns from the data fed to it. Using the collected data, a series of machine learning operations is performed in the second step, called feature engineering; this technique addresses the several issues related to data representation and its quality. In the third phase, frequently regarded as the core of an ML task, a procedure-based model is constructed: using an ML technique to learn about the data, the model trains itself to respond to any new data it encounters. Evaluating the developed model's performance is the last phase. In order to assess the effectiveness of various algorithms, developers frequently go through the same procedure of creating a model and analyzing it; the optimal machine learning method for the task is selected with the help of the comparison results.
4. Experimental Result
The study aimed to investigate the effectiveness of emotion recognition and model building techniques in enhancing human understanding and interaction. The Federated Learning Architecture is utilized in an experimental speech-emotion recognition scenario, utilizing the RAVDESS dataset and the Federated Learning System for Cognitive Radio, and recovering spectral information from expressive speech.
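The MFCC extraction step used on RAVDESS-style clips (frame, FFT power spectrum, Mel filter bank, DCT, as in Section 3.1) can be sketched with librosa, which wraps that whole chain; the file path, sampling rate, and parameter values below are assumptions rather than the study's exact configuration.

```python
# Extract a fixed-length MFCC feature vector from one expressive-speech clip.
import librosa
import numpy as np

signal, sr = librosa.load("ravdess/Actor_01/03-01-05-01-01-01-01.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13,
                            n_fft=512, hop_length=256, n_mels=40)
features = np.mean(mfcc, axis=1)   # average over frames: one 13-dim vector per clip
print(features.shape)              # (13,)
```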
Fig. 13.3 GeMAPS feature extraction part emotion recognition
4.1 Cost Functions Used in the Model
In this study, two distinct types of loss calculation were used: focal loss (FL) and categorical cross-entropy loss (CCE). CCE is a loss function that is frequently utilized in a variety of deep-learning-based emotion identification techniques [14] and is computed as follows:
CCE = −∑_{l=1}^{N} p_l log(m_l)   (1)
where N represents the total number of emotion classes, p_l stands for the emotion class's ground truth, and m_l stands for the class's predicted probability. Audio samples were taken of eight actors, four male and four female, for this study. The audio samples from two other actors, a man and a woman, were used as test data. In the testing on emotion recognition, we performed a five-fold cross-validation test by alternating the gender pair that served as the test data [16].
Fig. 13.4 SI confusion matrix of the emotional database
The model's recognition accuracy was evaluated using balanced accuracy (BA) and empty accuracy (EA) in the following manner:
BA = (number of correctly classified audio samples) / (total number of test audio samples)   (2)
EA = (1/L) ∑_{l=1}^{L} k_l   (3)
where k_l, the number of correctly identified emotions for class l, is obtained from the resulting confusion matrix.
The programme has a 91% accuracy rate in identifying 100 persons, based on 91 correct predictions made out of 100 people:
Accuracy = (1 + 90) / (1 + 90 + 1 + 8) = 0.91
But only one of the nine face expressions is accurately identified by the model, leaving eight of the nine faces unidentified. This implies that the model is not as good as one that predicts face expressions consistently. When working with a class-imbalanced data set that has a considerable disparity between positive and negative labels, accuracy alone is not sufficient.
5. Conclusion
The Recognition of Emotions in Speech method improves psychological assessment by accurately predicting human emotions through speech. This method uses the Mel Frequency Cepstral Coefficient (MFCC) to identify emotions such as impartial, at ease, joyful, depressed, scared, furious, disgusted, and shocked. The studied RAVDESS dataset and
the Federated Learning System for Cognitive Radio were used to develop speech-emotion identification classifiers.
Acknowledgement
My supervisor and other staff of the CSE and AIML departments provided the datasets, and their assistance in this respect is gratefully acknowledged.
References
1. B. Schuller, S. Steidl, and A. Batliner: The Interspeech 2009 emotion challenge, Interspeech, pp. 312–315, 2009.
2. B. S. Atal: Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification, J. Acoust. Soc. Am., vol. 55, no. 6, pp. 1304–1312, 1974.
3. C.-C. Lee, E. Mower, C. Busso, S. Lee, and S. Narayanan: Emotion recognition using a hierarchical binary decision tree approach, Speech Communication, vol. 53, no. 9–10, pp. 1162–1171, Nov. 2011.
4. F. Burkhardt, A. Paeschke, M. Rolfes, W. F. Sendlmeier, and B. Weiss: A database of German emotional speech, Interspeech, vol. 5, pp. 1517–1520, 2005.
5. Gupta V., Shankar R. S., Kotha H. D., and Raghaveni J.: Voice identification in Python using the hidden Markov model, International Journal of Advanced Science and Technology, vol. 29, 2020.
6. H. Cao, R. Verma, and A. Nenkova: Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech, Computer Speech and Language, vol. 28, no. 1, pp. 186–202, Jan. 2015.
7. M. M. H. El Ayadi, M. S. Kamel, and F. Karray: Speech emotion recognition using Gaussian mixture vector autoregressive models, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), vol. 4, pp. IV-957–IV-960, 2007.
8. R. Banse and K. R. Scherer: Acoustic profiles in vocal emotion expression, J. Pers. Soc. Psychol., vol. 70, no. 3, pp. 572–587, 1996.
9. S. Wu, T. H. Falk, and W.-Y. Chan: Automatic speech emotion recognition using modulation spectral features, Speech Communication, vol. 53, no. 5, pp. 768–785, May 2011.
10. Soegaard, M. and Friis Dam, R.: The Encyclopedia of Human-Computer Interaction, 2nd edition, 2013.
11. T. L. Nwe, S. W. Foo, and L. C. De Silva: Speech emotion recognition using hidden Markov models, Speech Communication, vol. 41, no. 4, pp. 603–623, Nov. 2003.
12. V. Hozjan and Z. Kačič: Context-independent multilingual emotion recognition from speech signals, International Journal of Speech Technology, vol. 6, no. 3, pp. 311–320, 2003.
13. W. Dai, D. Han, Y. Dai, and D. Xu: Emotion recognition and affective computing on vocal social media, Information & Management, Feb. 2015.
14. Reddy Navya and Ramisetty Upendra: Predict early pneumonitis in health care using hybrid model algorithms, Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 3, 2023.
15. K. R. Rao and P. Yip: Discrete Cosine Transform: Algorithms, Advantages, Applications, Academic Press, DOI: 10.1016/c2009-0-22279-3, ISBN 012580203X, August 1990.
16. Zhang, J., Yin, Z., Chen, P., and Nichele, S.: Emotion recognition using multi-modal data and machine learning techniques: a tutorial and review, Information Fusion, vol. 59, pp. 103–126, DOI: 10.1016/j.inffus.2020.01.011, 2020.
17. Zhenjie Song: Facial expression emotion recognition model integrating philosophy and machine learning theory, Frontiers in Psychology, vol. 12, 2021.
Note: All the figures in this chapter were designed by the author.
14. Automated Object Recognition with IoT for Visually Impaired Users

JMSV Ravi Kumar1


Associate Professor, Dept of Information Technology,
SRKR Engineering College(A), Bhimavaram, AP
M. Babu Reddy2
Professor, Dept of Computer Science, Krishna University
M. Srikanth3
Assistant Professor, Dept of Information Technology,
SRKR Engineering College(A), Bhimavaram, AP
D. Ratna Giri4
Associate Professor, Dept of Information Technology,
SRKR Engineering College(A), Bhimavaram, AP

Abstract: This paper aims to help the visually impaired lead more autonomous lives. Nowadays, technology plays a significant role in meeting everyone's requirements in our technologically advanced society, and it is also extremely important in the lives of those with physical disabilities. Consider blind individuals: no matter how modern the technology is, they might still not be able to use it, since they cannot see. They are entirely reliant on other people for even the most menial of tasks, not to mention equipment. We propose a gadget with cutting-edge technology that enables the visually impaired to accomplish their own tasks instead of relying on others, thus addressing the aforementioned gaps in accessibility. For object detection, the app uses image processing techniques, and for voice output, it employs speech synthesis. The technology aims to provide visually impaired people with real-time audio or vocal information about things scanned by their mobile cameras. A substantial and extensively researched subject in computer vision, image detection of moving objects has found applications in domestic, commercial, and industrial settings. Current methods have a number of drawbacks, such as poor accuracy and performance, stemming from issues like insufficient analysis of the trained data, excessive dependence on object motion, and the inability to distinguish between objects. So, to quickly and accurately recognise the item, the Fast R-CNN (region-based convolutional neural networks) technique has been used. People who are visually impaired can get around with the help of a speech synthesiser and the objects that it detects.
Keywords: TensorFlow, Google speech-to-text (API), OpenCV, Raspberry Pi

1. Introduction
The world's millions of visually impaired people face new obstacles every day as they try to make sense of their surroundings. Visually impaired individuals face daily challenges in performing tasks such as deciphering product labels and locating the correct bus stop. The user initiates a process of object detection, allowing them to listen to the device's voice instructions to determine what an object is. As a result, a new way of thinking about empowering people with visual impairments to live independently has been born. Everyday living presents
1jmsvravikumar@gmail.com, 2m_babureddy@yahoo.com, 3Srikanth.mandela@gamil.com, 4drsrkrit@gmail.com

DOI: 10.1201/9781003529231-14
a number of challenges to those who are vision-impaired or blind. The goal is to develop an Android application called "Visually Assist" that helps the visually impaired. It will make specialised gadgets and other wearable tech unnecessary for object recognition tasks as users move around. The app's real-time item detection and identification capabilities allow the sight-impaired to move autonomously. For object detection, the app uses image processing techniques, and for voice output, it employs speech synthesis. The system's goal is to identify scanned things using the mobile camera and alert visually impaired users to their presence via audio. Computer vision has made great strides in the detection of moving objects in still photographs, and these advancements have found applications in a variety of settings, including homes, businesses, and factories. Current methods suffer from issues like poor accuracy and performance due to a lack of trained data, reliance on object motion, and the inability to distinguish between objects. So, to quickly and accurately recognise the item, the Fast R-CNN (region-based convolutional neural networks) technique has been used. Receiving the detected picture information through speech as a voice output helps the visually impaired with their movement. The second section of this report is dedicated to a comprehensive survey of the existing literature.
2. Literature Survey
In this research [1], we propose a unified mutual learning framework based on picture hierarchies to overcome the challenge of weakly supervised image co-segmentation. This framework incorporates structured sparsity and tree-graph matching. They zero in on how saliency and similarity, two characteristics shared by objects, interact with one another; focusing on just one of them is the norm for most current co-segmentation strategies. Using tree-graph matching, the suggested approach learns structured sparsity knowledge and can produce object-oriented, substantial results. At the same time, it aids in making tree-graph matching with the sparsity pattern simpler and takes up less space. We plan to use the geometrical connections between coherent things in a deliberate way. The experimental results show that the mutual learning framework can successfully delineate co-existing object patterns in numerous photos when compared to benchmark data sets. Shape conformability is the foundation of the object co-segmentation approach. Our suggested co-segmentation methodology is distinct from prior object co-segmentation methods, since it centres on the shape consistency of the foreground objects in the image set rather than the region feature similarity of the common objects. The suggested approach determines the common shape pattern in a group of images based on the appearance of the foreground objects, even when their shapes vary. Texts are automatically extractable and can be considered as the underlying structure preceding those poorly segmented photos. Our proposed approach is primarily concerned with initial GrabCut segmentation and shape mapping by coherence point drift registration. In order to evaluate the shape-based segmentation and to set a standard for future work, we constructed the Co-Shape data set. The shape data set testing and comparisons with related co-segmentation methods show that the approach performs beautifully.
3. Proposed System
Classification of Segmentation Difficulty [3]: When there is a clear difference between the foreground and background in an image, segmentation becomes much easier. Clear borders should separate the foreground and background in such photos, and each segment should include an object in its entirety in the foreground. The colour contrast of the area relative to the entire image, along with the weighted aggregate contributions from nearby regions, determines this
To begin, secure a Raspberry Pi 3 B+ kit to a blind stick (or, in a more advanced implementation, to a cap) and insert an SD card into the kit. Then, using a camera, objects can be detected by observing their displacement on the screen.

Fig. 14.1 Block diagram of the proposed system

As the block diagram shows, the detection output is converted to audio by an open program and transformed into speech with the help of the Google Speech API. This allows individuals with visual impairments to hear what is around them and move about on their own. Two primary components make up the proposed system: (1) an object identification module and (2) a voice feedback module.
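As a concrete illustration of this capture-detect-speak loop, here is a minimal sketch (not the chapter's exact implementation): the MobileNet-SSD model files named below are assumed stand-ins for whatever detector is actually deployed, and the offline pyttsx3 engine stands in for the Google Speech API client.

```python
# Hedged sketch: camera loop on a Raspberry Pi that detects objects with a
# pre-trained MobileNet-SSD (hypothetical files "MobileNetSSD_deploy.prototxt"
# and "MobileNetSSD_deploy.caffemodel") and speaks the detected labels aloud.
import cv2
import pyttsx3

LABELS = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle",
          "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse",
          "motorbike", "person", "pottedplant", "sheep", "sofa", "train",
          "tvmonitor"]

net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")
tts = pyttsx3.init()          # offline text-to-speech engine
cap = cv2.VideoCapture(0)     # Pi camera module or USB webcam

while True:
    ok, frame = cap.read()    # loop runs until the camera stops delivering frames
    if not ok:
        break
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()                 # shape: (1, 1, N, 7)
    for i in range(detections.shape[2]):
        confidence = float(detections[0, 0, i, 2])
        if confidence > 0.5:                   # announce confident detections only
            label = LABELS[int(detections[0, 0, i, 1])]
            tts.say(f"{label} ahead")
    tts.runAndWait()
cap.release()
```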
4. Methodology

Classification-based segmentation sorts voxels into specific classes and assigns labels based on a predetermined approach. Thresholding is the foundation of the most basic method: the thresholding algorithm determines a threshold value that distinguishes between the target classes. Within axial image slices, iterative thresholding [2] separates the various brain structures; iteratively adjusting the head and image masks based on their geometry, beginning with preset settings, produces the final masks. Despite its simplicity and computational speed, the thresholding approach is highly susceptible to INU artefacts and noise in the images. If there is a lot of noise, or if intensity agreement causes distinct tissue types' intensities to overlap significantly, then automatically determining an appropriate threshold can be difficult.

Statistical classification-based segmentation has recently replaced simpler thresholding in classification-based segmentation studies. With its strong mathematical roots in stochastic theory and its reputation for increased robustness, statistical classification is an attractive choice. One common parametric model in classification algorithms is a mixture of Gaussians representing the probability density function of tissue intensity for the various tissue classes. Using image regularisation, local contextual information can be incorporated. When estimating inhomogeneity and tissue classes with the EM technique, the bias field acts as a nuisance quantity in the resulting Bayesian framework; trained data must be used manually to generate the tissue-class conditional intensity models that are supplied to this procedure, and dependencies between tissue segments were disregarded [5].

An object's shape can also be defined by the space it typically occupies. For instance, we can assume that the image region associated with an object has comparable intensity levels, since its properties tend to be uniform. The basic idea behind this method is to identify distinct objects in a picture and then use that information to create homogeneous zones that represent those objects; the spatial interconnections between adjacent voxels are taken into explicit consideration. In its most basic form, finding the growth area is the first step. A number of seeds stand in for the objects that will eventually grow in different parts of the picture, and the image is covered once the seeds have grown. Consequently, a rule describing the growth mechanism and a rule checking the homogeneity of the regions at each growth phase govern the region-growing process. This region-growing approach to image segmentation has been implemented before: authors have created algorithms for semi-automatic, interactive picture segmentation and for segmenting lesions using a simple region-growing method. For picture segmentation, an algorithm has also been suggested that automatically grows statistical regions based on a robust estimate of the local region mean and variance for each voxel; ultimately, the optimal growing configuration is determined by minimising a cost. In addition, image segmentation has been improved by using relaxation labelling, region splitting, and constrained region merging. It is essential to choose a suitable homogeneity criterion.
5. Results

Consideration should be given to the procedure used for growing the regions; on the other hand, it can be challenging to specify such a homogeneity requirement in advance. A suggested remedy is an adaptive region-growing method that automatically learns the homogeneity criterion from the region's features while searching for it.

Fig. 14.2 Bottle and bowl with detection accuracy

Figure 14.2 demonstrates that the camera accurately recognised a bowl and coconut oil, and it also displays the distance at which each object was detected.

Fig. 14.3 Person and computer detection

Figure 14.3 displays the outcome of the camera's object detection (a person or a computer), along with the accuracy score that aids in determining proximity.

Fig. 14.4 Computer and mobile phone detection

Figure 14.4 shows the detected PC and mobile device with their accuracies; because the remaining information is also displayed in written form, the output is accessible.
6. Conclusion

Object detection has been a focus of research in recent years due to its learning ability and its advantages in problem handling, scale transformation, and background switching. This research presents a comprehensive analysis of object detection frameworks that modify R-CNN to address various issues, including low resolution and clutter. Generic object detection pipelines serve as the foundational structures for other related activities, and this review begins with them. Building on them, it is then crucial to address the regular tasks that are of particular significance.
References

1. Thomas Blaschke, "Object based image analysis for remote sensing," ISPRS Journal of Photogrammetry and Remote Sensing, vol. 65, no. 1, pp. 2–16, 2010.
2. Sreenath Rao Vantaram and Eli Saber, "Survey of contemporary trends in color image segmentation," Journal of Electronic Imaging, vol. 21, no. 4, p. 040901, 2012.
3. Liangliang Cao and Li Fei-Fei, "Spatially coherent latent topic model for concurrent segmentation and classification of objects and scenes," in Computer Vision (ICCV 2007), IEEE 11th International Conference on, IEEE, 2007, pp. 1–8.
4. Dorit S. Hochbaum and Vikas Singh, "An efficient algorithm for co-segmentation," in Computer Vision, 2009 IEEE 12th International Conference on, IEEE, 2009, pp. 269–276.
5. Armand Joulin, Francis Bach, and Jean Ponce, "Discriminative clustering for image co-segmentation," in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, IEEE, 2010, pp. 1943–1950.
6. Kumar, JMSV Ravi, B. Sujatha, and N. Leelavathi, "Automatic vehicle number plate recognition system using machine learning," IOP Conference Series: Materials Science and Engineering, vol. 1074, no. 1, IOP Publishing, 2021.
7. Parvathi, D. S. L., et al., "Emotion Analysis Using Deep Learning," 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), IEEE, 2020.
8. Dara, Suresh, et al., "Artificial bee Colony algorithm: a survey and recent applications," International Journal of Pure and Applied Mathematics, 120.6 (2018): 313–321.
9. Kumar, Dr. JMSV Ravi, and M. Chandini, "SECRBAC: Secure Data In The Clouds," International Journal of Research, 5.15 (2018): 95–106.
10. Estharakula, Suresh, and Kumar JMSV Ravi, "EBPH-MAC: Emergency Based Priority Hybrid Medium Access Control for Mobility Aware Cooperative WSN's In Indoor Industrial Monitoring," International Journal of Research, 5 (2018): 1456–1465.
11. Kumar, J. M. S. V., et al., "System Testability Assessment and testing with Micro architectures," International Journal of Advanced Research in Computer Science, 2.6 (2011).
12. Kumar, J. M. S. V., et al., "Reverse Engineering A Generic Software Exploration Environment Is Made Of Object Oriented Frame Work And Set Of Customizable Tools," International Journal of Advanced Research in Computer Science, 2.5 (2011).
13. Kumar, J. M. S. V., et al., "Analyzing the Modern Tool-Supported UML-Based Static Reverse Engineering," International Journal of Advanced Research in Computer Science, 3.4 (2012).
14. Kumar, J. M. S. V., et al., "Active Scrutiny Techniques for the Reconstruction of Architectural Views," International Journal of Advanced Research in Computer Science, 3.1 (2012).
15. N. Santha Raju, JMSV Kumar, B. Sujatha, "Time series analysis of stock price movements: Insights from data mining using machine learning," AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
16. Prayaga Atchyut Pavan, Sattibabu Sattibabu, JMSV Kumar, "A deep learning approach to detect malaria," AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
17. Ch. Bhanu Revathi, JMSV Kumar, B. Sujatha, "Intracranial hemorrhage detection in human brain using deep learning," AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
18. JMSV Ravi Kumar, "Human Activity Recognition using Machine Learning," AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
19. J. Kumar, A. Shahi, R. Aytha, G. Varri, D. Brundavanam, "Vehicle theft prevention system using IoT," AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
20. J. Kumar, T. D. Nagendra, M. Harshitha, A. B. Prakash, "Fake image detection using CNN," AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
21. J. Kumar, M. N. Kumar, N. V. Narendra, P. Pradeep, "Driver drowsiness monitoring system using machine learning SVM algorithm," AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
22. JMSV Ravi Kumar, "A Symmetric Searchable Encryption Identification of Data on Probabilistic Trapdoors," International Journal of Engineering and Advanced Technology (IJEAT), ISSN 2249-8958, vol. 9, issue 3, Blue Eyes Intelligence Engineering & Sciences Publication, 2020.
23. JMSV Ravi Kumar, "Artificial Bee Colony Algorithm: A Survey and Recent Applications," International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
24. JMSV Ravi Kumar, "Authentication for Cloud Services using Steganography," International Journal of Engineering and Technology (UAE), IJET, ISSN 2227-524X, vol. 7, issue 3.49, Jul. 2018.
25. JMSV Ravi Kumar, "A review on task scheduling algorithms in cloud computing and their approaches," International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
26. JMSV Ravi Kumar, "Review of Data mining Technique using SaaS on the Cloud," International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
27. JMSV Ravi Kumar, "Smart Controlling, Monitoring and Automation of Street Light System using Raspberry PI," International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
28. JMSV Ravi Kumar, "A Survey on Internet of Things for Healthcare and Medication Management," International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
29. JMSV Ravi Kumar, "SECRBAC: Secure Data in the Clouds," International Journal of Research, ISSN 2348-6848, vol. 5, issue 15, Jul. 2018.
30. JMSV Ravi Kumar, "EBPH MAC: Emergency Based Priority Hybrid Medium Access Control for Mobility Aware Cooperative WSN's In Indoor Industrial Monitoring," International Journal of Research, ISSN 2348-6848, vol. 5, issue 12, Jul. 2018.
31. JMSV Ravi Kumar, "Prioritizing software components for realistic reuse," International Journal of Sciences & Applied Research, ISSN 2394-2401, vol. 4, issue 24, Jul. 2017.
32. JMSV Ravi Kumar, "Cloud Storage Services and Privacy Protection," International Conference on Research Advancements in Computer Science and Communication, ISBN 978-93-85100-64-2, vol. 5, issue 3.49, Dec. 2016.
33. JMSV Ravi Kumar, "Analyzing the Modern Tool-Supported UML-Based Static Reverse Engineering," International Journal of Advanced Scientific Research and Technology, ISSN 0976-5697, vol. 3, issue 4, Jul. 2012.
34. JMSV Ravi Kumar, "Active Scrutiny Techniques for the Reconstruction of Architectural Views," International Journal of Advanced Scientific Research and Technology, ISSN 0976-5697, vol. 3, issue 1, Jan. 2012.
35. JMSV Ravi Kumar, "System Testability Assessment and testing with Micro architectures," International Journal of Advanced Scientific Research and Technology, ISSN 0976-5697, vol. 2, issue 6, Dec. 2011.
36. JMSV Ravi Kumar, "Reverse Engineering A Generic Software Exploration Environment is made of Object-Oriented Frame Work and Set of Customizable Tools," International Journal of Advanced Scientific Research and Technology, ISSN 0976-5697, vol. 2, issue 5, Sep. 2011.
37. M. Srikanth, "Integrated Technologies for Proactive Bridge-Related Suicide Prevention," Journal of Namibian Studies, vol. 1, issue 33, pp. 2117–2136, ISSN 1863-5954, Sep. 2023. [Scopus]
38. M. Srikanth, "Deep Learning Approaches for Predictive Modeling and Optimization of Metabolic Fluxes in Engineered Microorganism," International Journal of Research in Science & Engineering (IJRISE), ISSN 2394-8299, 3(05), 1–11, https://doi.org/10.55529/ijrise.35.1.11, Jul. 2023.
39. M. Srikanth, "Tackling Outliers for Predictive Smallholder Farming Analysis," in Proceedings of the 2023 3rd International Conference on Smart Data Intelligence (ICSMDI), pp. 93–98, IEEE Xplore, March 26, 2023. [Scopus]
40. M. Srikanth, "Blockchain-Based Consensus For A Secure Smart Agriculture Supply Chain," European Chemical Bulletin, vol. 12, special issue 4, pp. 8669–8678, doi: 10.48047/ecb/2023.12.si4.776, ISSN 2063-5346, 2023. [Scopus]
41. M. Srikanth, "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms," Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 3, issue 03, pp. 14–26, ISSN 2799-1172, Apr. 2023.
42. M. Srikanth, R. N. V. Jagan Mohan, M. Chandra Naik, "A New Way to Improve Crop Quality and Protect the Supply Chain is to use a Trajectory Network and Game Theory," Mathematical Statistician and Engineering Applications, 71(4), 10600–10610, https://doi.org/10.17762/msea.v71i4.1952, ISSN 2094-0343, 2023. [Scopus]
43. M. Srikanth, "Auction Algorithm: Peer-To-Peer System Based on Hybrid Technologies for Smallholder Farmers to Control Demand and Supply," International Journal of Research In Science & Engineering (IJRISE), vol. 3, issue 1, pp. 9–23, 2023.
44. M. Srikanth, "Smallholder Farmers Crop Registering Privacy-Preserving Query Processing over Ethereum Blockchain," Journal of Pharmaceutical Negative Results, vol. 13, issue 7, pp. 5609–5617, Dec. 2022. [Scopus]
45. M. Srikanth, "The Early Detection of Alzheimer's Illness Using Machine Learning and Deep Learning Algorithms," Journal of Pharmaceutical Negative Results, vol. 13, issue 9, pp. 4852–4859, Nov. 2022. [Scopus]
46. M. Srikanth, "Small Holders Farming Predictive Analysis Using Peer-To-Peer Approach," International Journal of Agriculture and Animal Production, vol. 2, issue 05, pp. 26–37, Sep. 2022.
47. M. Srikanth, "Using Machine Learning and Neural Networks Technologies, a Bottom-Up Water Process Is Being Used To Reduce All Water Pollution Diseases," Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 2, Oct. 2022.
48. M. Srikanth, "Blockchain Enable for Smallholder's Farmers Crop Transaction Using Peer-to-Peer," Indo-American Journal of Agricultural and Veterinary Sciences, vol. 10, issue 3, pp. 33–43, Sep. 2022.
49. M. Srikanth, "Protecting Tribal Peoples Nearby Patient Care Centres Use a Hybrid Technique Based on a Distribution Network," International Journal of Health Sciences, Jun. 2022. [Scopus]
50. M. Srikanth, "Blockchain-Based Crop Farming Application Using Peer-to-Peer," Journal of Xidian University, Apr. 2022.
51. M. Srikanth, "Stop Spread Corona Based on Voice, Face and Emotional Recognition Using Machine Learning, Query Optimization and Blockchain Technology," Solid State Technology, vol. 63, no. 6, 2020. [Scopus]
52. M. Srikanth, "Machine Learning for Query Processing System and Query Response Time Using Hadoop," IJMTST, Aug. 2020.
53. M. Srikanth, "Block-level Based Query Data Access Service Availability for Query Process System," IEEE, pp. 1–9, Jul. 2020. [Scopus]
54. M. Srikanth, "Query Response Time in Blockchain Using Big Query Optimization," in The Role of IoT and Blockchain: Techniques and Applications from Computer Science and Information Management, Apple Academic Press, exclusive worldwide distribution by CRC Press, Taylor & Francis Group, Jan. 2022. [Scopus]
55. M. Srikanth, "A New Approach for Authorship Verification Using Information Retrieval Features," Springer-ICSE, vol. 74, pp. 23–29. [Scopus]
56. M. Srikanth, "An Enhanced and Naive Clustering Algorithm for Text Classification Based on Weight," International Journal & Magazine of Engineering, Technology, Management and Research, Dec. 2012.
Note: All the figures in this chapter were designed by the author.

15. Deep Learning Approach for Early Detection and Diagnosis of Teenager Interstitial Lung Disease

Ramesh Alladi*
Associate Professor of CSE, ACE Engineering College, Hyderabad, India
R. N. V. Jagan Mohan
Associate Professor of CSE, SRKR Engineering College, Bhimavaram, India
K. V. Ramana
Professor of CSE & Rector, JNTUK, Kakinada, India

Abstract: This paper discusses a patient-centric, multidisciplinary approach to the early identification, guideline-based risk assessment, subsequent diagnosis, and patient engagement and education strategies for Childhood Interstitial Lung Disease (chILD). The main objective is to outline the presenting symptoms, risk factors, diagnostic testing methods, disease monitoring, progression, and treatment of chILD disorders; to describe strategies for increasing disease awareness and recognition among healthcare providers; and to implement a patient-centric, multidisciplinary approach to the early identification, diagnosis, and care of infants and children with chILD based on clinical practice guidelines. We also describe a proactive patient engagement and education strategy for the parents and/or caregivers of chILD patients that facilitates diagnosis, treatment, and monitoring, while focusing on the unique needs of diverse individuals and those facing healthcare disparities. The study explores various methods for analysing lung disease images, including fuzzy chest X-ray image segmentation, knowledge distillation-based imaging, probe-method-based feature selection, and VGG16 model-based classification. In our trial, deep learning was applied to identify and diagnose lung illnesses in teenagers with high accuracy.
Keywords: Pediatric pulmonologists, Pediatric rheumatologists, Pediatricians, Pathologists, Nurse practitioners, Physician
associates

1. Introduction

Worldwide, lung cancer is the primary cause of mortality from cancer, with poor prognosis due to late-stage diagnosis and heterogeneous imaging features, making the selection of the optimal course of treatment difficult for clinicians (B. Bhinder, 2021 [6]). Lung cancer imaging features range from small nodules to complex histopathological types. Treatment options depend on clinical staging, histopathology, and genomic features. In the age of precision medicine, doctors need to gather all the information before choosing chemotherapy, targeted therapy, or immunotherapy, which can be combined with surgery or radiotherapy (Wilson R, 2017 [11]). Based on clinical studies and physician expertise, clinicians look for a model for disease detection, categorization, and prediction. Current practice relies on repeated readings of images and charts, which consumes time; AI (artificial intelligence) could make this procedure simpler. AI is a data-driven approach that uses a dataset, a pretreatment technique, a predictive model, and a trained model to predict or classify objects. The subfield called machine learning (ML) uses Bayesian networks, SVMs, and decision trees to solve problems without the need for explicit programming (Akitoshi Shimazaki, 2022 [3]).
*Corresponding author: rameshalladi@gmail.com

DOI: 10.1201/9781003529231-15

Developing complicated prediction models requires significant computational power, which has been a challenge in the past. However, massive calculations are now simpler thanks to software optimization and semiconductor advancements (Yoo, 2020 [13]). Deep learning models are widely employed in both commercial and scientific domains since they have outperformed standard models; compared to logistic regression or linear regression, these approaches allow for more complicated models. With an emphasis on its heterogeneity and its uses in lung nodule identification, diagnostics, disease risk assessment, medication development, and prognosis prediction, this article discusses AI applications in lung cancer (Yawei Li, 2022 [12]). In addition to discussing clinical procedures such as screening, diagnosis, decision-making, and prognosis prediction, it presents AI models.

More than half of individuals with lung cancer undergo resection, while about 7% are asymptomatic. Blood tests, breath testing, sputum cytology, and imaging are among the screening techniques. The only technique that has been shown to detect lung cancer earlier and increase patient survival is low-dose computed tomography (Ueda, 2019 [10]). When images blur and human eyes grow tired, artificial intelligence (AI) can help with repetitive image-reading operations and with errors in interpreting a chest X-ray (CXR). The accuracy of pulmonary nodule prediction on CT and CXR scans has increased thanks to AI-based programmes (Raghu, 2021 [9]), enhancing the sensitivity of radiologists and lowering false negative rates. AI continues to be integrated into lung cancer testing.

The remainder of this paper is organised as follows. Section 1 gives a general introduction to lung cancer. Section 2 covers the proposed work, including fuzzy chest X-ray lung cancer image segmentation and lung disease imaging using knowledge distillation. Section 3 covers feature selection for early lung cancer detection using the probe method. Section 4 covers lung disease imaging using the VGG16 model. Section 5 describes the experimental results. Section 6 contains the conclusion and future perspective. Section 7 lists the references.

2. Proposed Work

Fuzzy-based approaches such as fuzzy thresholding, rule-based inference, and fuzzy integral-based decision making are used to improve chest X-ray image segmentation performance, and an optimization problem is used to address parameter initialization in the lung cancer problem using a deep learning technique (K. A. Tran, 2021 [8]). Deep learning is then applied to the early detection and diagnosis of adolescent interstitial lung disease.

2.1 Fuzzy Chest X-Ray Lung Cancer Image Segmentation

Fuzzy approaches to lung cancer image segmentation highlight a diverse set of hypothetical mechanisms that offer potential for new segmentation techniques (Adak AK, 2011 [1]). Membership in a pixel class can be understood as compatibility or resemblance to an ideal object or a certain attribute. Fuzzy if-then rules, as presented by Adak AK, 2012 [2], can be used to segment a picture into discrete sections; for instance, they can be used to decide whether a red or yellow-dark pixel belongs in the background if its homogeneous neighborhood is also red or yellow-dark. Fuzzy integrals are utilized in segmentation by weighting features, fusing algorithms' results, and fusing sensors, with the integral representing the importance of each sensor. Picture information metrics such as fuzzy split and fuzzy probability can be applied to segmentation and thresholding tasks. To optimize for crisp or fuzzy pixel classifications, fuzzy mathematical measurements such as fuzzy tightness and index of area coverage can be used to quantify the fuzzy nature of a picture. The proposed work develops a new thresholding technique using image fuzziness: a membership function is moved pixel by pixel over the gray levels, the fuzziness is calculated at each position, and the position with the minimum fuzziness is taken as a suitable threshold (Amal Kumar Adak, 2021 [5]). Figure 15.1 demonstrates the use of minimum fuzziness detection as a tool for threshold selection.

Fig. 15.1 Least fuzziness detection as a tool for threshold selection
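A minimal sketch of this minimum-fuzziness threshold search is given below (an illustration of the idea, not the exact formulation of [5]): each candidate threshold assigns grey levels a membership to their own class based on distance from the class mean, and the threshold with the lowest overall index of fuzziness is selected.

```python
# Minimum-fuzziness threshold search over an 8-bit grayscale image.
import numpy as np

def min_fuzziness_threshold(gray):
    # gray: 2-D uint8 array
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    levels = np.arange(256, dtype=float)
    best_t, best_fuzz = 0, np.inf
    for t in range(1, 255):
        w0, w1 = hist[:t].sum(), hist[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (hist[:t] * levels[:t]).sum() / w0    # background class mean
        m1 = (hist[t:] * levels[t:]).sum() / w1    # foreground class mean
        # membership of each grey level to its own class (1.0 = fully crisp)
        mu = np.where(levels < t,
                      1.0 / (1.0 + np.abs(levels - m0) / 255.0),
                      1.0 / (1.0 + np.abs(levels - m1) / 255.0))
        # linear index of fuzziness; minimal when memberships are crisp
        fuzz = 2.0 * (hist * np.minimum(mu, 1.0 - mu)).sum() / hist.sum()
        if fuzz < best_fuzz:
            best_t, best_fuzz = t, fuzz
    return best_t
```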

Fig. 15.2 Lung cancer test image
Fig. 15.3 Threshold by fuzzy method on a lung cancer image

Fig. 15.4 Threshold by the OTSU algorithm

Commonly used in medical imaging, the chest X-ray (CXR) offers a comprehensive thoracic assessment with a radiation dosage of about 0.1 millisieverts (mSv). Since the 1960s, computer-aided diagnosis (CAD) systems have been developed; with the advent of radiomics, computers can now directly analyse images, expanding the definition of image features and enabling higher-dimensional data. To get reliable radiomics data, image intensification is essential. With an AUROC and AUC of 0.78 and 0.87, respectively, CheXNet, a radiologist-level system trained on chest X-rays, outperformed radiologist performance in detecting pulmonary illnesses, demonstrating the considerable power of deep learning methods in image analysis.

In one illustrative case, the radiograph was interpreted as normal by the thoracic radiologist at the initial reading, but the AI identified possible lung cancer. At the second reading the radiologist revised the decision and noted lung cancer in the region where the right hemidiaphragm intersected. Contrast-enhanced chest CT scans confirmed the mass as an invasive mucinous adenocarcinoma.

2.2 Lung Disease Image Using Knowledge Distillation

Knowledge distillation is a compression mechanism that has been studied in unimodal contexts, but its applicability in multimodal contexts remains underexplored. CLIP, a language-image cross-modal model, presents unique challenges due to its bifurcated structure: CLIP models require extensive pretraining on millions of image-text pairings, posing a challenge for distillation due to resource constraints. We consider a cross-modal distillation methodology for CLIP that uses two techniques: affinity mimicking and weight inheritance. This method, unlike other methods based on raw image or text features, uses cosine similarity to facilitate student model distillation. Weight inheritance enhances distillation efficiency by transmitting pretrained weights from teacher models to their student analogues, expediting the distillation trajectory. Manual and automatic inheritance methodologies are introduced, with manual selection yielding commendable results for CLIP distillation.

Learnable masks are used to independently identify key weights from the teacher model across the vision and language branches, thereby recognizing differences across modalities. Weight inheritance is a multi-stage process in which each stage inherits essential weights from previous ones. Improved outcomes occur when teacher model performance and architectural similarity are maintained, preventing architectural disparities.
Fig. 15.5 A 16-year-old male patient’s chest radiographs showed diagnostic accuracy due to AI
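The affinity-mimicking idea can be sketched as follows (a hedged illustration, not the exact published method): the student's cosine-similarity image-text affinity matrix is trained to match the teacher's, with a temperature tau that is an assumed hyperparameter.

```python
# Affinity mimicking for CLIP-style distillation: match the student's
# image-text similarity distribution to the teacher's in both directions.
import torch
import torch.nn.functional as F

def affinity(image_emb, text_emb):
    # cosine-similarity affinity matrix between batches of images and texts
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    return image_emb @ text_emb.t()

def affinity_mimicking_loss(student_img, student_txt,
                            teacher_img, teacher_txt, tau=0.02):
    s = affinity(student_img, student_txt) / tau
    with torch.no_grad():                      # teacher is frozen
        t = affinity(teacher_img, teacher_txt) / tau
    loss_i2t = F.kl_div(F.log_softmax(s, dim=1), F.softmax(t, dim=1),
                        reduction="batchmean")
    loss_t2i = F.kl_div(F.log_softmax(s.t(), dim=1), F.softmax(t.t(), dim=1),
                        reduction="batchmean")
    return 0.5 * (loss_i2t + loss_t2i)
```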
Fig. 15.6 Lung cancer knowledge distillation

3. Feature Selection of Lung Disease Image Using the Probe Method

A key method in the development of machine learning (ML) for lung cancer diagnosis is feature selection, which aims to strike a balance between speed, model size, and performance, improving performance while minimizing size and performance degradation (Amina Benkessirat, 2020 [4]). The procedure works as follows:

Step 1) Add a random feature (lung image noise).
Step 2) Train a model on the new dataset.
Step 3) Measure feature importance.
Step 4) Discard original features that rank below the random feature.
Step 5) Repeat until convergence.

If a feature's importance is ranked below that of a random (noise) feature, it is possibly a useless feature for the model (B. Venkatesh, 2019 [7]); a minimal code sketch of this loop is shown below.

Fig. 15.7 Probe method: A reliable feature selection technique in machine learning
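The following sketch implements the probe loop over a generic feature matrix X (e.g., flattened lung-image descriptors); the random-forest estimator and the round limit are illustrative assumptions, not fixed parts of the method.

```python
# Probe-based feature selection: features whose importance falls below a
# random noise "probe" column are discarded, and the loop repeats.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def probe_feature_selection(X, y, max_rounds=10, random_state=0):
    rng = np.random.default_rng(random_state)
    keep = np.arange(X.shape[1])                      # surviving feature indices
    for _ in range(max_rounds):
        noise = rng.normal(size=(X.shape[0], 1))      # Step 1: random probe
        X_aug = np.hstack([X[:, keep], noise])
        model = RandomForestClassifier(n_estimators=200,
                                       random_state=random_state)
        model.fit(X_aug, y)                           # Step 2: train
        imp = model.feature_importances_              # Step 3: importances
        probe_imp = imp[-1]                           # importance of the probe
        survivors = keep[imp[:-1] > probe_imp]        # Step 4: discard weak ones
        if len(survivors) == len(keep):               # Step 5: converged
            break
        keep = survivors
    return keep
```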

4. Lung Disease Image Using VGG16 Model

The VGG model, commonly known as VGGNet, is here referred to as VGG16: a 16-layer convolutional neural network (CNN) model proposed and published in the paper Very Deep Convolutional Networks for Large-Scale Image Recognition (Zisserman, 2014 [14]).

AlexNet differs from previous high-performing models by using an 11x11 receptive field with a 4-pixel stride; VGG instead stacks 3x3 filters to obtain a larger effective receptive field. Combining several smaller layers interleaved with non-linear activation layers, instead of a single large layer, improves decision functions and network convergence.

VGG uses the smallest filters capable of capturing spatial features in images, the 3x3 convolutional filter, which also reduces overfitting during training.

VGG16 is a 16-layer deep neural network with many parameters, attracting attention due to its simplicity and its incorporation of the essential convolutional-neural-network elements.
Fig. 15.8 Architecture of VGG

A VGG network is composed of small convolution filters; VGG16 has 13 convolutional layers in addition to three fully connected layers. The VGG architecture consists of multiple layers of interconnected components:

Input: VGGNet, a model that competed in the ImageNet competition, crops a 224x224 patch from the centre of each image to preserve a consistent input size.

Convolutional layers: VGG's convolutional filters use the smallest possible receptive field, 3x3. VGG additionally applies a 1x1 convolution filter to the input to accomplish a linear transformation.

ReLU activation: With a convolution stride of one pixel, the Rectified Linear Unit activation function (ReLU) introduced with AlexNet minimizes training time by passing positive inputs through unchanged and outputting zero for negative inputs.

Hidden layers: ReLU is used in the hidden layers of the VGG network, which improves accuracy and uses less memory and training time than AlexNet's Local Response Normalization.

Pooling layers: Following the convolutional layers, pooling layers decrease feature-map dimensions and parameter count, which is important given the quickly expanding number of filters in subsequent layers.

Fully connected layers: Three interconnected layers, the first two with 4096 channels each, complete the VGGNet architecture; the third layer has 1000 channels, one for each class.

5. Experimental Result

Using a data collection of chest pictures, we applied deep learning methods such as VGG16 (a 16-layer convolutional neural network) for the early identification and detection of lung disease. With an image size of [224, 224], the data set is divided into three categories: training, testing, and validation. There are 14,764,866 (56.32 MB) total parameters, of which 14,714,688 (56.13 MB) are non-trainable and 50,178 (196.01 KB) are trainable.

Fig. 15.9 Plotting the training and validation loss

Fig. 15.10 Plotting the training and validation accuracy
We imported the Image Data Generator, computed the loss and accuracy using the training data, and then verified using the test data. Finally, we computed and plotted the accuracy and loss, as seen in Figs 15.9 and 15.10. In our trial, the early identification and diagnosis of lung illness in teenagers using deep learning was carried out with an accuracy of 0.9054 and a loss of 0.0813.
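A plausible reconstruction of this training setup is sketched below; the directory paths, class count, and epoch count are placeholders, but the frozen VGG16 base reproduces the 14,714,688 non-trainable parameters, and a Flatten followed by a 2-way Dense head gives exactly the 50,178 trainable parameters reported above.

```python
# Hedged sketch of the VGG16 transfer-learning experiment described above.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                       # 14,714,688 non-trainable parameters

model = models.Sequential([
    base,
    layers.Flatten(),                        # 7 x 7 x 512 feature map
    layers.Dense(2, activation="softmax"),   # 25088*2 + 2 = 50,178 trainable
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

gen = ImageDataGenerator(rescale=1.0 / 255)
train = gen.flow_from_directory("data/train", target_size=(224, 224),
                                batch_size=32, class_mode="categorical")
test = gen.flow_from_directory("data/test", target_size=(224, 224),
                               batch_size=32, class_mode="categorical")

history = model.fit(train, validation_data=test, epochs=10)
# history.history["loss"] and ["accuracy"] supply the curves of Figs 15.9-15.10
```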
6. Conclusion and Future Perspective

Fuzzy-based approaches such as fuzzy thresholding, rule-based inference, and fuzzy integral-based decision making were used to improve chest X-ray image segmentation performance, an optimization step was used to address parameter initialization for the lung cancer problem, and a feature selection method was applied. The deep learning technique was then utilized for early detection and diagnosis of teenager interstitial lung disease. AI applications in lung cancer could improve by integrating small datasets into large training sets. Federated learning, a data-driven method, can overcome data-sharing regulations by sharing trained parameters between hospitals, ensuring that the main server never directly touches raw data. While earlier studies concentrated on distinct domains, the integration of clinical data, imaging, pathology, and demography using both old and new technologies could better reflect reality and support predictive models, promoting multi-omics, multidisciplinary teams, and "Medomics" in future clinical care for lung cancer. The application of AI programs in lung cancer is still rare due to barriers such as user interfaces, data analysis speed, and resource consumption. To fully realize AI-assisted clinical workflows, more infrastructure is needed, including increased training sample sizes and multidisciplinary integration.

References

1. Adak AK, Bhowmik M. (2011). Interval cut-set of interval-valued Intuitionistic fuzzy sets. African Journal of Mathematics and Computer Science Research, 4(4): 192–200.
2. Adak AK, Bhowmik M, Pal M. (2012). Semi-ring of generalized interval-valued Intuitionistic fuzzy matrices. World Applied Sciences Journal, 16: 07–16.
3. Akitoshi Shimazaki, Daiju Ueda, Antoine Choppin, Akira Yamamoto, Takashi Honjo, Yuki Shimahara & Yukio Miki. (2022). Deep learning-based algorithm for lung cancer detection on chest radiographs using the segmentation method. Scientific Reports, vol. 12, Article number 727.
4. Amina Benkessirat, Nadjia Benblidia. (2020). Fundamentals of Feature Selection: An Overview and Comparison. IEEE Xplore, 16 March 2020. Electronic ISBN 978-1-7281-5052-9, ISBN 978-1-7281-5053-6, INSPEC Accession Number 19454609, DOI: 10.1109/AICCSA47632.2019.9035281.
5. Amal Kumar Adak & Davood Darvishi Salookol. (2021). Some Properties of Rough Pythagorean Fuzzy Sets. Fuzzy Information and Engineering, vol. 13, issue 4, pp. 420–435. Published online 02 Sep 2021.
6. B. Bhinder, C. Gilvary, N. S. Madhukar, O. Elemento. (2021). Artificial intelligence in cancer research and precision medicine. Cancer Discovery, 11, pp. 900–915.
7. B. Venkatesh. (2019). A Review of Feature Selection and Its Methods. Cybernetics and Information Technologies, 19(1): 3. DOI: 10.2478/cait-2019-0001.
8. K. A. Tran, O. Kondrashova, A. Bradley, E. D. Williams, J. V. Pearson, N. Waddell. (2021). Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Medicine, 13, p. 152.
9. Raghu, V. K. et al. (2019). Feasibility of lung cancer prediction from low-dose CT scan and smoking factors using causal models. Thorax, 74, 643–649. https://doi.org/10.1136/thoraxjnl-2018-212638.
10. Ueda, D., Shimazaki, A. & Miki, Y. (2019). Technical and clinical overview of deep learning in radiology. Japanese Journal of Radiology, 37, 15–33. https://doi.org/10.1007/s11604-018-0795-3.
11. Wilson R, Devaraj A. (2017). Radiomics of pulmonary nodules and lung cancer. Translational Lung Cancer Research, 6: 86–91. DOI: 10.21037/tlcr.2017.01.04.
12. Yawei Li, Xin Wu, Ping Yang, Guoqian Jiang, Yuan Luo. (2022). Machine Learning for Lung Cancer Diagnosis, Treatment, and Prognosis. Genomics, Proteomics & Bioinformatics, vol. 20, issue 5, pp. 850–866.
13. Yoo, H., Kim, K. H., Singh, R., Digumarthy, S. R. & Kalra, M. K. (2020). Validation of a deep learning algorithm for the detection of malignant pulmonary nodules in chest radiographs. JAMA Network Open, 3, e2017135. https://doi.org/10.1001/jamanetworkopen.2020.17135.
14. Zisserman et al. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. Visual Geometry Group Lab, Oxford University.

Note: All the figures in this chapter were designed by the author.

16. Robust Object Detection in Medical Imaging: Cross-Measure Refinement with Edge Detection and SSD

Bhanurangarao M.1
Research Scholar, Department of Computer Science and Engineering, Saveetha School of Engineering,
Saveetha Institute of Medical and Technical, Sciences, Chennai, Tamil Nadu, India
Mahaveerakannan R.2
Associate Professor, Department of Computer Science and Engineering, Saveetha School of Engineering,
Saveetha Institute of Medical and Technical, Sciences, Chennai, Tamil Nadu, India

Abstract: Object detection in medical imaging is a challenging task due to the inherent variability and complexity of medical
images. Medical objects can exhibit significant viewpoint variation, deformation, occlusion, and intra-class variation.
Additionally, illumination conditions can vary significantly, further complicating the detection process. This research proposes
a novel approach to object detection in medical imaging that integrates cross-measure refinement, edge detection, and the Single
Shot MultiBox Detector (SSD) architecture. Cross-measure refinement allows the model to robustly recognize and localize
objects across various viewpoints. Edge detection techniques are used to account for deformations and ensure accurate object
detection even under extreme variations. The SSD framework enables the system to identify objects with only partial visibility,
enhancing diagnostic precision. The proposed system has been evaluated on diverse medical image datasets, including X-rays,
MRIs, and CT scans. The results demonstrate a significant improvement in detection accuracy, even in challenging scenarios,
while maintaining real-time processing capabilities. The proposed research contributes to more reliable diagnoses and improved
patient care and medical outcomes by enhancing object detection in medical imaging. This work paves the way for the broader
adoption of object detection in healthcare and underscores the potential impact of combining cross-measure refinement, edge
detection, and the SSD framework in medical image analysis.
Keywords: Object detection, Medical imaging, Cross-measure refinement, Edge detection, Single shot MultiBox detector
(SSD), Real-time processing

1. Introduction

Object detection in medical imaging is a challenging task due to the inherent variability and complexity of medical images. Medical objects can exhibit significant viewpoint variation, deformation, occlusion, and intra-class variation. Additionally, illumination conditions can vary significantly, further complicating the detection process. Accurate object detection in medical imaging is essential for a variety of clinical applications, such as cancer diagnosis, surgical planning, and treatment monitoring. However, conventional object detection algorithms often struggle with the challenges posed by medical images. This research proposes a novel approach to object detection in medical imaging that integrates cross-measure refinement, edge detection, and the Single Shot MultiBox Detector (SSD) architecture. Cross-measure refinement allows the model to robustly recognize and localize objects across various viewpoints. Edge detection techniques are used to account for deformations and ensure accurate object detection even under extreme variations. The SSD framework enables the system to identify objects with only partial visibility, enhancing diagnostic precision.

1bhanuswrn@gmail.com, 2mahaveerakannanr.sse@saveetha.com

DOI: 10.1201/9781003529231-16
The proposed system has been evaluated on diverse medical image datasets, including X-rays, MRIs, and CT scans. The results demonstrate a significant improvement in detection accuracy, even in challenging scenarios, while maintaining real-time processing capabilities. This research is important because it addresses several critical challenges inherent to medical image analysis and achieves state-of-the-art results on diverse medical image datasets. The proposed system has the potential to improve patient care and medical outcomes by enabling more reliable diagnoses and more efficient clinical workflows.

2. Related Work

Object detection in medical imaging is a challenging task due to the inherent variability and complexity of medical images. Medical objects can exhibit significant viewpoint variation, deformation, occlusion, and intra-class variation. Additionally, illumination conditions can vary significantly, further complicating the detection process.

Early approaches to object detection in medical imaging were based on handcrafted features and traditional machine learning algorithms. However, these approaches often struggled to achieve high accuracy in the presence of the challenges mentioned above. In recent years, deep learning-based approaches have shown great promise for object detection in medical imaging. Deep learning algorithms are able to learn complex patterns in data, making them well-suited for handling the variability and complexity of medical images. A number of different deep learning architectures have been proposed for object detection in medical imaging. Some popular architectures include the following. CNNs are a type of deep learning architecture that is well-suited for image processing tasks; they have been used to achieve state-of-the-art results on a variety of object detection benchmarks, including medical image datasets. R-CNNs build on the success of CNNs by adding a region proposal stage and have been shown to achieve high accuracy on object detection tasks, including medical image detection. SSD is a type of CNN that can perform object detection in a single forward pass; it is known for its speed and accuracy, making it a good choice for real-time object detection applications. Despite the recent advances in deep learning-based object detection, there are still a number of challenges that need to be addressed.
Fig. 16.1 AI-based breast cancer X-ray image detection using generative adversarial attacks
One challenge is the lack of large, publicly available medical image datasets, which can make it difficult to train deep learning models for object detection in medical imaging. Another challenge is the need for real-time object detection algorithms for medical applications; such algorithms can be used to assist clinicians with tasks such as image-guided surgery and interventional radiology.

Cross-measure refinement has previously been used to improve the robustness of object detection models to viewpoint variation. For example, the paper "Cross-Measure Refinement for Viewpoint Robust Object Detection" (2020) proposes a cross-measure refinement module that learns to aggregate the predictions of multiple detectors trained on different views of the same object. Edge detection has also been used to improve the robustness of object detection models to deformations and occlusions. For example, the paper "Edge-Guided Object Detection in Partially Occluded Images" (2021) proposes an edge-guided object detection framework that uses edge information to refine the predictions of a conventional detector. The SSD architecture is a popular object detection architecture known for its speed and accuracy; the paper "SSD: Single Shot MultiBox Detector" (2016) proposes the SSD architecture, which uses a single convolutional neural network to predict the bounding boxes and class labels of objects in an image.

3. Proposed Work

The proposed approach to object detection in medical imaging, Robust Object Detection in Medical Imaging: Cross-Measure Refinement with Edge Detection and SSD, is novel and promising in several ways.

First, it addresses several critical challenges inherent to medical image analysis. Viewpoint variation: medical objects can exhibit significant variability in appearance as they are observed from different angles, and conventional detectors often struggle with this variance; the incorporation of cross-measure refinement allows the proposed model to robustly recognize and localize objects across various viewpoints. Deformation: non-rigid structures in medical images, such as tissues and organs, can undergo extreme deformations; the proposed approach leverages edge detection techniques to account for deformations and ensure accurate object detection even under extreme variations. Occlusion: medical images frequently contain objects that are partially occluded, posing an additional challenge for detection; by integrating the SSD framework, the proposed system can identify objects with only partial visibility, enhancing diagnostic precision. Illumination conditions: variations in illumination can lead to alterations in object appearance, potentially hindering detection accuracy; the proposed methodology accounts for varying lighting conditions and ensures that object recognition remains reliable across different lighting scenarios. Intra-class variation: medical objects often encompass a broad range of shapes and appearances; the combination of cross-measure refinement and SSD enables the proposed model to distinguish between diverse instances within the same object class, enhancing detection performance.

Second, the proposed system has been evaluated on diverse medical image datasets, including X-rays, MRIs, and CT scans. The results demonstrate a significant improvement in detection accuracy, even in challenging scenarios, while maintaining real-time processing capabilities. This is an important consideration for practical deployment in healthcare settings. Third, the proposed research contributes to more reliable diagnoses and improved patient care and medical outcomes by enhancing object detection in medical imaging. This work paves the way for the broader adoption of object detection in healthcare and underscores the potential impact of combining cross-measure refinement, edge detection, and the SSD framework in medical image analysis.

Fig. 16.2 Architecture for object detection in medical imaging


Object Detection in Medical Imaging: Enhancing Robustness with Deep Learning. The loss function measures the difference between the predicted bounding boxes and the ground-truth boxes. One common loss function used in object detection is the Smooth L1 loss (also known as the Huber loss):

Smooth L1 Loss = Σ SmoothL1(Δx, Δy, Δw, Δh)

where Δx, Δy, Δw, and Δh are the differences between the predicted and ground-truth bounding box coordinates and dimensions. In object detection, the following metrics are used to evaluate model performance. Precision (P) is the ratio of true positive predictions to all positive predictions:

Precision (P) = TP/(TP + FP)

Recall (R) is the ratio of true positive predictions to all actual positives:

Recall (R) = TP/(TP + FN)

The F1 score is the harmonic mean of precision and recall, providing a balanced evaluation metric:

F1 Score = 2 * (P * R)/(P + R)

IoU measures the overlap between the predicted and ground-truth bounding boxes; it is used for non-maximum suppression (NMS) and as a criterion for deciding whether a detection is correct:

IoU = (Area of Intersection)/(Area of Union)

Object detection models predict bounding boxes (x, y, width, height) and class probabilities. The predicted bounding box can be represented as:

Predicted Box (x, y, w, h) = (σ(tx), σ(ty), exp(tw), exp(th))

where tx, ty, tw, and th are predicted offsets and σ is the sigmoid function. These are fundamental mathematical components used in object detection with deep learning; the specifics may vary depending on the model architecture and loss function used.

Table 16.1 Performance evaluation metrics

Parameter: Value
Loss function: Smooth L1 loss (Huber loss)
Smooth L1 loss terms: Δx, Δy, Δw, Δh
Precision (P): TP/(TP + FP)
Recall (R): TP/(TP + FN)
F1 score: 2 * (P * R)/(P + R)
Intersection over Union: (Area of Intersection)/(Area of Union)
Predicted box transformation: (x, y, w, h) = (σ(tx), σ(ty), exp(tw), exp(th))
Sigmoid function (σ): applied to the predicted offsets (tx, ty)
Exponential function (exp): applied to the predicted dimensions (tw, th)

Real-Time Object Detection in Medical Imaging: Improving Precision with Cross-Measure Refinement and Edge Detection. Throughout the implementation process, it is crucial to collaborate with domain experts in medical imaging to ensure that the system aligns with clinical requirements and enhances patient care. Ethical considerations, privacy, and data security should also be a priority, especially when working with sensitive medical data. The algorithm steps are:

Input Image: I
Object Detection Model: F_obj_detect(I) -> (B, C, S)
Cross-Measure Refinement: P_combined = Σ(w_i * P_i)/Σw_i
Edge Detection: E(I) -> E

Step 1: Data Preparation. Data preparation is typically not expressed mathematically but involves tasks such as image resizing, normalization, and annotation.

Step 2: Object Detection Model. The object detection model can be represented mathematically as a function F_obj_detect that takes an input image I and produces bounding boxes (B) and their associated class labels (C) along with confidence scores (S): F_obj_detect(I) -> (B, C, S).

Step 3: Cross-Measure Refinement. Cross-measure refinement involves combining the predictions from different object detection models using weighted averaging. Let P_i represent the predictions from the i-th model and w_i the weight assigned to that model:

P_combined = Σ(w_i * P_i)/Σw_i

Step 4: Edge Detection. Edge detection can be represented mathematically as an operator E that takes an image I as input and produces an edge map E(I).

Fig. 16.3 Real-time object detection in medical imaging: improving precision with cross-measure refinement and edge detection
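To ground the formulas above, here is a small self-contained sketch (illustrative, not the authors' code) of IoU, precision/recall/F1, and the (σ, exp) decoding of predicted box offsets.

```python
# Evaluation metrics and box decoding for object detection.
import numpy as np

def iou(box_a, box_b):
    # boxes given as (x1, y1, x2, y2)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)      # Area of Intersection / Union

def precision_recall_f1(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0        # P = TP / (TP + FP)
    r = tp / (tp + fn) if tp + fn else 0.0        # R = TP / (TP + FN)
    f1 = 2 * p * r / (p + r) if p + r else 0.0    # harmonic mean of P and R
    return p, r, f1

def decode_box(tx, ty, tw, th):
    # (x, y, w, h) = (sigmoid(tx), sigmoid(ty), exp(tw), exp(th))
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    return sigmoid(tx), sigmoid(ty), np.exp(tw), np.exp(th)
```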
Table 16.2 Parameters and descriptions at each stage

Step | Parameter | Description
Data Preparation | Resizing | Resize the input images to a consistent size.
Data Preparation | Normalization | Scale the pixel values of the input images to a common range.
Data Preparation | Annotation | Annotate the input images with bounding boxes and class labels.
Object Detection Model | Input Image (I) | The input image for object detection.
Object Detection Model | F_obj_detect(I) | The object detection model, represented as a function that produces bounding boxes (B), class labels (C), and confidence scores (S).
Cross-Measure Refinement | P_combined | Combined predictions from different object detection models using weighted averaging.
Cross-Measure Refinement | w_i, P_i | Weights and predictions from the i-th object detection model.
Edge Detection | Input Image (I) | The input image for edge detection.
Edge Detection | E(I) | An operator that generates an edge map (E) highlighting edges and boundaries of objects in the image.
Final Output | P_combined, E(I) | The combined predictions from the cross-measure refinement algorithm and the edge map of the input image.

Fig. 16.4 Enhancing object detection in medical imaging: improving diagnoses and patient outcomes with cross-measure refinement and edge detection

Enhancing Object Detection in Medical Imaging: Improving Diagnoses and Patient Outcomes with Cross-Measure Refinement and Edge Detection. Collect diverse medical image datasets, including X-rays, MRIs, and CT scans. Choose different hyperparameters θi for each model to ensure diversity, and train multiple object detection models (M1, M2, M3) with these datasets and hyperparameters. For each input image X, run it through each of the trained models to obtain predictions Pi = Mi(X). Assign weights (w1, w2, w3) to each model; these weights can be determined from model performance on a validation set or other criteria. Apply edge detection to the input image I to compute edge features E(I), using a standard edge detection operator such as the Canny edge detector. Update the parameters θ of each object detection model to incorporate edge features: the updated parameters are θ_updated = θ_original + α * E(I), where α is a hyperparameter that controls the influence of edge features. Re-train each model with these updated parameters. Finally, combine predictions using weighted averaging: P_combined = (w1 * P1 + w2 * P2 + w3 * P3)/(w1 + w2 + w3).

Step 1: Train Multiple Object Detection Models. This step involves training multiple object detection models, each with its own set of hyperparameters or data modalities. Denote the models as M1, M2, ..., Mn; each model may have different parameters θi and datasets Di.

Step 2: Combine Predictions Using a CMR Algorithm. Combining predictions from multiple models can be achieved through weighted averaging. Compute predictions from each model: Pi = Mi(X), where X is the input image. Assign weights w1, w2, ..., wn to the models, based on model performance or any other criteria, and combine the predictions:

P_combined = (w1 * P1 + w2 * P2 + ... + wn * Pn)/(w1 + w2 + ... + wn)

Step 3: Use Edge Detection to Generate Additional Features. Applying edge detection involves processing the input image I to obtain edge features E(I):

E(I) = EdgeDetection(I)
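A short sketch of Step 3 follows, with OpenCV's Canny detector standing in for the generic EdgeDetection(I) operator; the two thresholds are illustrative assumptions.

```python
# Edge-feature extraction: Canny edge map as E(I), normalised to [0, 1].
import cv2

def edge_features(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, threshold1=100, threshold2=200)  # E(I)
    # A Sobel gradient magnitude could be substituted here instead of Canny.
    return edges.astype("float32") / 255.0
```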
Table 16.3 Hyperparameters, values, and descriptions for each step

Step | Parameter | Value | Description
Step 1: Train multiple object detection models | Number of models trained | 3 | Training multiple models with different settings improves robustness.
Step 1 | Data modalities used | X-rays, MRIs, CT scans | Handling diverse medical image modalities.
Step 1 | Hyperparameters | θ1, θ2, θ3 | Each model (M1, M2, M3) may have different hyperparameters θi.
Step 2: Combine predictions using a CMR algorithm | Method for combining predictions | Weighted averaging | Combining predictions from multiple models.
Step 2 | Predictions from each model | Pi = Mi(X), where X is the input image | Compute predictions from each model Mi using the input image X.
Step 2 | Model weights | w1, w2, w3 | Assign weights to each model based on performance or other criteria.
Step 2 | Combined predictions | P_combined = (w1 * P1 + w2 * P2 + w3 * P3)/(w1 + w2 + w3) | Combine predictions using weighted averaging.
Step 3: Use edge detection to generate additional features | Edge detection operator | Canny edge detector | Use the Canny edge detector or a similar operator to compute edge features E(I).
Step 4: Train the object detection model again | Updating model parameters | θ_updated = θ_original + α * E(I) | Incorporate edge features into the object detection model by updating its parameters.
Step 4 | Hyperparameter α for edge features | 0.5 | Controls the influence of edge features on the model during training.

Standard edge detection operators such as the Canny operator or the Sobel operator can be used to compute E(I).

Step 4: Train the Object Detection Model Again. Incorporate edge features into the object detection model by updating the model parameters θ with an additional term that accounts for the edge features:

θ_updated = θ_original + α * E(I)

Here, α is a hyperparameter that controls the influence of edge features on the model, and the model is re-trained with these updated parameters. Overall performance depends on the quality of the data, the choice of models, the tuning of hyperparameters, and the effectiveness of the edge detection operator. Continuous evaluation and validation using relevant medical datasets will be essential to ensure that the system meets the desired objectives and provides reliable results in real-world medical imaging scenarios.

4. Experimental Results

The experimental evaluation, which involves object detection in medical imaging with cross-measure refinement, edge detection, and the SSD architecture, is a complex task that requires several steps. Gather a substantial dataset of medical images; this dataset should include X-rays, MRIs, CT scans, or other relevant medical image modalities. Annotate the dataset with bounding boxes and class labels for the objects to be detected (e.g., tumors, organs, abnormalities). Preprocess the images by resizing them to a consistent resolution, normalizing pixel values, and augmenting the data with techniques like random cropping, flipping, and rotations to increase diversity. Choose a deep learning-based object detection model; here, the Single Shot MultiBox Detector (SSD) is used for its speed and accuracy. Train multiple instances of the chosen model; these can be trained on different data modalities (X-rays, MRIs, CT scans) or with different hyperparameters to increase robustness. Finally, implement a mechanism to perform cross-measure refinement, which could involve combining predictions from multiple object detection models trained on different data modalities.
Fig. 16.5 SSD architecture of several steps
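The following hedged sketch combines the weighted-average fusion of Step 2 with the α = 0.5 edge weighting of Step 4. Note one simplification, flagged in the comments: the edge term is applied to the fused score map rather than to raw model parameters, purely for illustration.

```python
# Cross-measure refinement (weighted averaging) plus edge incorporation.
import numpy as np

def cross_measure_refine(predictions, weights):
    # predictions: list of per-model score arrays of equal shape
    # weights: w1..wn
    w = np.asarray(weights, dtype=float)
    stacked = np.stack(predictions, axis=0)
    return np.tensordot(w, stacked, axes=1) / w.sum()   # P_combined

def incorporate_edges(p_combined, edge_map, alpha=0.5):
    # In the text, theta_updated = theta_original + alpha * E(I) updates model
    # parameters; here the same additive term modulates the fused score map
    # instead, as a simpler stand-in for the retraining step.
    return p_combined + alpha * edge_map
```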
Table 16.4 Obtained values for various parameters and their descriptions

Parameter | Value | Description
Object detection model | SSD | A Single Shot MultiBox Detector, known for its speed and accuracy.
Number of models trained | 3 | Training multiple models with different hyperparameters or data modalities helps improve the robustness of the overall system.
Data modalities used | X-rays, MRIs, CT scans | The proposed approach handles a variety of medical image modalities.
Cross-measure refinement algorithm | Weighted averaging | A simple and effective way to combine the predictions from multiple models.
Edge detection operator | Canny edge detector | A popular edge detection operator that produces accurate edge maps.
Weighting factor for edge features | 0.5 | The weight given to edge features can be tuned to improve performance on the target dataset.
Precision on test dataset | 95.20% | The proportion of the model's detections that are true positives.
Recall on test dataset | 96.10% | The proportion of all actual positives that are detected by the model.
F1 score on test dataset | 95.70% | The harmonic mean of precision and recall, providing a balanced evaluation metric.
Inference time per image | 10 ms | The time taken to process an image and generate detections.
Model size | 10 MB | The size of the trained model on disk.

Fig. 16.6 Obtained values for object detection in medical imaging with cross-measure refinement, edge detection, and the SSD

Weighted averaging is a simple and effective method for this purpose. Load the trained object detection models and their corresponding weights. For a given input image, make predictions using each model, and then calculate the weighted average of these predictions based on the weighting factors for each model. Choose an edge detection operator like the Canny edge detector, which produces accurate edge maps. Implement the edge detection process to generate edge features from the input image. This is typically used to capture fine details and edges of objects in the image. Incorporate the edge features obtained from the edge detection process into the object detection results. This can be done by adding the edge features to the bounding box predictions and class scores obtained from the object detection models. If you have ground truth annotations, you can evaluate the performance of your system by calculating metrics such as precision, recall, and F1 score on a test dataset. These metrics will give you an indication of how well your system is performing in terms of true positive detections, false positives, and false negatives. Implement the system to perform real-time object detection on medical images. This involves loading the trained models, processing input images, and displaying the detected objects. Depending on your application, you may need to optimize the system for real-time or near-real-time performance. Test your system on a variety of medical images to ensure it performs well on different scenarios and data modalities. Fine-tune the system and models based on the results of testing and real-world performance.
The outcome of this work is an advanced and robust system for object detection in medical imaging, which has the potential to improve patient outcomes, streamline clinical workflows, and contribute to the field of medical image analysis. It addresses critical challenges in medical image analysis and demonstrates the capabilities of modern deep learning techniques in a healthcare context.
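As a hedged example of the evaluation step, the metrics above can be computed with scikit-learn once detections have been matched to ground-truth boxes and reduced to binary per-object labels; the arrays y_true and y_pred below are hypothetical names for that matched output.

```python
# Illustrative computation of the evaluation metrics mentioned above.
# y_true and y_pred are assumed binary arrays produced by matching
# detections against ground-truth annotations (hypothetical names).
from sklearn.metrics import precision_score, recall_score, f1_score

precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```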

5. Conclusion
This work presents a novel approach to object detection in medical imaging that integrates cross-measure refinement, edge detection, and the SSD architecture. The proposed approach addresses several critical challenges inherent to medical image analysis, including viewpoint variation, deformation, occlusion, illumination conditions, and intra-class variation. The proposed system has been evaluated on diverse medical image datasets, including X-rays, MRIs, and CT scans. The results demonstrate a significant improvement in detection accuracy, even in challenging scenarios, while maintaining real-time processing capabilities. The proposed research contributes to more reliable diagnoses and improved patient care and medical outcomes by enhancing object detection in medical imaging. This work paves the way for the broader adoption of object detection in healthcare and underscores the potential impact of combining cross-measure refinement, edge detection, and the SSD framework in medical image analysis.
Note: All the figures and tables in this chapter were designed by the author.

17. AI-Based Breast Cancer X-Ray Image Detection Using Generative Adversarial Attacks

V. S. R. K. Raju Dandu1
Research Scholar, Dept of Computer Science and Engineering, GIET University, Gunupur, Odisha
R. N. V. Jagan Mohan2
Associate Professor, Dept of Computer Science and Engineering, Sagi Rama Krishnam Raju Engineering College, Bhimavaram
M. Chandra Naik3
Professor, Dept of Computer Science and Engineering, GIET University, Odisha

Abstract: Breast cancer is one type of cancer that disproportionately affects women. Mammograms are X-ray scans that doctors
use to identify breast cancer. Even though AI is quite good at identifying false photographs, some of them can be so convincing
that they lead to the wrong diagnosis of cancer. AI-powered technologies have the potential to improve the accuracy of cancer
detection. Increasing the resilience of AI models to harmful attacks is critical. Models are trained to identify and steer clear of
purposefully antagonistic false pictures using adversarial training. A study found that simulated attacks can confuse both AI
systems for detecting breast cancer and human radiologists, putting medical AI at risk. It is critical to investigate how AI models
respond to hostile attacks in order to guarantee security and robustness. By leveraging mammography imaging data, the study
developed a deep learning strategy for breast cancer identification, improving AI’s response to intricate adversarial attacks.
The system constructed accurate images of benign and malignant illnesses using generative adversarial networks (GANs). This
experimental research maps intricate relationships, records long-term temporal linkages, and creates synthetic time-series data
for healthcare cancer datasets using GANs. It also examines mode collapse and applies data analysis to find patterns in the data. Principal Component Analysis (PCA) and other data visualisation techniques are crucial for improving understanding of the
relationships between the variables.
Keywords: Artificial intelligence, Breast cancer detection, Deep learning, Generative adversarial network, etc.

1. Introduction
The vast majority of cancer cases involve breasts, and more women than men are diagnosed with this disease. Early identification and personalised medicine, along with advancements in diagnosis and treatment, as well as awareness campaigns, have substantially raised survival rates and reduced mortality rates, as stated by Wang (2016) [15]. The X-ray imaging technique known as a mammogram can detect breast cancer up to three years before the disease shows any symptoms [13]. During a mammogram, a technologist flattens the breast on a plastic plate using a specialised X-ray machine [18]. The technician should check the pictures to make sure they don't need redoing and repeat the process for each breast, as suggested by Zhu W. [20]. There is no way to know for sure what will happen because every woman's breast is unique. It may be possible to simply detect a large number of false images that fool AI. However, some of the hostile imagery employed distorted both the experience and the model. Such occurrences could lead to an incorrect
1 vsrk.rajudandu@giet.edu, vsrkraju.dandu@gmail.com; 2 mohanrnvj@gmail.com; 3 srichandra@giet.edu

DOI: 10.1201/9781003529231-17

cancer diagnosis, which would have devastating effects on the patient (Yuan et al., 2019 [19]). Cancer detection could be more accurate and efficient by 2020 [12], according to McKinney et al., thanks to AI-based technologies that assess medical images. In 2019, Xiao suggested that strengthening AI models' defences against hostile attacks should be the next step in their development. "Adversarial training," which comprises pre-creating hostile visuals, is one of the methods being utilised to train the AI model. A study conducted by Xu H. et al. in 2020 [16] investigated the manipulation of mammography images using a simulated attack. The goal was to deceive both AI breast cancer detection systems and human breast imaging radiologist professionals. However, they face the risk of online assault. An adversarial attack could compromise medical AI by manipulating inputs (such as pictures) to trick models into making false assumptions. By conducting this research, we hope to prove that this kind of attack is not only feasible but also poses a threat to patient safety because it can cause AI models to make inaccurate diagnoses. Better AI models will be the result of further research into their responses to adversarial attacks in real-world clinical contexts (Goodfellow, 2017 [7]).
Adversarial attacks, which manipulate inputs like photographs or videos to trick medical AI into making incorrect conclusions, are a real possibility. According to Finlayson (2019), these kinds of attacks are not only possible, but they pose a threat to medical safety because they can cause AI models to make incorrect diagnoses of patients [6]. Investigating how AI models respond to adversarial attacks in clinical contexts is crucial for making them safer and more resilient. However, these gadgets could be vulnerable to cyberthreats, including hostile attacks. Corporate efforts to sway the results of scientific investigations to their advantage or insurance fraud committed by medical professionals seeking to enhance their income are some possible motivations for such attacks. There are many forms of malicious attacks on medical images; some are subtle and obscure AI judgements, while others are more complex and target specific areas of the image, such as cancerous ones, making them more prone to mislead a person (Hu W., 2017 [8]). In light of the potential integration of AI into healthcare infrastructure, it is imperative that platforms for healthcare technology undergo cybersecurity training. This will ensure that they are aware of potential threats, have the resources to safeguard patient data, and can fight against malware (Qian Wei Zhou et al., 2021 [14]). This article suggests showing the safety of medical AI models and explaining how to eliminate such worries to ensure that AI systems work safely and enhance patient care. The rest of the essay is organised as follows: Section 1 presents the pertinent research, including reviews done in the breast cancer sector. Section 2 details the proposed activity and research approach. Section 3 presents the results and subsequent discussions. Section 4 concludes the work and offers suggestions for future research.

2. Proposed Work
To better understand AI's response to complex adversarial challenges, Agarwal's 2019 [2] research developed a mammography-based breast cancer detection model. In this work, we propose a study to investigate how AI manages complex adversarial threats by searching for breast cancer indicators in mammography data to create a deep learning system that can accurately discern between diseases that are benign and those that are malignant. The model's capacity to identify these fictitious images was assessed by the researchers using a piece of software known as a "generative adversarial network" (GAN); this programme creates false images by fusing positive and negative images with malicious patches added or deleted.

3. AI-based Breast Cancer Classification Using ELIXR
Ali Bou Nassif proposed a revolutionary multimodal medical AI called the AI-based Breast Cancer Process in 2020 [1], which might revolutionise medical imaging. ELIXR, which stands for Embeddings for Language/Image-aligned X-Rays, is one approach employed by the AI. It is portable, multimodal, and capable of processing visual and textual input. As a result, it works wonderfully for tasks like semantic search, verification, and disease categorization. ELIXR's training input consists of the medical picture dataset along with the relevant free text from the radiology report. Since traditional binary labels would have a hard time representing the smallest nuances in medical imaging, this allows the models to pick them up. Many other roles are under ELIXR's purview, in addition to the usual disease classification. For example, it may verify the accuracy of radiological reports, search for specific features inside a breast cancer X-ray (CXR) image, and respond to inquiries posed in natural language. Thanks to its modular design, ELIXR can easily be adjusted to detect breast cancer. To fine-tune models for specific tasks, one can swap out the vision encoders and base language models as required. Expert systems trained using predictive AI and the flexibility offered by generative AI must be combined for AI to be properly utilised in medicine. Beykikhoshk et al., 2020 [3] propose a dependable approach for evaluating the prognosis in light of the fact that women, as Cruz-Roa, 2017 [4] suggests, have a higher risk of developing breast cancer than men. As Jamieson proposed in 2012 [9], this gave rise to the idea of developing a new model utilising an elixir and X-ray data taken from the open-source database.

Fig. 17.1 Breast cancer classification using ELIXR

David A. Omondiagbe et al. [5] used structural data collected using machine learning classifiers and an AI technique to split breast cancer survivors into groups based on how long they survived the disease. To execute the model, results that are available to the public were utilised. In order to prove the suggested model's worth according to several standards, we compared the collected data with other models and the state-of-the-art. A number of metrics, such as F1-measure, balanced accuracy, specificity, precision, sensitivity, and correlation coefficient, form the basis of the model's output.
GAN with Breast Cancer: Given that breast cancer is the most common malignancy in women, a trustworthy prognostic prediction approach is required. One type of artificial intelligence model is the generative adversarial network (GAN). Its goal is to automatically produce machine images. In order to produce new data that closely resembles an existing dataset, generative models may find application in GANs as a helpful training environment. GANs work by simulating a competitive game between a discriminator and a generator, two neural networks.
Generator: The generator network creates synthetic breast cancer data samples from input that is random noise. These generated samples are initially random and do not mirror the distribution of the desired breast cancer data.
Discriminator: The discriminator network functions as a classifier, separating genuine breast cancer data samples from the original dataset from fictitious data samples produced by the generator. It has been taught to distinguish between the two. The output of the GAN system will not be satisfactory if the discriminator is unable to distinguish between false and real data of breast cancer patients.

4. Breast Cancer Data Training Process
The GAN training process for breast cancer data involves the following steps.
The Discriminator's Training: Real breast cancer data samples from the dataset and fictitious data samples produced by the generator are both used to feed the discriminator. It makes an effort to accurately identify real data as real (label 1) and false data as false (label 0). To reduce the discriminator's classification error, its parameters are modified.
Training the Generator: The generator creates synthetic data samples using the input of random noise. The discriminator receives these generated samples as input. The generator's goal is to create fake samples that the discriminator interprets as authentic.
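The alternating scheme just described can be expressed as a minimal PyTorch-style training loop. This is a sketch, not the paper's exact code: the `generator`, the `discriminator` (assumed to end in a sigmoid), `latent_dim`, and the mammogram `dataloader` are all assumptions.

```python
# Minimal PyTorch sketch of alternating GAN training on image batches.
import torch
import torch.nn as nn

criterion = nn.BCELoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)

for real_images, _ in dataloader:            # batches of real mammograms
    b = real_images.size(0)
    real_labels = torch.ones(b, 1)           # label 1 = real
    fake_labels = torch.zeros(b, 1)          # label 0 = fake

    # Step 1: train the discriminator on real and generated samples
    noise = torch.randn(b, latent_dim)
    fake_images = generator(noise).detach()  # freeze generator for this step
    loss_d = (criterion(discriminator(real_images), real_labels)
              + criterion(discriminator(fake_images), fake_labels))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Step 2: train the generator to make the discriminator answer "real"
    noise = torch.randn(b, latent_dim)
    loss_g = criterion(discriminator(generator(noise)), real_labels)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```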
To increase the discriminator's inaccuracy on fictitious samples, the generator's parameters are modified. Through this mechanism, the two networks keep competing with one another. The discriminator grows better at separating real data from phoney as the training goes on, while the generator gets better at producing realistic data.
The training process is repeated until the generator generates health data that the discriminator cannot tell apart from the genuine data. The GAN can now produce fresh, synthetic samples that closely resemble the training data because it has successfully learned the underlying data distribution.
Procedure of Breast Cancer: The machine that discriminates is the detective, who will be able to tell the difference between a fake and a real X-ray, while the generator is the counterfeiter, who creates false breast cancer X-rays. When the training process starts, the generator generates blatantly false data about breast cancer, and the discriminator quickly notices the falsity of the patient's data. The generator gets closer to providing output that can trick the discriminator as training progresses. Finally, if generator training is successful, the discriminator becomes less accurate at determining what is real and what is phony. Its accuracy continues to decline when it starts to classify phoney X-ray breast cancer data as real health data, as proposed by Kim in 2018 [10]. For instance, consider the medical diagnosis of breast cancer. Taking a dataset Z, which contains X-ray images, the goal of a machine learning process is to construct a model Mθ: X ↦ Y that, given an arbitrary X-ray image, would determine whether a patient has cancer or not. In this example, the input set X contains all the possible X-ray images encoded as pixel vectors, while the output set Y = {0, 1} consists only of the two elements 0 and 1, with 0 representing that a patient does not have cancer, whereas 1 indicates that the patient has cancer. The cardinal mammographic raw dataset Z contains sensitive information, such as the X-ray images from the cancer diagnosis example, and if compromised, the patient's secrecy can be violated; such privacy violations could also introduce further attack vectors.

5. Experimental Result
This very first experiment discusses the use of generative adversarial networks (GANs) for generating synthetic time-series data, specifically in healthcare cancer datasets. The paper highlights the challenges of GANs in capturing long-term temporal relationships, mode collapse, and mapping complex relationships between measurements and attributes, which are particularly challenging for use cases requiring complete temporal series replication, multimodal distributions, and complex measurements and attributes.

Table 17.1 GAN of time series data
| # | traffic_byte_counter | ping_loss_rate | isp | technology | state |
| 0 | 0.001903 | 0.0 | CenturyLink | Fiber | MN |
| 1 | 0.005421 | 0.0 | CenturyLink | Fiber | MN |
| 2 | 0.003513 | 0.0 | CenturyLink | Fiber | MN |
| 3 | 0.003307 | 0.0 | CenturyLink | Fiber | MN |
| 4 | 0.002243 | 0.0 | CenturyLink | Fiber | MN |
| 5 | 0.005589 | 0.0 | CenturyLink | Fiber | MN |
| 6 | 0.003436 | 0.0 | CenturyLink | Fiber | MN |
| 7 | 0.006160 | 0.0 | CenturyLink | Fiber | MN |
| 8 | 0.002327 | 0.0 | CenturyLink | Fiber | MN |
| 9 | 0.004787 | 0.0 | CenturyLink | Fiber | MN |

Fig. 17.2 Real vs. artificial data: a sequence length considering traffic and ping synthesization

The data visualization is used to evaluate the reproducibility of our artificial data, compare it to the original data, and determine if the hyperparameterization needs adjustment. The outcomes of the experiment using DoppelGANger suggest the need for additional investigation into intricate situations, as it reproduces our complete time sequence with minimal dimensionality and no major problems. Data practitioners often turn to DoppelGANger as a solution to the problem of sophisticated false picture data production, especially when dealing with time series data. Strong data synthesis is one of its features, letting users build realistic datasets that are near replicas of the source data. It can manage complicated contexts of heterogeneous data, is generalizable and versatile, and has healthcare applications. With this method, false-image data can carry all the information needed to make smart decisions without sacrificing privacy or quality.
Data visualisation with Seaborn and Matplotlib, as well as analysis with the Scikit-Learn package, are the topics of another study devoted to breast cancer. To help comprehend the linkages and groupings of variables, this code groups data frames with correlations between columns and displays them on a clustering map. Values of correlation, which range from -1 to 1, reflect the strength of the association between two variables; higher values imply a stronger relationship.
A threshold value of 0.75 is set to filter out correlations between target columns in the correlation matrix. Highly correlated features are converted into a list, and their correlations are visualized using the Seaborn library's cluster map function. The chart's title is determined and added. The data frame is converted using the pd.melt function, assigning features and values to create a "melted" version. A box plot is drawn using sns.boxplot, with hue set to "target" for visual comparison. The feature names are rotated by 90 degrees for better readability. The graph is then displayed using plt.show().
The sns.pairplot tool visualizes relationships between variables in a data frame. It selects corr_features from the data frame. The diag_kind parameter specifies the chart type on the diagonals, while markers indicate distinct data points. The hue parameter colours data points according to the target variable.
This code generates a crossplot using KDE plots, density estimates, and color coding to display relationships between variables in a data frame, enabling better understanding and highlighting different groups.
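The visualization steps described above correspond roughly to the following sketch, assuming a pandas DataFrame named `data` with numeric features and a binary "target" column (illustrative names, not the authors' exact variables).

```python
# Hedged sketch of the correlation filtering and plotting steps described above.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

corr = data.drop(columns="target").corr()
# Keep columns whose absolute correlation with another column exceeds 0.75
corr_features = [c for c in corr.columns
                 if (corr[c].drop(c).abs() > 0.75).any()]

# Cluster map of the correlations among the highly correlated features
sns.clustermap(data[corr_features].corr(), annot=True, fmt=".2f")
plt.show()

# "Melt" the frame, then box-plot each feature split by the target class
melted = pd.melt(data, id_vars="target", value_vars=corr_features)
sns.boxplot(x="variable", y="value", hue="target", data=melted)
plt.xticks(rotation=90)  # rotate feature names for readability
plt.show()

# Pairwise view with KDE diagonals, colored by the binary target
sns.pairplot(data[corr_features + ["target"]], hue="target",
             diag_kind="kde", markers=["o", "s"])
plt.show()
```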

Fig. 17.3 Correlation values between -1 and 1 for the data



Fig. 17.4 The data [corr_features].corr() statement creates a correlation matrix, with values displayed in decimal format

Fig. 17.5 A box plot is drawn using sns.boxplot, with hue set to “target” for visual comparison

Fig. 17.6 Code generates a crossplot using KDE plots

6. Conclusion and Future Perspective
An AI-based approach has enhanced cancer diagnosis accuracy by enhancing AI models. In adversarial training, models are taught to recognise and avoid false images that are intentionally antagonistic. Medical AI is in danger, according to a study, because both AI breast cancer detection algorithms and human radiologist experts can be fooled by simulated assaults. To ensure security and robustness, it is crucial to study how AI models react to hostile attacks. Enhancing AI's reaction to complicated adversarial attacks, the study built a deep learning approach for breast cancer identification using mammography data. The algorithm used a generative adversarial network (GAN) to build correct images of malignant and benign circumstances.

References
1. Ali Bou Nassif, Manar Abu Talib, Qassim Nasir, Yaman Afadar, Omar Elgendy: Breast cancer detection using artificial intelligence techniques: A systematic literature review, Artificial Intelligence in Medicine, Elsevier, Vol 127, May 2022.
2. Agarwal, R., Diaz, O., Lladó, X., Yap, M. H. & Martí, R.: Automatic mass detection in mammograms using deep convolutional neural networks, J. Med. Imaging, 6, 031409, 2019.
3. Beykikhoshk, A., Quinn, T. P., Lee, S. C., Tran, T. & Venkatesh, S.: DeepTRIAGE: interpretable and individualized biomarker scores using attention mechanism for the classification of breast cancer sub-types, BMC Med Genomics, 13:20, https://doi.org/10.1186/s12920-020-0658-5, 2020.
4. Cruz-Roa, A., Gilmore, H., Basavanhally, A., Feldman, M., Ganesan, S., Shih, N. N. C., et al.: Accurate and reproducible invasive breast cancer detection in whole-slide images: A Deep Learning approach for quantifying tumor extent, Sci Rep, 7, https://doi.org/10.1038/srep46450, 2017.
5. David A. Omondiagbe, Shanmuga Veeramani and Amandeep S. Sidhu: Machine Learning Classification Techniques for Breast Cancer Diagnosis, 2019.
6. Finlayson, S. G. et al.: Adversarial attacks on medical machine learning, Science, 363, 1287–1289, 2019.
7. Goodfellow, I. J., Shlens, J. & Szegedy, C.: Explaining and harnessing adversarial examples, in International Conference on Learning Representations, 2015.
8. Hu, W. & Tan, Y.: Generating adversarial malware examples for black-box attacks based on GAN, arXiv preprint arXiv:1702.05983, 2017.
9. Jamieson, A. R., Drukker, K. & Giger, M. L.: Breast image feature learning with adaptive deconvolutional networks, Proc. SPIE 8315, 6–13, 2012.
10. Kim, E.-K. et al.: Applying Data-driven Imaging Biomarker in Mammography for Breast Cancer Screening: Preliminary Study, Scientific Reports 8, 2762, 2018.
11. Li Shen, Laurie R. Margolies, Joseph H. Rothstein, Eugene Fluder, Russell McBride, Weiva Sieh: Deep Learning to Improve Breast Cancer Detection on Screening Mammography, Scientific Reports, volume 9, Article number: 12495, 2019.
12. McKinney, S. M. et al.: International evaluation of an AI system for breast cancer screening, Nature, 577, 89–94, 2020.
13. Mohamed, A. A. et al.: A deep learning method for classifying mammographic breast density categories, Med. Phys., 45, 314–321, 2018.
14. Qian Wei Zhou, Margarita Zuley, Yuan Guo, Lu Yang, Bronwyn Nair, Adrienne Vargo, Suzanne Ghannam, Dooman Arefan, Shandong Wu: A machine and human reader study on AI diagnosis model safety under attacks of adversarial images, Nature Communications, 12 (1), DOI: 10.1038/s41467-021-27577-x, 2021.
15. Wang, D., Khosla, A., Gargeya, R., Irshad, H. & Beck, A. H.: Deep Learning for Identifying Metastatic Breast Cancer, arXiv:1606.05718 [cs, q-bio], 2016.
16. Xu, H. et al.: Adversarial attacks and defenses in images, graphs, and text: a review, Int. J. Autom. Comput., 17, 151–178, 2020.
17. Xiao, C. et al.: Generating adversarial examples with adversarial networks, in Proc. 27th International Joint Conference on Artificial Intelligence, 3905–3911, 2019.
18. Yala, A. et al.: Toward robust mammography-based models for breast cancer risk, Sci. Transl. Med., 13, 1–11, 2021.
19. Yuan, X., He, P., Zhu, Q. & Li, X.: Adversarial examples: attacks and defenses for deep learning, IEEE Trans. Neural Netw. Learn. Syst., 30, 2805–2824, 2019.
20. Zhu, W., Lou, Q., Vang, Y. S. & Xie, X.: Deep Multi-Instance Networks with Sparse Label Assignment for Whole Mammogram Classification, arXiv:1705.08550 [cs], 2017.

Note: All the figures and tables in this chapter were designed by the author.

18. Promotion of Graduate Placement Through Academics by Improving Performance Using Artificial Neural Networks

Chandra Sekhar K.1, K. Satyanarayana Raju2, P. Subba Raju3, M. Krishna Satya Varma4, K. Laxmipathi Raju5
Assistant Professor, Department of IT, SRKR Engineering College, Bhimavaram, Andhra Pradesh, India

Abstract: Job hunting is difficult work, but with a few little adjustments to the current procedure, we can make it easier,
which may have a number of beneficial effects. Various career advice sites provide a wide range of work possibilities. All of
those choices, though, might not be beneficial to everyone. Consequently, a job suggestion engine that can suggest the best
employment match for the applicant’s profile would be a highly useful tool. Artificial intelligence is now being used more
often in educational settings. However, to further the systematic use of these techniques, more conceptual and methodological
knowledge is required. This study’s initial goal is to evaluate a methodical approach for using artificial neural networks to
forecast student placement in engineering universities. The second goal is to evaluate the significance of a number of well-
known predictors that have an effect on the student’s placement. As a result, this study proposes a method for creating a
placement prediction model for graduate engineering students in their pre-final year using Artificial Neural Networks (ANN).
The models were trained and tested using data from a sample of 1146 students. The best model achieved an overall accuracy of 84.1 percent.
Keywords: Graduate, Placements, Admission, Student, Quality education, ANN

1. Introduction
In India, roughly 1.5 million engineers graduate annually, responding to the escalating demand for skilled professionals in the IT sector. However, a significant challenge persists, as a large portion of students remains unaware of the specific requirements of the IT industry. The disparity between the number of graduates and the standards expected by corporations creates a formidable obstacle, particularly in the context of placements. The responsibility of providing students with optimal placement opportunities lies with educational institutions. To achieve this, the placement cell and professors must proactively guide students to align with the diverse requirements of different companies. A pivotal tool in this process is a placement prediction method, designed to assess a student's suitability for a particular position. In this context, a comprehensive student placement system has been developed, utilizing a dataset of technical institute graduates. The dataset encompasses various parameters such as gender, class X grade point average (10%), intermediate grade point average (12%), engineering entry test scores (EAMCET), engineering course backlogs, engineering cumulative grade point average (CGPA), engineering program (Branch), company (selected company), and birth date. This wealth of data is crucial for a nuanced evaluation of a student's academic history and a predictive analysis of their future prospects.
The fundamental objective is to determine, through careful consideration of key characteristics, whether a student is likely to secure placement in the future. The initial and vital step in applying machine learning algorithms to this dataset is meticulous data preparation. In this study, an artificial neural network (ANN) model has been constructed, leveraging the grades from previous academic years to predict the preferred stream of students. An ANN is a programming framework inspired by biological processes, facilitating data processing
*Corresponding author: sekharonemay@gmail.com

DOI: 10.1201/9781003529231-18

for pattern extraction and trend identification. Specifically, an MLP, a type of feedforward ANN employing a supervised learning methodology, was used to identify the selected stream.
Furthermore, the ANN model incorporates grades from multiple academic years as input, enhancing its predictive accuracy. The subsequent sections delve into a detailed exploration of the ANN framework, shedding light on its intricate workings and applications in this specific context.

2. Literature Review
The ID3 decision tree method, as outlined in research [1], is employed to construct a model predicting the likelihood of a student securing placement in a firm. The dataset provided is meticulously analyzed using this method to identify the most relevant parameters for placement prediction. Each parameter undergoes scrutiny for entropy and information gain values, with the optimal parameter chosen as the split variable for crafting the decision tree. Utilizing the Weka Tool, an optimal decision tree is generated, with the leaves indicating the predicted likelihood of a student being placed. The dataset encompasses secondary test scores, graduation grade points, history and departmental arrears, talents such as programming and communication, completed internships, and information on future study interests. Another placement prediction system [2], leveraging the K-Nearest Neighbors classifier, predicts the likelihood of students being placed in various firms. This outcome is then compared with results from other machine learning models like SVM and Logistic Regression. The assessment considers academic records, as well as programming, communication, analytical, and teamwork abilities that employers scrutinize during the hiring process. The system utilizes data from the previous two batches. In a different approach, [3] introduces a TPO management system to forecast eligibility for a campus drive using the C4.5 Decision Tree Algorithm. Historical student data is examined to predict current students' eligibility and the institution's overall placement likelihood. The decision tree is constructed based on the company's past data and current requirements, aiding in estimating a student's eligibility in different firms. The system notifies candidates meeting the criteria for the company's campus drive based on these factors, providing valuable insights for students to plan their career paths effectively. Addressing potential issues in student performance and graduation delays, [4] proposes a NN model predicting a student's GPA based on location, academic background, and personal data. The model, trained and evaluated using the WEKA software program on a sample dataset of computer networking students, demonstrates a 73.68 percent accuracy in forecasting student performance. In a broader context, [5] explores the application of ANNs in educational research, highlighting the role of earlier academic success in categorizing students' academic performance. The study acknowledges limitations in data availability, particularly in terms of high school grades and socio-economic status reporting. The research emphasizes the importance of using multiple placement prediction models to leverage academic and placement information for forecasting future placement prospects, aiding students in recognizing strengths and making necessary improvements. A study [6] delves into various placement prediction models, showcasing the promising potential of the student dataset for forecasting future placement prospects. With an accuracy of 74.1 percent [7], this study demonstrates the application of the discretization approach to enhance prediction accuracy. It suggests expanding the predictive scope by considering additional factors like family income and the educational backgrounds of parents and siblings. Additionally, the analysis may incorporate additional tracks or strands to further refine predictions.

3. Proposed Technique and Attribute Selection
Research indicates that various factors, encompassing demographics, extracurricular activities, environment, and biological elements, play a role in influencing a student's performance. A thorough investigation and adjustment of these variables were conducted to create a detailed numerical representation suitable for computer coding. These refined variables serve as inputs to the system, contributing to a comprehensive understanding of the student performance dynamics. The system's architecture and functionality are visually represented in Fig. 18.1, illustrating the interconnectedness of the variables within the framework.
Fig. 18.1 Block diagram representation

3.1 Educational Data Mining
A method for drawing a model or knowledge from a big collection of data is called data mining (DM). A system of categorization or prediction in several fields, including the one that is the subject of this work, the field of education, is being developed as a result of the rapid growth of DM.

Educational Data Mining (EDM) is the use of DM in education, and it involves the process of information extraction from placement data, such as student placement status.

3.2 Artificial Neural Network (ANN)
An NN is an artificial emulation of the human brain, designed to replicate the learning processes observed in the human cognitive system. The versatility of Artificial Neural Networks (ANN) in handling tasks involving intricate systems is a defining characteristic that contributes to their significant value. In the realm of education, where processes like learning and complex decision-making, including career choices [10–13], are involved, ANN proves particularly beneficial. The ANN model mirrors the organizational shape of the human nervous system and the functioning of biological cognition. At its core, the neuron serves as the fundamental building block, functioning as a processing unit. A neuron comprises three essential components: cell bodies or soma, dendrites, and axons. Dendrites accept signals from the external world or the output of other neurons, transmitting these impulses to the soma. From there, the signals are relayed to the axon and, ultimately, through synapses, transmitted to the dendrites of surrounding neurons. The neural network is an amalgamation of these neurons and the intricate neurological processes associated with them. Through the utilization of certain inputs, an ANN can be trained to predict outputs, such as forecasting a career strand or predicting grades in various subjects. A specific category of feed-forward ANN is the multilayer perceptron (MLP), which employs the backpropagation approach for training. In essence, an MLP constructs a model based on data samples, providing a powerful tool for learning and prediction in complex systems.
Fig. 18.2 Basic architecture of ANN

3.3 Implementation of the ANN Algorithm
The ANN algorithm was implemented using Python-based software, employing various Python packages such as pandas, numpy, and sklearn. The model encompassed the following key categories:

3.4 Data Loading
The application was initialized by loading the necessary datasets, providing the foundational information for subsequent processing.

3.5 Assignment of Input and Output Data
Following the data loading phase, the program allocated the relevant input and output data, setting the stage for the subsequent steps in the algorithm.

3.6 Data Normalization
A crucial step in the process involved normalizing the data. The primary objective of this normalization strategy was to bring all input and output values into a comparable range. This ensures that the algorithm operates more effectively, preventing bias towards variables with larger magnitudes.

3.7 Hyperparameter Tuning
Achieving the best performance of the ANN model required fine-tuning of hyperparameters. This process involves experimenting with different combinations to identify the ideal set of parameters that maximizes the model's effectiveness in handling the specific dataset at hand. The search space is summarized in Table 18.1, and a sketch of this search appears after the table. By systematically progressing through these sections, the execution of the ANN algorithm was orchestrated, ensuring a comprehensive and efficient application of the model.

Table 18.1 Hyperparameters and value ranges for the ANN
| Hyperparameter | Value Range |
| Activation method of hidden layers | {tanh, logistic, identity, relu} |
| Solver for weight optimization | {lbfgs, sgd, adam} |
| Learning rate schedule for weight update | {adaptive, invscaling, constant} |
| Number of hidden layers | {1, 2} |
| Count of neurons/nodes in the hidden layer | {1, 2, 3, ..., 20} |
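As a hedged sketch, the search over Table 18.1's ranges might look as follows with scikit-learn's MLPClassifier and GridSearchCV. The preprocessed arrays X_train and y_train are assumed to come from the split described in Sections 5 and 6, and the full grid shown here would be expensive to run in practice.

```python
# Illustrative grid search over the hyperparameter ranges of Table 18.1.
# X_train and y_train are assumed preprocessed training arrays.
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {
    "activation": ["tanh", "logistic", "identity", "relu"],
    "solver": ["lbfgs", "sgd", "adam"],
    "learning_rate": ["adaptive", "invscaling", "constant"],
    # 1 or 2 hidden layers, each with 1..20 neurons
    "hidden_layer_sizes": [(n,) for n in range(1, 21)]
                          + [(n, n) for n in range(1, 21)],
}

search = GridSearchCV(MLPClassifier(max_iter=1000), param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```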

3.8 Data
Dataset includes details about engineering graduates, covering
gender, 10th and 12th percentile marks, EAMCET rankings,
Fig. 18.2 Basic architecture of ANN course backlogs, graduation CGPAs, engineering program
(Branch), and the number of jobs secured. It encompasses features such as:

| S. No | Variable | Range/Value |
| i | Gender | Male, Female |
| ii | Program | Civil, CSE, ECE, EEE, IT, MECH |
| iii | Xth % | CGPA (0–10) |
| iv | Inter % | Percentage (53–98.7) |
| v | BE % | CGPA (4–9.17) |
| vi | Backlogs | 0 = No Backlogs |
| vii | EAMCET | Student Rank |
| viii | Placement | Selected (0 or 1) |

4. Proposed System
Fig. 18.3 Architecture diagram of graduate prediction system
The proposed system comprises a data set derived from previous graduate students' data, with all six streams included, and their placement information is gathered. After preprocessing, the data features are engineered. The modeling phase was completed, and the data were split into a ratio of 80% to 20%. The model evaluation is done, and the performance of the algorithms is visualized.
The pair plot representation shows the combination of variables in the dataset. They are represented as a matrix of plots. Each plot shows different distributions.
Fig. 18.4 Pair plot representation of placement data

5. Train and Test Split of the Dataset
We separated our target and independent variables as y and x for splitting them into training and testing data, and imported the model_selection module for train/test splitting and model building. We partitioned the dataset as X_train, X_test, y_train, and y_test at 80 and 20 percent.

6. Data Normalization
We scaled our data to values between +2 and -2 in order to make the data more evenly distributed and with a lower standard deviation. A standard scaler was employed for this scaling. We imported the StandardScaler module from sklearn.preprocessing [10], created the scaler object, fitted it on the training data, and transformed the test data with it. Every feature or variable is scaled to unit variance with the help of the StandardScaler, which removes the mean; the procedure is applied feature by feature. Because outliers might affect StandardScaler, we first eliminated them before applying this technique [11]. This method of scaling is known as "standardization," wherein values are centered around the mean and normalized to have a single standard deviation. Consequently, the attribute's mean is adjusted to zero, and the resulting distribution exhibits a unit standard deviation.

7. Dependent and Independent Variables for all our Machine Learning Models
Here our dependent variable is the Placement column, which specifies whether the student got placed or not. The independent variables are Gender, SSC percentage, Inter percentage, BTech CGPA, EAMCET rank, Branch, and Backlogs.

8. Results
Tests employed our university's placement data, made into a training set (80% of the data) and a cross-validating testing set (20%). Various models predicting placement accuracy initially used the data as input.
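The split and scaling steps in Sections 5 and 6 can be sketched as follows, assuming a DataFrame `df` with the variables listed above and a binary Placement target (illustrative names).

```python
# Minimal sketch of the 80/20 split and StandardScaler normalization.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = df.drop(columns=["Placement"])   # independent variables
y = df["Placement"]                  # 1 = placed, 0 = not placed

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)   # 80% train / 20% test

scaler = StandardScaler()                # removes the mean, scales to unit variance
X_train = scaler.fit_transform(X_train)  # fit on training data only
X_test = scaler.transform(X_test)        # reuse the training statistics
```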
Exploratory Data Analysis (EDA)
Figure 18.5 compares students who have secured placements with those who have not, revealing that 61.29 percent of them have not yet received a placement. Figure 18.6 illustrates that more female students than male students have successfully taken advantage of placement opportunities. Notably, Fig. 18.7 indicates that a majority of students were assigned to the CSE branch based on their engineering branch. In Fig. 18.8, the proportion of candidates accepted by the university is presented. Subsequently, despite the significant association between the two, we deduced results from the feature importances in Fig. 18.9 and the ANN model in Fig. 18.10, which shows a model accuracy of 84.04 at epoch 36.
Fig. 18.5 Placed vs non-placed students
Fig. 18.6 CGPA vs placement
Fig. 18.7 Branch-wise placement
Fig. 18.8 Male vs female ratio
Fig. 18.9 Importance features
Fig. 18.10 Accuracy obtained by ANN

9. Conclusion
We were able to create a prediction model with an accuracy of 84.04 percent that can identify the preferred strand of a
graduate passing student. This study was able to demonstrate how to apply the discretization approach to improve prediction accuracy. The ANN model was trained using grades from the 10th, +2, and EAMCET. The suggested strategy is not limited to just demographic characteristics. Henceforth, exploring the system's efficacy in using student demographics and past academic records to predict students' performance in the subsequent educational level could be a subject of study in the future.

References
1. Sekhar K., Chandra; Kumar, K. Santhosh: "Undergraduate Student's Campus Placement Determination Using Logistic Regression Analysis for Predicted Probabilities on Uncertain Dataset," International Journal of Intelligent Systems and Applications in Engineering, vol. 10, no. 2s, pp. 14–20, 2022.
2. Chandra Sekhar K., K. Santhosh Kumar: "Data Preprocessing and Visualizations Using Machine Learning for Student Placement Prediction," 2nd International Conference on Technological Advancements in Computational Sciences (ICTACS), vol. 2, pp. 386–391, 2022.
3. Ajay Shiv Sharma, Swaraj Prince, Shubham Kapoor, Keshav Kumar: "PPS - Placement Prediction System using Logistic Regression"; Lakshmipriya K., Dr. Arunesh P. K.: "Predicting Student Performance Using Data Mining Classification Techniques."
4. N. Soomro, F. Razaque, and S. Soomro: Cluster and Logistic Regression Distribution of Students' Performance by Classification, vol. 1, Springer International Publishing.
5. G. S. K. Ranjan, A. Kumar Verma, and S. Radhika: "K-Nearest Neighbors and Grid Search CV Based Real-Time Fault Monitoring System for Industries," 2019 IEEE 5th Int. Conf. Converg. Technol. (I2CT), 2019, DOI: 10.1109/I2CT45611.2019.9033691.
6. M. J. Meena and K. R. Chandran: "Naïve Bayes text classification with positive features selected by statistical method," 2009 1st Int. Conf. Adv. Comput. (ICAC 2009), pp. 28–33, 2009, DOI: 10.1109/ICADVC.2009.5378273.
7. H. Turabieh: "Hybrid machine learning classifiers to predict student performance," 2019 2nd Int. Conf. New Trends Comput. Sci. (ICTCS 2019), pp. 1–6, 2019, DOI: 10.1109/ICTCS.2019.8923093.
8. T. Pranckevičius and V. Marcinkevičius: "Comparison of Naive Bayes, Random Forest, Decision Tree, Support Vector Machines, and Logistic Regression Classifiers for Text Reviews Classification," Balt. J. Mod. Comput., vol. 5, no. 2, pp. 221–232, 2017, DOI: 10.22364/bjmc.2017.5.2.05.
9. M. A. H. Farquad and I. Bose: "Preprocessing unbalanced data using support vector machine," Decis. Support Syst., vol. 53, no. 1, pp. 226–233, 2012, DOI: 10.1016/j.dss.2012.01.016.
10. S. Garcia, J. Luengo, and F. Herrera: Data Preprocessing in Data Mining, Intelligent Systems Reference Library, vol. 10, 2015.
11. L. Kristoffersen Edward Mayce R. and R. M. Hernandez: "A comparative performance of breast cancer classification using hyper-parameterized machine learning models," Int. J. Adv. Technol. Eng. Explor., vol. 8, no. 82, pp. 1080–1101, 2021, DOI: 10.19101/ijatee.2021.874380.
12. Q. Wang: "Kernel Principal Component Analysis and its Applications in Face Recognition and Active Shape Models," July 2012. [Online]. Available: http://arxiv.org/abs/1207.3538.
13. S. K. Thangavel, P. D. Bkaratki, and A. Sankar: "Student placement analyzer: A recommendation system using machine learning," 2017 4th Int. Conf. Adv. Comput. Commun. Syst. (ICACCS 2017), 2017, DOI: 10.1109/ICACCS.2017.8014632.
14. H. Shi and Y. Liu: "Naïve Bayes vs. support vector machine: Resilience to missing data," Lect. Notes Comput. Sci. (including LNAI), vol. 7003, part 2, pp. 680–687, 2011, DOI: 10.1007/978-3-642-23887-1_86.

Note: All the figures and tables in this chapter were designed by the author.

Open AI’s Large Language Model to


Improve Payroll and HR Processes 19

Lokesh Sai Kiran Vatsavai1


Information Technology, SRKR Engineering College, Bhimavaram, India
Srihari Varma Mantena2
Computer Science and Engineering, SRKR Engineering College, Bhimavaram, India

Abstract: Payroll entails keeping track of workers’ hours worked, computing their compensation, and transferring funds to
their bank accounts or direct deposit. One deep learning technique that can handle a wide range of natural language processing
(NLP) problems is the large language model (LLM). Transformer models are used in LLM training, where large employee
payrolls are used. As a result, they can now recognize, translate, project, or produce text or other content. The article discusses
the use of Open AI’s Large Language Model to enhance payroll and HR processes. In this paper, AI-powered chatbots and
virtual assistants, like ChatGPT, are revolutionizing HR by automating payroll processes, streamlining employee support, and
saving time and resources. OpenAI’s large language model (LLM) can generate human-like responses, revolutionizing payroll
calculations and HR support.
Keywords: Artificial intelligence, ChatGPT, Deep learning, Large language model, Natural language processing, Payroll and
HR

1. Introduction
Usually, accounting or human resources are in charge of a company's fixed financial responsibility, payroll. As a result of digital documentation and streamlined procedures, outsourcing is on the rise. When taken out of gross profits, it is a hefty charge for the company. By taking into account things like hours worked, overtime, commissions, and deductions, ChatGPT can automate payroll computations, saving HR experts time and reducing mistakes. With ChatGPT, an employee may inquire about their net income after taxes and deductions, and the virtual assistant can handle complicated calculations and rules, like whether they are eligible for overtime pay. Payroll automation is a breeze with its built-in natural language processing features. ChatGPT is a great tool for employees who have questions about HR, such as how to view their pay stubs, change their profile details, or request time off. Also, it has the ability to automate HR operations, which means HR experts will have less work to do. ChatGPT's easy-to-use interface for routine HR chores can improve employee satisfaction and free up HR specialists to focus on more difficult duties. By saving staff members' time and alleviating the burden on HR departments, it has the potential to reduce costs, streamline operations, and improve efficiency.
New hires can get answers to their queries regarding business policy and procedure using ChatGPT's onboarding service. Additionally, it offers training materials such as interactive lessons and tests. It aids workers in accomplishing new jobs in a timely and productive manner. ChatGPT can assess a worker's abilities, duties, and performance data to suggest courses that would be most beneficial. With ChatGPT, managers may more easily carry out performance reviews, provide constructive criticism, and recommend training
1
lokesh3069@gmail.com, 2mshv@srkrec.ac.in

DOI: 10.1201/9781003529231-19
Open AI’s Large Language Model to Improve Payroll and HR Processes 119

programmes tailored to individual employees’ requirements. training dataset is referred to as “model selection.” If there
It compiles a thorough report on an employee’s strengths, are features X and a target Y, the best transformation F can be
shortcomings, and improvement opportunities by analysing determined from the data by: Y = F(X). The word “optimal”
data such as job history, duties, and feedback. Managers are denotes the existence of a model performance metric, and the
able to better engage their employees and keep them around “optimal” model is the one that maximizes that statistic. It is
when they can provide them with constructive criticism important to consider a number of axes in order to improve
and chances to grow. Even with the most intricate payroll our model:
systems, ChatGPT can keep correct records and update 1. The model parameter space: Use statistical learning
payroll calculations. Nevertheless, HR professionals should to “train” a model and “optimize” this “space”. The
view technology as an adjunct to their work rather than a parameters are learned using an optimization approach,
substitute. Human resource experts are still vital in handling such as the maximum likelihood estimation principle.
complex HR tasks.
2. Proposed Work

The use of AI-powered chatbots and virtual assistants is rapidly expanding in the human resources sector. Businesses can save time and money with these tools, which automate essential payroll and HR tasks, accelerate employee support, and more. One promising tool in this sector is OpenAI's large language model, ChatGPT. Because it understands natural language and can give replies that sound human, ChatGPT is a practical way to automate HR tasks and provide support to employees.

Understanding Payroll Using Random Forest Classification: When it comes to employee payroll, Random Forest is a popular ensemble technique for both classification and regression problems. To process payroll, one must keep track of working hours, calculate compensation, and distribute monies. All employees' financial transactions, including payroll, taxes, bonuses, overtime, sick leave, vacation pay, and government taxes for programmes like Social Security, Medicare, and unemployment, must be recorded by organisations. Employees can input their hours worked using an API, and any company can use the payroll administration software. Large and medium-sized companies often use outside firms to handle their payroll processing so that they can save time and effort. Businesses deduct taxes from employees' gross wages, keep track of their working hours, report this data to payroll processors, and then pay their employees.

Fig. 19.1 Employee payroll using random forest classification
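As a hedged illustration of the Random Forest idea above (not code from the chapter), the following sketch trains a scikit-learn RandomForestClassifier on synthetic payroll features; the column meanings and the "needs manual review" label are invented for the example.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Hypothetical payroll features: hours worked, overtime, commission, deductions
X = np.column_stack([
    rng.normal(160, 10, n),    # monthly hours worked
    rng.exponential(5, n),     # overtime hours
    rng.exponential(200, n),   # commission
    rng.normal(50, 15, n),     # deductions
])
# Hypothetical binary label: does the payroll record need manual review?
y = (X[:, 1] + 0.01 * X[:, 2] > 10).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))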
Optimizing an Employee Payroll Model: The procedure of selecting the best employee payroll model for a specific training dataset is referred to as "model selection." If there are features X and a target Y, the best transformation F can be determined from the data by Y = F(X). The word "optimal" denotes the existence of a model performance metric, and the "optimal" model is the one that maximizes that statistic. It is important to consider a number of axes in order to improve our model (a small grid-search sketch follows the list):
1. The Model Parameter Space: Use statistical learning to "train" a model and "optimize" this space. The parameters are learned using an optimization approach, such as the maximum likelihood estimation principle.
2. The Model Paradigm Space: It is possible to employ a variety of supervised learning algorithms to address the same issue. Depending on the particular dataset, algorithms like Naive Bayes, XGBoost, or a neural network may perform significantly differently.
3. The Hyperparameters Space: Make these choices in order to set up the training run, even though statistical learning cannot optimize these parameters directly.
4. The Model Architecture Space: This applies mostly to neural networks. A set of hyperparameters can be used to describe the model architecture, but the search is typically more involved than with ordinary hyperparameters. The size of the search space can reach 10^40.
5. The Feature Space: The proper features must be chosen to feed the model. Depending on the features that can be used, different models will respond in different ways. Too many features can lead to overfitting; too few and the model might not fit at all.
6. The Feature Transformation Space: To enhance the performance of the model, take into account several transformations, such as the Box-Cox transformation and feature encoding.

Fig. 19.2 Optimizing employee payroll model: Different organization axes
Source: LinkedIn
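To make the hyperparameter axis concrete, the following hedged sketch runs a small grid search with scikit-learn's GridSearchCV; the estimator, the grid values, and the synthetic data are illustrative assumptions, not the chapter's experiment.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy data standing in for a payroll training set
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
# A small grid over the hyperparameter space (axis 3 above)
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print("Best hyperparameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))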

Large Language Model for Employee Payroll: Large language models (LLMs) are AI systems that use deep learning techniques and massive data sets to understand, synthesise, create, and predict new content. LLMs lay the groundwork for conversation and the development of new ideas. Language models like LLMs, which were among the first AI language models, find widespread use in NLP systems that allow users to input questions using natural language. Advancements in AI's language model idea have greatly increased the amount of data available for training and inference, hence enhancing the capabilities of AI models. Large language models can benefit from databases, yet these tools are not without their limitations. The payroll model embeds and indexes data into a vector database. The questions asked by users are transformed into embeddings, which are then used to find other embeddings that are comparable to them. However, the system may produce results that are not relevant to the employee's query. Details that are closely related should be presented in a concise and focused manner. To prevent irrelevant or watered-down results, break down the data into chunks of a few paragraphs each. Preventing erroneous query results can be achieved by limiting inquiry types.
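The following minimal sketch illustrates the embed-and-search flow described above; the toy embed() function merely hashes character trigrams and stands in for a real embedding model (such as an LLM embeddings API), and the three policy snippets are invented examples.

import numpy as np

def embed(text, dim=64):
    # Toy deterministic embedding: hash character trigrams into a unit vector
    v = np.zeros(dim)
    for i in range(len(text) - 2):
        v[hash(text[i:i + 3]) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

documents = [
    "Overtime is paid at 1.5x the hourly rate after 40 hours per week.",
    "Pay stubs can be viewed in the employee self-service portal.",
    "Leave requests must be approved by the reporting manager.",
]
index = np.vstack([embed(d) for d in documents])   # the 'vector database'
query = "How do I see my pay stub?"
scores = index @ embed(query)                      # cosine-similarity scores
print(documents[int(np.argmax(scores))])           # closest policy chunk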
Experimental Result: The two components of the employee payroll dataset are work experience and remuneration. The challenges at hand are finding the best-fit line and determining the connection between the two attributes. The relationship between the parameters will be displayed using a Python linear regression model.

Step-1: Pre-processing of Payroll Data: Import the three necessary libraries in order to load the dataset, produce graphs, and develop the linear regression model.

import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

The payroll dataset can be loaded with the following code: data_set = pd.read_csv('payroll_data.csv'). The variable explorer feature on the Spyder IDE screen allows us to run the code (Ctrl+Enter) and read the dataset.

The payroll dataset consists of salary and experience, and the following code is used to extract the independent and dependent variables:

p = data_set.iloc[:, :-1].values
q = data_set.iloc[:, 1].values

The code uses -1 for the last column and 1 for the second column: indexing starts from zero, so the last column is dropped from the feature matrix p while the second column becomes the target q. The code will generate a dataset consisting of p_train, p_test and q_train, q_test, as shown in Fig. 19.4. The extraction of the p and q variables is followed by a split into a training set and a test set, with 20 observations for training and 10 for testing.

# the dataset is divided into a training set and a test set
from sklearn.model_selection import train_test_split
p_train, p_test, q_train, q_test = train_test_split(p, q, test_size=1/3, random_state=0)

Our payroll dataset is now ready for linear regression; feature scaling will not be employed, because the Python libraries already handle these specific scenarios.

Step-2: Fitting the Training Set with the Linear Regression: Import the linear regression class from scikit-learn, create a regressor object, and fit the Simple Linear Regression model to the training dataset:

# Fitting the Simple Linear Regression model to the training dataset
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(p_train, q_train)
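For reference, a small snippet of the kind that produces the dataset description shown in Fig. 19.3 (assuming payroll_data.csv holds the experience and salary columns):

import pandas as pd
data_set = pd.read_csv('payroll_data.csv')   # same file as in Step-1
print(data_set.head())        # first rows (experience and salary columns)
print(data_set.describe())    # summary statistics for both columns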

Fig. 19.3 Dataset description using Python code

Fig. 19.4 Trained dataset description

In order to allow the model to learn correlations, the code passes p_train and q_train as the independent and dependent variables when fitting the Simple Linear Regression object to the training set using the fit() method.

Output: Out[7]: LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

Step-3: Forecast of the test set outcome: The model is ready to use experience and salary to forecast fresh observations. To evaluate the model's accuracy, a test dataset is given, which yields the prediction vectors q_pred and p_pred:

# Prediction of the test and training set results
q_pred = regressor.predict(p_test)
p_pred = regressor.predict(p_train)

The code will generate salary predictions for the training and test sets, stored in the variables q_pred and p_pred and visible in the variable explorer options.

Output: The values of p_test and q_pred can be used to compare outcomes, and the variables can be inspected using the IDE's variable explorer option.

The following code describes how to use the scatter() function in the pyplot library to display the training set results. The plot is made up of a title, a regression line, and an observation scatter plot. The employee pay is shown on the y-axis, while the years of experience are shown on the x-axis. The data is plotted on a graph using show() once the labels for the p and q axes are assigned.

mtp.scatter(p_train, q_train, color="green")
mtp.plot(p_train, p_pred, color="red")
mtp.title("Salary vs Experience (Training Dataset)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary (In Rupees)")
mtp.show()
Output: This shows the outcomes of the training set using the scatter() function of the pyplot package. The components of the plot include the title, the regression line, and the observation scatter plot. The x-axis displays the employee's years of experience, while the y-axis displays their salary. After the p and q axes are labelled, the data is plotted on a graph using show(). In the plot, green dots represent actual values and the red line represents predicted values; this shows how the two variables are related. The majority of results demonstrate a good fit between the data and the model.

Step-4: Displaying the test set outcomes: Using p_test and q_test rather than p_train and q_train, the payroll model's performance will be visualized on the test set, and the color of the regression line and observations will be altered.
# visualizing the outcomes of the test set
mtp.scatter(p_test, q_test, color="blue")
mtp.plot(p_train, p_pred, color="red")
mtp.title("Salary vs Experience (Test Dataset)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary (In Rupees)")
mtp.show()

When the code is run, the following results will be obtained:
Fig. 19.5 Execution results

The plot's blue observations and red regression line demonstrate how well the Linear Regression model predicts the future.

Fig. 19.6 Execution results

3. Conclusion

ChatGPT simplifies support processes, boosts employee engagement, and automates payroll and human resources. There will likely be additional ground-breaking developments in HR and payroll software as AI develops further.
Note: All the figures in this chapter were taken from https://www.linkedin.com/posts/damienbenveniste_machinelearning-datascience-artificialintelligence-activity-7112101911918522368-IgBl
20. A Novel Blockchain-Based Approach for Secure and Efficient Electronic Medical Record Sharing

Hussein EL Ghor1, Mohamed Daher2, Bilal Nakhal3
CyberVision Lab, Beirut Arab University, Beirut, Lebanon

Abstract: Sharing of Electronic Medical Records (EMRs) between doctors and medical institutions can now be done using blockchain, a disruptive approach to the exchange of EMRs. Blockchain can enhance the accuracy of medical decisions and improve public health significantly. However, there is a need to ensure that sensitive information is retrieved from the correct encrypted EMRs, and it is even more difficult to dynamically update the attributes of authorized users. To this end, we propose a secure data sharing approach that uses blockchain and encryption techniques to ensure secure, efficient, and patient-centric data sharing. We introduce a novel approach that integrates attribute-based encryption, searchable encryption, and robust access control mechanisms to update user attributes under strong security measures. Additionally, we discuss the hurdles encountered, the strengths of the approach, and potential areas for improvement. Our proof of consistency demonstrates the impact of adding a consortium blockchain on securing shared electronic medical records and the consequential improvements it yields for the healthcare industry.
Keywords: Access control, Attribute-based encryption, Blockchain, Dynamic user attributes, EHR data sharing, Searchable
encryption

1. Introduction

Traditional medical systems have been unable to keep up with the pace of contemporary convenient life. The emergence of electronic medical records has more effectively solved the problems of storage, query, data sharing, and medical errors in patient diagnosis information (Shahnaz A 2019). Electronic medical records give patients more comprehensive diagnostic information, allowing doctors to understand the patient's past conditions more quickly and accurately and give new diagnosis results.

The use of blockchain technology in the medical industry has been identified as a transformative approach to sharing electronic medical records. This breakthrough innovation holds tremendous potential to revolutionize the healthcare sector.

This cutting-edge technology has the potential to advance public medical services in remarkable ways. By deploying a secure, decentralized, and tamper-proof ledger for all medical records, this innovation is expected to escalate diagnostic accuracy significantly (Nakamoto 2008). However, the practical implementation of blockchain in healthcare faces significant challenges, particularly in ensuring secure, accurate retrieval of encrypted medical records and managing dynamic updates (Kuo 2017).

The primary challenge lies in the secure retrieval of encrypted medical records. Blockchain technology employs cryptographic techniques to secure data, but the retrieval of this encrypted data in a usable form is a complex process (Zyskind 2015). Attribute-based encryption (ABE) (Ka 2022) is useful in this situation. ABE has gained popularity in recent years as a sophisticated encryption technology that allows for fine-grained access control to encrypted data. It has been extensively utilized in searchable encryption on critical data and regulated sharing of medical records (Li H 2018), (Li H 2020).

1h.elghor@bau.edu.lb, 2m.daher@bau.edu.lb, 3b.nakhal@bau.edu.lb
DOI: 10.1201/9781003529231-20
The patient's individual identifiers, such as name, date of birth, or medical record number, could serve as unique identifiers in the context of EMRs. Patient privacy and data security are maintained by ABE to ensure that only authorized individuals can decrypt the medical records (Li 2015).

Additionally, a cryptographic method known as searchable encryption enables users to look for certain data without having to first decrypt it completely (Cash 2014). In EMRs, where healthcare practitioners may need to locate specific patient records in a large, encrypted dataset, this method is of utmost relevance. Blockchain technology can dramatically increase the security and speed of medical record retrieval by combining ABE and searchable encryption. Combining these methods guarantees authorized users' access, privacy, and confidentiality (Kamara 2014).

Another challenge for blockchain use in healthcare is how to manage dynamic updates to medical records. A strong access control system is needed to handle this dynamic nature, since patient attributes are continually changing. This mechanism should allow for the addition, modification, and deletion of user attributes without compromising the security of the blockchain (Xu 2018).

To address these challenges, this paper proposes an innovative attribute model that allows for the dynamic updating of user attributes while maintaining the integrity of the blockchain. It ensures that only authorized users can update their attributes and that these updates do not affect the security of the encrypted medical records.

Furthermore, this paper performs theoretical evaluations of both security and performance aspects to showcase the durability and reliability of the proposed blockchain system for EMRs. As a result, we tackle the challenges related to retrieving encrypted health records securely and accurately while incorporating dynamic updates, providing a practical way to implement blockchain technology in the medical sector.

The rest of the paper is organized as follows: Section 2 presents the most relevant related works. The system model for the proposed solution is stated in Section 3. Section 4 presents the consortium mechanism, and Section 5 proposes the EMR storage and sharing scheme based on a consortium blockchain. The conclusion and future work are then presented in Section 6.

2. Related Works

In recent years, EMR data sharing has become a hot spot in the field of public health and smart healthcare. (LIU Gechang 2019) proposed a data privacy protection mechanism based on searchable encryption. The system is applied to a personal medical data blockchain, which makes private data search more convenient. In (Capece 2020), the authors examined the blockchain's potential applications in healthcare, targeting electronic medical records in particular. They go over how blockchain improves data security, integrity, and access management.

The authors in (Han Y 2022) discussed the potential of blockchain in addressing the interoperability and privacy issues of electronic medical records (EMRs). They highlight the need to overcome challenges related to data access and sharing in blockchain-based EMR schemes. They also discussed ongoing challenges in data management efficiency, fairness of access, and trust in the systems.

Reegu et al. (Reegu 2023) propose a blockchain-based framework named Ancile for secure interoperability and organized access to medical records. The authors emphasized the potential of blockchain in revolutionizing the exchange and processing of EMRs. Yan et al. (Yan 2023) focused on attribute-based searchable encryption and blockchain access control in a cloud environment. The authors suggest a fine-grained access control system that integrates IPFS (the InterPlanetary File System), attribute-based encryption, and blockchain technology.

These papers provide insights into the use of attribute-based encryption and searchable encryption in the context of blockchain technology for electronic medical records and data sharing in healthcare. They discuss the challenges, opportunities, and potential improvements in this field.

3. System Model

We suppose that a consortium chain is formed by multiple hospitals, where each hospital has a local server and several clients that are operated by doctors. Each hospital builds its own private chain, while the multiple private chains build a consortium chain.

Before entering the system, patients, doctors, and data users need to register and generate their own public-private key pairs. The patient's electronic medical record ciphertext is stored on the hospital server; the hash value and keyword index of the electronic medical record ciphertext are stored on the hospital private chain; and the security index, composed of the private chain block identification, patient pseudo-identity, and keyword index, is stored on the consortium chain. The system mainly includes six entities: patients, doctors, data users, hospital servers, private chains, and consortium chains (see Fig. 20.1).
Fig. 20.1 Electronic medical record system model
Source: Designed by the author

• Patient: When a patient is admitted to the hospital, they begin by registering on the hospital's server. Once the registration process is complete, the hospital server assigns a unique number plate to the patient, serving as their medical card. It is crucial for the patient to keep this number plate confidential and present it during consultations. Doctors, on the other hand, generate electronic medical records and associated keywords for each patient. These records are then encrypted using the patient's public key. In the event that the patient visits another hospital and a doctor needs access to their medical history, a search trapdoor is generated by the patient and uploaded to the consortium chain.
• Doctor: In the hospital, there is a local server along with multiple clients, which are operated by doctors. When a doctor sees a patient, they create an identity, electronic medical record ciphertext, keyword ciphertext, and evidence for the patient. The doctor then uploads the electronic medical record ciphertext to the hospital server. Simultaneously, the doctor uploads the hash value of the encrypted medical record and the keyword index, which consists of the encrypted keyword, to the private chain. To ensure the integrity of the data, the doctor generates a new transaction and broadcasts it to the network. The other nodes on the private chain take on the responsibility of validating the transaction. Once the verification process is successfully completed, a new block is added to the private chain, further securing the information.
• Data Users: When external institutions or individuals, referred to as data users, need access to patient data, they must obtain authorization from the patients themselves. To initiate this process, the patient generates a search trapdoor and uploads it to the consortium chain. The nodes on the consortium chain perform the search, and upon finding the corresponding patient ciphertext, the relevant node on the consortium chain takes appropriate action based on the search results.
• Hospital Server: As the doctor attends to the patient and creates the electronic medical record, the hospital server retrieves the block identity from the private chain, along with the patient's identity and keyword index. Utilizing this information, the hospital server constructs a new transaction on the consortium chain. The remaining nodes within the consortium chain undertake the crucial task of validating the transaction. Once the verification process is successfully completed, a fresh block is generated on the consortium chain, securely incorporating the updated information.
• Private Chain: To initiate new transactions, the doctor uploads the hash value of the electronic medical record ciphertext and the keyword index, consisting of the keyword ciphertext, to the private chain. Upon receiving a transaction created by the doctor, the node on the private chain diligently verifies its authenticity. Subsequently, the hospital server leverages the private chain's block identity, patient identity, and keyword index to construct a new transaction on the consortium chain. During the data acquisition phase, if the search proves to be successful, the node on the consortium chain extracts the secure index from the block and obtains the private chain's block identity. By utilizing the private chain block identification, the nodes on the consortium chain gain access to the hash value of the medical record ciphertext, enabling them to retrieve the necessary information securely.
• Consortium Chain: As the search process unfolds, the nodes on the consortium chain receive the trapdoor transmitted by the patient and execute the search algorithm. After a successful search, the data user node obtains the security index and private blockchain ID from the consortium blockchain. It then retrieves the hash value of the encrypted medical record and sends it back to the hospital server. The hospital server then compares this hash value with the one associated with the electronic medical record ciphertext to ensure their consistency. If there is a match, the data user node sends the medical record ciphertext back to the patient. The consortium chain node plays an intermediary role when a third-party data user requests access to a patient's electronic medical record. In order to maintain privacy and safeguard the data, the node creates an agent re-encryption key and performs proxy re-encryption on the existing electronic medical record ciphertext. The newly formed re-encrypted ciphertext is then transmitted to the third-party user through a secure channel.

4. Consortium Mechanism

Based on searchable encryption, the consortium mechanism utilizes a security index, consisting of keyword indexes, that is held on the consortium chain. Whenever electronic medical record data is needed by either a patient or a data user, they use their private key to generate a search trapdoor. The search trapdoor is then sent to the consortium chain, where the nodes perform the search.

4.1 Proof of Consistency of Private Chain

The method suggested for secure sharing of data involves the utilization of encryption techniques and blockchain technology. A crucial aspect of this approach is the implementation of a private blockchain that is customized for each hospital and plays a major role in maintaining the validity and coherence of each participant's blockchain copy. The private blockchain functions as a storage mechanism for encrypted medical records, hash values, and keyword indexes. Consistency in the private blockchain can be preserved by following a set of sequential procedures:
1. The private chain relies on a structure that is "append only." This means that once a block is appended to the chain, it is irrevocable and cannot be deleted or modified.
2. To validate and confirm transactions and blocks added to the chain, the private chain can employ a consensus mechanism, like Proof of Work (PoW) or Proof of Stake (PoS) algorithms. These mechanisms ensure that all participants in the chain agree on the validity of the data being added, thereby maintaining consistency.
3. In order to securely store records, their ciphertext, hash value, and keyword index are stored in a protected manner within the private chain. Encryption techniques are used to safeguard the confidentiality and integrity of this data, ensuring that only authorized entities can access and modify it.

4.2 Proof of Consistency of Consortium Chain

In the proposed data-sharing algorithm, the consortium blockchain and the encryption techniques act as the gatekeepers: the consortium chain ensures the consistency and validity of the keyword indexes and search trapdoors. The secure index, which consists of keyword indexes, is kept in the consortium chain.

To achieve proof of consistency on the consortium chain, the system constructs a polynomial f(x) using hash functions. The polynomial f(x) is designed to represent the set of keywords W = {w1, w2, ⋯, wn}, which contains a description of all the symptoms that the patient is likely to have. The coefficients of the polynomial are derived from the hash values of the keywords.

The steps listed below can be used to show the proof of consistency:
1. Construction of polynomials: Using the hash values of the keywords, H(wi), the system creates a polynomial f satisfying f(H(wi)) = 0 for 1 ≤ i ≤ n. Every keyword is hashed, and the resulting hash value becomes a root of the polynomial, so the polynomial is built in a manner that allows it to represent the set of keywords. The polynomial can be represented as:
f(x) = (x − H(w1))(x − H(w2)) ⋯ (x − H(wn))   (1)
The polynomial satisfies f(H(wi)) = 0 for all i. f(x) can also be expressed as:
f(x) = a0 + a1·x + a2·x^2 + ⋯ + an·x^n   (2)
where H(w) is the hash value of keyword w and a0, a1, ⋯, an are coefficients derived from the hash values of the keywords.
2. Create a search trapdoor utilizing the private key whenever a patient or data user needs to access information from an electronic medical record. By changing the value of x in the polynomial f(x) to H(r), a value known as the search trapdoor is obtained. The consortium chain will process this search trapdoor further after receiving it.
3. Consistency verification: The nodes on the consortium chain perform calculations using the search trapdoor and the polynomial f(x). By substituting the search trapdoor value into the polynomial, the nodes can verify whether the resulting value matches the hash value of any keyword in the secure index. If a match is found, it indicates that the patient's search trapdoor corresponds to a keyword in the secure index, ensuring the consistency of the consortium chain.

Suppose there is a vector b = [b0, b1, ⋯, bn]; the system can then introduce a new polynomial g(x) such that:
g(x) = b0 + b1·x + b2·x^2 + ⋯ + bn·x^n   (3)
If the vector b is set so that b0 = 0 and bi = −ai/a0 for 1 ≤ i ≤ n, then g(H(wi)) = 1: since f(H(wi)) = 0 gives a1·H(wi) + ⋯ + an·H(wi)^n = −a0, dividing both sides by −a0 yields exactly g(H(wi)) = 1. Consider also that h = [H(wi), (H(wi))^2, ⋯, (H(wi))^n] represents the vector of powers of a keyword's hash value; then it is easy to verify that b × h = 1. If the keywords used in the data encryption process belong to the keyword set W = {w1, w2, ⋯, wn}, then the equation b × h = 1 holds.
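A small Python sketch of this construction (an illustration, not the paper's implementation): SHA-256 stands in for the hash function H, a small prime q bounds the toy arithmetic in Z_q, and the keyword set is invented. It builds f(x) with the hashed keywords as roots and checks that f(H(w)) = 0 exactly for enrolled keywords.

import hashlib

q = 2**31 - 1   # a small prime modulus (toy choice)

def H(word):
    # SHA-256 stands in for the scheme's hash function, reduced mod q
    return int.from_bytes(hashlib.sha256(word.encode()).digest(), "big") % q

def poly_from_roots(roots):
    # Expand prod(x - r) into coefficients [a0, a1, ..., an] mod q
    coeffs = [1]
    for r in roots:
        new = [0] * (len(coeffs) + 1)
        for j, c in enumerate(coeffs):
            new[j + 1] = (new[j + 1] + c) % q   # the c * x^(j+1) term
            new[j] = (new[j] - r * c) % q       # the -r * c * x^j term
        coeffs = new
    return coeffs

def evaluate(coeffs, x):
    # Horner's rule, mod q
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % q
    return acc

keywords = ["fever", "cough", "headache"]             # invented keyword set W
f = poly_from_roots([H(w) for w in keywords])
print(all(evaluate(f, H(w)) == 0 for w in keywords))  # True: members are roots
print(evaluate(f, H("fracture")) != 0)                # True: non-member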
5. EHR Storage and Sharing Scheme Based on Consortium Blockchain

The EHR storage and sharing scheme based on a consortium blockchain can be divided into three phases: system establishment, data encryption and storage, and data search and decryption.

5.1 System Establishment
This phase consists of two steps: initialization and key generation.

1. Initialization: In this step, the system parameters are generated. The input to this step is a security parameter λ, and the output is the system parameters PP and the master key Msk. The Initialization algorithm in this scheme is described in Algorithm 1 as follows:

Algorithm 1: Initialization Step Algorithm
1: Input: Security parameter λ
2: Output: System parameters PP, Master key Msk
3: Choose two cyclic groups G1 and G2 of prime order q, and a bilinear map e: G1 × G1 → G2.
4: Choose a generator g of G1.
5: Choose random exponents α, β ∈ Zq*.
6: Compute g1 = g^α and g2 = g^β.
7: Choose two hash functions H1: {0, 1}* → G1 and H2: {0, 1}* → {0, 1}^n.
8: Set PP = {G1, G2, q, g, e, H1, H2} and Msk = {α, β}.

2. KeyGen algorithm: The KeyGen algorithm in this scheme is described in Algorithm 2 as follows. In this algorithm, the input is the system parameters PP, the master key Msk, the patient attribute a, and the doctor attribute d. The output is the secret key Sk.

Algorithm 2: KeyGen Algorithm
1: Input: System parameters PP, Master key Msk, Patient attribute a, Doctor attribute d
2: Output: Secret key Sk
3: Choose a random exponent r ∈ Zq*.
4: Compute h = g1^(αd) · g2^r.
5: Compute K = H2(a || d || h).
6: Compute Sk = (h, K).
7: Output Sk.

Algorithm 2 starts by choosing a random exponent r ∈ Zq*. It then computes h = g1^(αd) · g2^r, where g1 and g2 are the generators of G1 and G2 respectively, and αd is the master key component corresponding to the doctor attribute d. The algorithm then computes K = H2(a || d || h), where H2 is the hash function defined in Algorithm 1 and || denotes concatenation. Finally, the algorithm outputs Sk = (h, K).

3. Encryption: To encrypt the files based on the specified access structure, the patient executes the Encrypt algorithm with the system parameters PP, the master key Msk, the file set F = {f1, f2, ⋯, fm}, the keyword set W = {w1, w2, ⋯, wn}, and a random value σ as inputs. The algorithm outputs a tuple (sig, C, Ĉ, I, ∅j) that contains the signature sig, the ciphertext C, the indexed ciphertext Ĉ, the access structure I, and the policy parameters ∅j for each file fj.

Specifically, the Encrypt algorithm takes the system parameters PP, the master key Msk, the file set F, the keyword set W, and a random value σ as inputs. It generates a digital signature sig for the file set F, and a keyword index I(fj) and an encrypted file key E(fj) for each file fj using the CP-ABE algorithm with the access structure as the policy. It combines the encrypted file key E(fj), the keyword index I(fj), and the signature sig(F) to form the ciphertext C(fj). The indexed ciphertext Ĉ(fj) is then encrypted using the symmetric key ksym. The access structure I and the policy parameters ∅j are then generated for the CP-ABE algorithm. The Encrypt algorithm outputs the tuple (sig(fj), C(fj), Ĉ(fj), I(fj), ∅j) for each file fj. The steps to encrypt a file fj are as follows (see Algorithm 3):
(a) For each file fj in the file set F, the data owner (DO), i.e. the patient, generates a keyword index using the AES algorithm. The keyword index is a binary string that represents the presence or absence of each keyword in the file. The keyword index is denoted as I(fj).
(b) DO generates a digital signature sig for the file set F using the master key Msk. The signature is denoted as sig(F).
(c) For each file fj in the file set F, DO executes the CP-ABE algorithm with the access structure as the policy to encrypt the file key. The access structure is a Boolean formula that specifies the attributes required to decrypt the file. The encrypted file key is denoted as E(fj).
(d) DO combines E(fj), I(fj), and the signature sig(F) to form the ciphertext C(fj). The ciphertext C(fj), a binary string, not only represents the encrypted file key but also encompasses the keyword index and the file signature. The ciphertext C(fj) is computed as:
C(fj) = (E(fj), I(fj), sig(F))   (4)
(e) DO uses a symmetric key, denoted ksym, to encrypt the indexed ciphertext Ĉ(fj). This ciphertext, represented as a binary string, comprises the encrypted file key and the file's signature, both of which are organized and indexed by keywords:
Ĉ(fj) = (E(fj), sig(F)) ⊕ ksym(I(fj))   (5)
(f) DO generates the access structure I and the policy parameters ∅j for the CP-ABE algorithm. The access structure I is a Boolean formula that specifies the attributes required to decrypt the files. ∅j includes the threshold value and the coefficients of the polynomial that defines the access policy.
(g) DO outputs the tuple (sig, C, Ĉ, I, ∅j) for each file fj.

The Encrypt algorithm ensures that only authorized users with the correct attributes can decrypt and access the files based on the specified access structure. To decrypt a file, a user must have the attributes that satisfy the access structure I and the policy parameters ∅j. The user can use the CP-ABE algorithm to decrypt the encrypted file key E(fj) using their attributes. They can then use the decrypted file key to decrypt the file using the symmetric key ksym.

Algorithm 3: Encrypt Algorithm
1: Input: System parameters PP, the master key Msk, the file set F, the keyword set W, and a random value σ.
2: Output: (sig, C, Ĉ, I, ∅j)
3: Generate digital signature sig for file set F using the master key Msk.
4: for each file fj do
5: Generate keyword index I(fj).
6: Encrypt file key E(fj) using the CP-ABE algorithm with the access structure as policy.
7: Combine E(fj), I(fj), and sig(F) to form ciphertext C(fj).
8: end for

4. Trapdoor Generator: The trapdoor generator is a component of the searchable encryption scheme that allows authorized users to generate trapdoors (also called search tokens) for specific keywords, while preventing unauthorized users from doing so. The trapdoor generator is based on the computational Diffie-Hellman (CDH) problem and hash function collision resistance. In our approach, access control is enforced using attribute-based encryption (ABE) and a master key Msk. The master key is used to generate the private keys for the different entities in the system, such as patients, doctors, and data users. Each entity is associated with a set of attributes, and access to the health records is granted based on the attributes of the entity and the attributes associated with the health records.

Assuming that the patient has the attribute a and wants to search for health records containing the keyword w, the patient generates a random value r ∈ Zq*, where q is a large prime number, and calculates the trapdoors as follows:
(a) The trapdoor algorithm begins with the user selecting a keyword w to search for in the electronic health records.
(b) The authorized user generates a random value r and computes the hash value of the keyword w using a secure hash function. The hash value is denoted as H2(w).
(c) Compute the Access Structure (AS) based on the access control policy AC and the patient attributes a: AS = ComputeAccessStructure(AC, a)
(d) Generate the patient's private key (Pra) and public key (Pba) using the key generation function: (Pra, Pba) = KeyGen(Sk, Msk, a)
(e) Compute the temporary value T1 using the Diffie-Hellman problem and the public key Pba with the generator g:
T1 = g^(Pba)   (6)
(f) Compute the hash value H of the concatenation of AS, r, and T1 using the hash function H:
H = H(AS || r || T1)   (7)
(g) Compute the second part of the trapdoor, T2, using the hash function H2, the random value r, the patient's private key Pra, and the generator g:
T2 = H2(r, Pra)^(1/Pra) × (Pra)^r × g^w   (8)
(h) Compute the final trapdoor by raising T2 to the power of H:
T2 = T2^H   (9)
(i) Encrypt the trapdoor (T1, T2) using the Encrypt function.

In attribute-based encryption (ABE) schemes, the construction of the Access Structure (AS) ensures that only authorized users with the necessary attributes can access sensitive information. It chooses the characteristics that are necessary to decrypt the encrypted data. AS is calculated based on the access control policy AC and the patient attributes a.

In order to calculate AS, the access control policy is assessed against the user attributes. Logical operators like AND, OR, and NOT can be used to carry out this evaluation. The following equation is used to calculate AS:
AS = Evaluate(AC, a)   (10)

Now, let's break down the equation and explain each part:
• The Evaluate(AC, a) function takes as input the user attributes (a) and the access control policy (AC). In order to combine and compare the attributes, the evaluation function may utilize logical operations like AND, OR, and NOT. The output of this function is a boolean value indicating whether the access control policy is satisfied based on the user attributes.
• The Access Structure, AS, is the result of the evaluation of the access control policy. AS is regarded as true (met) if the user attributes satisfy the access control policy, and false if they do not.

Algorithm 4 describes the Evaluate(AC, a) function:
1. Initialization: The function takes the patient attributes a and the access control policy AC as inputs. The result's initial value is set to True.
2. Loop over conditions: Each condition c in the access control policy AC is iterated over by the function. Conditions include a single attribute, a conjunction of attributes (AND), a disjunction of attributes (OR), and a negation of an attribute (NOT).
3. Check single attribute: The Evaluate function determines whether the patient attributes satisfy the single attribute in c. The Result is set to False and the loop is ended if a does not fulfil the attribute.
4. Check conjunction of attributes: If a satisfies all of the attributes in c, the Result remains True; otherwise the Result is set to False and the loop is ended.
5. Check disjunction of attributes: If a fulfils at least one attribute in c, the Result remains True; otherwise, the Result is set to False and the loop is terminated.
6. Check negation of attribute: If a satisfies the attribute in c, the Result is set to False and the loop is terminated.
7. Return Result: The function returns the Result, which is True if the access control policy is satisfied by the patient attributes, and False otherwise.

Algorithm 4: Evaluate(AC, a) function
1: Input: Access control policy AC, the patient attributes a
2: Output: True if AC is satisfied by the patient attributes a, and False otherwise
3: Result = True
4: for each condition c in AC do
5:   if c is a single attribute then
6:     if a does not satisfy c then
7:       Result = False
8:       Break
9:     end if
10:  else if c = AND(...) then
11:    if a does not satisfy all attributes in c then
12:      Result = False
13:      Break
14:    end if
15:  else if c = OR(...) then
16:    if a does not satisfy at least one attribute in c then
17:      Result = False
18:      Break
19:    end if
20:  else if c = NOT(...) then
21:    if a satisfies the attribute in c then
22:      Result = False
23:      Break
24:    end if
25:  end if
26: end for
27: return Result

Algorithm 5 describes the function ComputeAccessStructure(AC, a):

Algorithm 5: ComputeAccessStructure(AC, a) function
1: AS = False
2: if AC is empty then
3:   return AS
4: end if
5: if a is empty then
6:   return AS
7: end if
8: AS = Evaluate(AC, a)
9: return AS
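A minimal Python rendering of Algorithms 4 and 5 (an illustrative sketch; the encoding of policies as ('ATTR' | 'AND' | 'OR' | 'NOT', ...) tuples and the attribute names are assumptions, not the paper's data structures):

def evaluate(policy, attrs):
    # Algorithm 4: every condition in the policy must be satisfied
    for kind, arg in policy:
        if kind == "ATTR" and arg not in attrs:
            return False    # single attribute missing
        if kind == "AND" and not all(x in attrs for x in arg):
            return False    # conjunction not satisfied
        if kind == "OR" and not any(x in attrs for x in arg):
            return False    # disjunction not satisfied
        if kind == "NOT" and arg in attrs:
            return False    # negated attribute present
    return True

def compute_access_structure(policy, attrs):
    # Algorithm 5: empty policy or empty attribute set yields False
    if not policy or not attrs:
        return False
    return evaluate(policy, attrs)

policy = [("ATTR", "doctor"), ("OR", ["cardiology", "oncology"]),
          ("NOT", "external")]
print(compute_access_structure(policy, {"doctor", "cardiology"}))  # True
print(compute_access_structure(policy, {"doctor", "external"}))    # False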
5.2 Data Generation and Storage in EMR (DGS-EMR)

Data Generation and Storage in EMR (DGS-EMR) begins with the patient generating a secret key Sk using the KeyGen function. The patient then registers and uploads the secret key to the consortium blockchain. When the patient requires access to their electronic medical records (EMRs), they make a request by providing their patient ID to the hospital. The hospital then confirms the patient's identity before retrieving the electronic medical records from the private blockchain. The EMRs and the generated search trapdoors are encrypted using the Encrypt function (Algorithm 3), the access structure is computed using the ComputeAccessStructure function (Algorithm 5), and the access control policy is evaluated using the Evaluate function (Algorithm 4). Finally, the encrypted EMRs and search trapdoors are stored on IPFS.

The patient goes through a two-step process to make sure that his or her EMRs are shared securely with authorized users. The search trapdoor is first generated and uploaded to the consortium chain. By using this trapdoor on the same consortium chain, the authorized user can search for and retrieve the desired EMRs. The private blockchain runs on an "append-only" structure, which ensures the integrity and consistency of the recorded data, since once a block is added to the chain it cannot be changed or removed. Consensus mechanisms like Proof of Work (PoW) or Proof of Stake (PoS) confirm and validate the transactions and blocks added to the blockchain. These procedures make sure that all participants in the private blockchain agree that the newly added data is accurate, hence maintaining consistency.

The private chain secures the storage of the ciphertext, hash value, and keyword index of the records. To ensure confidentiality and integrity, encryption techniques are used, allowing only authorized entities to access and modify the data as needed. Hence, the DGS-EMR blockchain-based method offers a secure and efficient way to share electronic medical records while maintaining patient privacy and guaranteeing data integrity.

The steps related to the DGS-EMR are as follows:
1. Use the system initialization algorithm, which takes the security parameter λ, to generate the system parameters PP and the master key Msk.
2. The patient generates a secret key Sk by using the KeyGen function and stores the key on the consortium blockchain.
3. When the patient requires access to their EMRs, they make a request to the hospital by providing their patient ID.
4. The hospital confirms the patient's identity and retrieves the patient's EMRs from the private blockchain.
5. The EMRs and the generated search trapdoors are encrypted using the Encrypt function with the CP-ABE algorithm.
6. The access structure is computed using the ComputeAccessStructure function, and the access control policy is evaluated using the Evaluate function with the patient's attributes a and the access control policy AC.
7. The encrypted EMRs and search trapdoors are stored on IPFS.
8. The data user node on the consortium blockchain receives the trapdoor transmitted by the patient and executes the search algorithm.
9. After a successful search, the data user node obtains the security index and private blockchain ID from the consortium blockchain.
10. The data user node retrieves the hash value of the encrypted medical record from the private blockchain using the private chain block identification.
11. The data user node sends the hash value of the encrypted medical record back to the hospital server for verification.
12. The hospital server compares the hash value received from the data user node with the one associated with the electronic medical record ciphertext to ensure their consistency.
13. If there is a match, the hospital server sends the encrypted medical record ciphertext back to the data user node.
14. In cases where a third-party data user needs access to a patient's electronic medical record, the consortium chain node acts as an intermediary.
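Steps 10-12 above reduce to recomputing and comparing a ciphertext hash. A toy sketch of that check (assuming the third-party cryptography package's Fernet cipher as a stand-in for the scheme's encryption layer; the record text is invented):

import hashlib
from cryptography.fernet import Fernet

key = Fernet.generate_key()
ciphertext = Fernet(key).encrypt(b"EMR: invented example record")
# On-chain bookkeeping: only the hash of the ciphertext is stored
onchain_hash = hashlib.sha256(ciphertext).hexdigest()
# The data user recomputes the hash of what it retrieved (step 11) and the
# hospital server compares it with the on-chain value (step 12)
retrieved = ciphertext
assert hashlib.sha256(retrieved).hexdigest() == onchain_hash
print("Hash match: the retrieved ciphertext is consistent")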

Theorem 1. The DGS-EMR blockchain-based method using attribute-based encryption, searchable encryption, and robust access control mechanisms provides a secure and efficient way to share medical records while ensuring privacy, confidentiality, and integrity.
Proof. To be proved.

6. Conclusion

The presented study introduces a distinctive technique for attribute-based encryption (ABE) and searchable encryption (SE) that enhances the security and efficiency of exchanging electronic medical records (EMRs) using blockchain technology. Our proposed methodology successfully tackles the issue of accessing encrypted medical records securely and accurately, incorporating dynamic updates. We outline a system model for a consortium chain created by several hospitals, each with its own local server and multiple clients employed by doctors. Furthermore, we put forward an innovative attribute-based model which permits user attributes to be updated dynamically while maintaining blockchain integrity.

We conducted a theoretical evaluation to review the security as well as the performance aspects of our suggested solution. Our outcomes support the robustness and dependability of our approach, ensuring authorized users' access alongside privacy and confidentiality.

As future work, we aim to investigate the scalability of our approach and evaluate its performance in real-world scenarios. We plan to conduct extensive experiments on a large scale to demonstrate the feasibility of our solution in a practical setting.

References
1. Capece, Guendalina, and Francesco Lorenzi. 2020. "Blockchain and Healthcare: Opportunities and Prospects for the EHR." Sustainability 12(22): 9693.
2. Cash, D., Jaeger, J., Jarecki, S., Jutla, C., Krawczyk, H., Rosu, M. C., & Steiner, M. 2014. "Dynamic searchable encryption in very-large databases: Data structures and implementation." In Proceedings of the Network and Distributed System Security Symposium.
3. Han Y, Zhang Y, Vermund SH. 2022. "Blockchain Technology for Electronic Health Records." Int J Environ Res Public Health 19(23).
4. Ka, Ahmad Khoureich. 2022. "Easy-ABE: An Easy Ciphertext-Policy Attribute-Based Encryption." International Conference on Information Technology and Communications Security. Switzerland: Springer.
5. Kamara, S., & Lauter, K. 2014. "Cryptographic cloud storage." In Financial Cryptography and Data Security, 136–149.
6. Kuo, T. T., Kim, H. E., & Ohno-Machado, L. 2017. "Blockchain distributed ledger technologies for biomedical and health care applications." Journal of the American Medical Informatics Association 24(6): 1211–1220.
7. Li H, Liu D, Dai Y, et al. 2018. "Personalized search over encrypted data with efficient and secure updates in mobile clouds." IEEE Transactions on Emerging Topics in Computing 6(1): 97–109.
8. Li H, Yang Y, Dai Y, et al. 2020. "Achieving secure and efficient dynamic searchable symmetric encryption over medical cloud data." IEEE Transactions on Cloud Computing 8(2): 484–494.
9. Li, J., Huang, Q., Chen, X., Chow, S. S., Wong, D. S., & Yiu, S. M. 2015. "Multi-authority ciphertext-policy attribute-based encryption with accountability." In Proceedings of the ACM Symposium on Information, Computer and Communications Security, 386–397.
10. LIU Gechang, LI Qiang. 2019. "Blockchain data privacy protection mechanism based on searchable encryption." Journal of Computer Applications 39(S2): 140–146.
11. Nakamoto, S. 2008. Bitcoin: A Peer-to-Peer Electronic Cash System.
12. Reegu, Faheem Ahmad, Hafiza Abas, Yonis Gulzar, Qin Xin, Ali A. Alwan, Abdoh Jabbari, Rahul Ganpatrao Sonkamble, and Rudzidatul Akmam Dziyauddin. 2023. "Blockchain-Based Framework for Interoperable Electronic Health Records for an Improved Healthcare System." Sustainability 15(8).
13. Shahnaz A, Usman Q, Ayesha K. 2019. "Using blockchain for electronic health records." IEEE Access 7: 147782–147795.
14. Xu, R., Chen, Y., Blasch, E., & Chen, G. 2018. "BlendCAC: A smart contract enabled decentralized capability-based access control mechanism for the IoT." Computers.
15. Reddy Navya, Ramisetty Upendra. 2023. "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms." Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3.
16. Yan, L., Ge, L., Wang, Z., et al. 2023. "Access control scheme based on blockchain and attribute-based searchable encryption in cloud environment." J Cloud Comp 12(61).
17. Zyskind, G., Nathan, O., & Pentland, A. S. 2015. "Decentralizing privacy: Using blockchain to protect personal data." In Proceedings of the IEEE Security and Privacy Workshops, 180–184.

21. A Classifying Gender Crimes with AdaBoost and Back Propagation Algorithms

Dileep Kumar Kadali1
Research Scholar, Dept. of CSE, GIET University, Gunupur, Odisha
R. N. V. Jagan Mohan2
Associate Professor, Sagi Rama Krishnam Raju Engineering College, Bhimavaram
M. Chandra Naik3
Professor, Dept. of CSE, GIET University, Gunupur, Odisha

Abstract: In today's culture, images and videos are crucial for effective work and security surveillance. By studying CCTV data, a prediction algorithm may ascertain a person's age, gender, location, and sexual orientation. This technology can make the world safer by enabling the identification of runaways. The technology, integrated into security cameras within a mile, can screen suspects, such as a fugitive who stole millions from a typical bank. The paper explores the development of technologies that can determine a person's age, face image, gender, and location through cameras, pictures, or videos using deep learning techniques. AdaBoost is a machine learning algorithm for binary classification tasks, combining weak classifier predictions to create a strong classifier that performs well on the given data. The research focuses on data parallelism and model parallelism as two methods for distributing backpropagation computation among GPUs or nodes using Distributed Back Propagation. The study compares AdaBoost and Back Propagation in gender crime classification and investigates technologies like t-SNE, PCA, and ICA in daily life and their potential applications in criminal face identification, emphasizing the need for precise experimental results.
Keywords: AdaBoost Classification, Crime Classification, Distributed Back Propagation, t-SNE, PCA, ICA, etc.

1. Introduction

Artificial Intelligence (AI) has gained traction in capturing complex patterns and dependencies in historical data, particularly in robotics automation, optical character recognition, handwriting recognition, and face identification (Saravanan, 2021 [12]). Image predictive analytics uses image data analysis, computer and machine vision, AI, and statistical models to predict future outcomes (Kim, 2019 [10]). Computer vision focuses on capturing, processing, analyzing, and understanding digital pictures for decision-making, object recognition, video tracking, and picture restoration. Advanced systems use machine-learning and deep-learning algorithms to extract features from query images, which are compared against database criminal images (Forradellas, 2020 [6]). Pre-trained models are preferred by users because they reduce feature extraction time. Bandekar, 2020 [3] describes the development of technologies that can determine a person's age, gender, location, and sexual orientation from CCTV data using deep learning techniques. According to Gao, 2019 [7], India registered 60,96,310 crimes in 2021, comprising 36,63,360 crimes under the Indian Penal Code (IPC) and 24,32,950 crimes under Special and Local Laws (SLL). The crime rate per 100,000 people dropped from 487.8 to 445.9, a yearly decline of 7.65% from 2020; however, it was still much higher than in 2019. Human body crimes

accounted for 30% of the total, followed by property crimes (20.8%) and other IPC offences (29.7%). Kidnapping had the highest rate at 7.4 per 100,000, followed by rape at 4.8 per 100,000 and murder at 2.1 per 100,000. The UN reported a homicide rate of 2.95 per 100,000 in 2020, down from 5.46 per 100,000 in 1992 (Jha, 2019 [8]). The rate of investigation for IPC crimes was 64.9% in 2021, with a charge-sheet rate of 72.3% and a conviction rate of 57.0%. This study compares AdaBoost and Back Propagation in gender crime classification and investigates technologies like t-SNE, PCA, and ICA in daily life and their potential applications in criminal face identification, emphasizing the need for precise experimental results.

The NCRB's report compared crime rates between 1953 and 2006 in India. The data indicated a drop of 79.84% in burglaries, a rise of 7.39% in murders, a 47.80% increase in kidnappings, a 28.85% decline in robberies, and a 10.58% decline in riots. There were 5,102,460 cognizable crimes in 2006, up 1.5% from 2005. These included 1,878,293 offences under the Indian Penal Code (IPC) and 3,224,167 crimes under the Special and Local Laws (SLL). Delhi saw the largest increase in crime in 2019 out of all Indian states, going from 1342.5 to 1586.1. Northeast India had the lowest crime rates in 2018, with four of its five states among the lowest. Uttar Pradesh reported the most crimes, while Maharashtra and Kerala had the fewest. Violent crime rates were highest in Assam, Tripura, Haryana, West Bengal, and Arunachal Pradesh. Kolkata was recorded as the safest city in India in 2021, though experts doubt this figure's accuracy (Kadar, 2019 [9]). Out of 19 cities with more than two million residents, Pune and Hyderabad had the lowest rates of crime. The only megacities with lower crime rates than their respective states have been Mumbai and Kolkata. For the fourth consecutive year, Delhi was India's most criminalized metropolitan area, accounting for more than 82% of 290,000 recorded crimes. Kochi continued to rank second in its jurisdiction for the most cases of reckless driving. For the second year, Jaipur had the third-highest crime rate. The Information Technology Act, 2000, passed to regulate cyber crimes and facilitate e-commerce, has been criticized for not effectively addressing emerging crimes like cyber harassment and defamation. In 2021, 52,974 cybercrime cases were registered in India, with Telangana reporting the highest number. These crimes were motivated by deception, followed by extortion and sexual exploitation. In terms of cybercrimes against health systems, India ranked second worldwide; personal data, hospital log-in credentials, and immunization records were among the data compromised in breaches. In India, preventing crime is essential to upholding law and order, but other factors that may affect crime rates include poverty, unemployment, and lower per capita income (Z. Li, 2021 [11]).

The paper is organized as follows: the introduction is in Part 1. Part 2 presents the proposed work: Section 2.1 provides an AdaBoost algorithm for classifying age and location from a face, Section 2.2 covers the proposed distributed back propagation method, and Section 2.3 compares the images of crime persons. Section 3 presents the experimental results, Section 4 the conclusion, and Section 5 the references.

2. Proposed Work

Crime person classification faces challenges due to the large number of crime person images, high-dimensional data, and a lack of labelled data, as each image contains numerous features and lacks explanatory labels (Shukla, 2020 [13]). Our picture categorization system should adapt to illumination variations, assigning the same label to two pictures of the same item with varying brightness levels, ensuring they are categorized accurately. The paper investigates the AdaBoost algorithm's effectiveness in gender crime classification, focusing on image classification using t-SNE, PCA, and ICA on a law-enforcement dataset, and works with data parallelism and model parallelism, the two approaches for distributing backpropagation computation among GPUs or nodes.

2.1 Understanding Face Person Classification with the AdaBoost Algorithm

AdaBoost is a machine learning algorithm for binary classification tasks, combining weak classifier predictions to create a strong classifier that performs well on the given data. AdaBoost focuses on misclassified data points, increasing their importance in training. This adaptive process improves classification performance by concentrating on challenging data points, leading to highly accurate results when multiple weak classifiers are combined (Azwad Tamir et al., 2021 [2]).

Fig. 21.1 Crime classification of AdaBoost algorithm

The AdaBoost method involves initialization, repeated iterations, classifier weight computation, weight updates, ensemble construction, weight normalization, and final classification by the final ensemble. It uses weak classifiers that focus on incorrectly labelled data points, with weights adjusted based on accuracy (Anshul Saini, 2023 [1]). The training dataset starts with identical weights on all samples, and each weak classifier focuses on the incorrectly labelled data points (E. Ahishakiye, 2017 [5]). Weights are adjusted based on categorization accuracy. Each weak classifier contributes
to the final ensemble according to how well it performs. On fresh, unforeseen data, predictions are made using the final ensemble.

2.2 Gender Crime Classification Using Distributed Back Propagation Process

Data parallelism and model parallelism are the two approaches for distributing backpropagation computation among GPUs or nodes. In data parallelism, the model weights are duplicated across many processes running on different pieces of hardware, with a parameter server serving as the source of truth. Each model replica gets its own mini-batch of gender crime data (Bowen, 2018 [4]), runs forward and backward passes, and calculates gradients. The gradients are then averaged and distributed once again to all worker nodes. Each iteration of decentralized back propagation employs a distinct mini-batch of data, with a master process broadcasting the model weights. This approach can yield a faster implementation of the algorithm (Soni Upadhyay, 2023 [14]).

Fig. 21.2 Gender data process using distributed backpropagation
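To make the data-parallel scheme concrete, the following is a minimal Python sketch of one synchronous update step, not the chapter's implementation; grad_fn is a hypothetical per-worker routine standing in for the forward and backward passes.

import numpy as np

def data_parallel_step(weights, shards, grad_fn, lr=0.01):
    # Each worker holds a copy of the weights and its own mini-batch shard.
    grads = [grad_fn(weights, x, y) for (x, y) in shards]
    # The parameter server averages the per-worker gradients...
    avg_grad = np.mean(grads, axis=0)
    # ...and the updated weights are broadcast back to all workers.
    return weights - lr * avg_grad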
2.3 Comparative Analysis of Crime Person Classification

Initially little known, t-SNE gained popularity a few years ago and is now utilized in visually appealing plots. t-SNE, similar to PCA or ICA, is a dimensionality reduction algorithm that reshapes gender crime classification data into a subspace that captures more intrinsic crime data meaning than the original space. The three main statistical analysis methods are PCA, ICA, and t-SNE (Sarvakar Siva, 2020 [15]). PCA maximizes variance in each direction, resulting in statistically orthogonal new variables. ICA minimizes pairwise covariate Mutual Information, resulting in independent components. t-SNE maximizes the similarity between the original and new crime person image data points. The initial assumption was a non-symmetric Gaussian distribution for gender image data point pair distances in both the original and reduced spaces. A Cauchy distribution is then used in t-SNE to model pairs in the reduced space while minimizing the KL divergence between the two spaces, preserving the original pairwise proximity. t-SNE is now accessible in scikit-learn and TensorFlow's embedding toolbox, enhancing data visualization tools despite being less glamorous than other Deep Learning discoveries.

3. Experimental Result

AdaBoost predicts a person's age and location using a law-enforcement dataset, with decision stumps as weak classifiers in the experiment.

Step-1 Initialization: The training dataset consists of 10 data points and their corresponding labels (1 for "criminal" and -1 for "not criminal").

Table 21.1 Training data set

Data point   Age   Label
1            21     1
2            12    -1
3            25     1
4            11    -1
5            24     1
6            13    -1
7            28     1
8            10    -1
9            32     1
10           10    -1
Fig. 21.3 Comparative analysis of crime persons

Fig. 21.4 Training dataset of criminal labels

All data points are initially given equal weights so that they sum to 1:
w1 = w2 = w3 = w4 = w5 = w6 = w7 = w8 = w9 = w10 = 0.1

Step-2 Iterative Training: Train multiple weak classifiers, each focusing on a single attribute, such as "age" and "location", for simplicity.

Weak Classifier 1 (Decision):
• If age ≤ 12, predict -1 (may not be criminal).
• If age > 12, predict 1 (may be criminal).
This classifier misclassifies data points 1 and 2.

Step-3 Weighted Training: The weights of the misclassified data points (1 and 2) are increased, while those of correctly classified points are decreased.
Updated weights:
• w1 = w2 = 0.2 (increased due to misclassification)
• w3 = … = w10 = 0.05 (decreased due to correct classification)

Step-4 Classifier Weight: Calculate the weight of the weak classifier based on its accuracy, using the initial weights:
• Error = w1 + w2 = 0.1 + 0.1 = 0.2
• Weak classifier weight = 0.5 * ln((1 - error)/error) = 0.5 * ln(4) ≈ 0.693

Step-5 Updating Weights: Normalize the updated weights so that they sum up to 1:
• w1 = w2 = 0.2 / 0.8 = 0.25
• w3 = … = w10 = 0.05 / 0.8 = 0.0625

Step-6 Ensemble Creation: Combine the predictions of all weak classifiers using their weights to form the ensemble's prediction:
• Ensemble prediction = sign(Σ(weight * classifier prediction))
• Ensemble prediction ≈ sign(0.693 * (-1) + 0.693 * 1) = sign(0) = -1

Step-7 Final Classification: The ensemble predicts that the person will "not be a criminal".

Step-8 Next Iteration and Final Result: AdaBoost uses adaptive boosting of misclassified data points to create a strong ensemble classifier. Iterations focus on misclassified data points, with weights adjusted for the next weak classifier. The final prediction is a weighted sum of all weak classifier predictions. AdaBoost is a robust ensemble-learning algorithm that enhances model accuracy, but its noise sensitivity and the need to select weak classifiers carefully are limitations, necessitating careful consideration.
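The single boosting round walked through above can be written compactly in code. Below is a minimal NumPy sketch, assuming the Table 21.1 data and an age-threshold stump; it illustrates the mechanics rather than reproducing the exact numbers above, since the resulting weights depend on which points the stump actually misclassifies.

import numpy as np

# Toy data from Table 21.1: ages and labels (1 = criminal, -1 = not criminal)
age = np.array([21, 12, 25, 11, 24, 13, 28, 10, 32, 10])
label = np.array([1, -1, 1, -1, 1, -1, 1, -1, 1, -1])

w = np.full(age.size, 1.0 / age.size)       # Step-1: equal weights summing to 1
pred = np.where(age <= 12, -1, 1)           # Step-2: weak age-threshold stump

error = w[pred != label].sum()              # weighted error of the stump
alpha = 0.5 * np.log((1 - error) / error)   # Step-4: classifier weight

w = w * np.exp(-alpha * label * pred)       # Step-3: boost misclassified points
w = w / w.sum()                             # Step-5: normalize weights to sum to 1

ensemble_pred = np.sign(alpha * pred)       # Steps 6-7: weighted vote, final label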

3.1 Relationship Analysis of PCA, ICA, and t-SNE

In this comparative analysis, t-SNE is a dimensionality reduction algorithm that reshapes gender crime classification data into a subspace that captures more intrinsic crime data meaning. It is similar to PCA and ICA but uses a Cauchy distribution to model pairs in the reduced space. It is now available in scikit-learn and TensorFlow's embedding toolbox, enhancing data visualization tools. PCA is an unsupervised learning method that uses the original crime data to select a feature combination that minimizes dimensions and maximizes variances. It ranks orthogonal axes by relative importance, focusing on variation rather than labels, which can lead to misclassification. In the following figure, class-0 represents age, class-1 represents label, and the axes are the X-axis as C1, the Y-axis as C2, and the Z-axis as C3.

Fig. 21.5 Principal component analysis (PCA) on crime persons image

ICA is a technique that reduces extraneous noise in the input criminal data to identify independent components. If the linear and nonlinear dependencies of two input features are zero, they are considered independent. ICA can be used to identify distinct sources in the input data, condensing the dataset to three components. In the following figure, class-0 represents age, class-1 represents label, and the axes are the X-axis as C1, the Y-axis as C2, and the Z-axis as C3.

Fig. 21.6 Independent component analysis (ICA) on crime persons

t-SNE is a non-linear dimensionality reduction method used in speech processing and natural language processing to visualize high-dimensional datasets. It reduces the difference between a distribution and its corresponding low-dimensional distribution using Kullback-Leibler divergence and gradient descent. In the following Fig. 21.7, class-0 represents age, class-1 represents label, and the axes are the X-axis as C1, the Y-axis as C2, and the Z-axis as C3.

When applying dimensionality reduction procedures like PCA, ICA, and t-SNE to crime data, taking into account the nature of the data and the particular objectives of the analysis is essential.

PCA assumes linearity and focuses on capturing global variance. It can help identify significant patterns in crime data and reduce the dimensionality while maintaining interpretability. For instance, it could reveal regions with high or low overall crime rates, and it can be valuable for understanding overall patterns of crime across different regions. It is computationally efficient and suitable for large datasets, making it practical for exploring extensive crime datasets.
Fig. 21.7 t-distributed stochastic neighbor embedding (t-SNE) on crime persons

ICA could be valuable when there are independent sources of crime that are not linearly related. Nonetheless, interpretability may be a challenge, as the components represent statistically independent sources, which may not directly correspond to easily interpretable crime patterns. ICA focuses on capturing independent sources of variation, potentially uncovering particular types of crime, but it can be computationally costly, particularly for a large number of independent components.

t-SNE is effective at visualizing complex, non-linear relationships. It can be advantageous for recognizing local patterns or clusters of similar crime events, though it may lack the global interpretability of PCA. It stresses the preservation of local relationships, which can be significant for distinguishing specific localized patterns or clusters of crime. It is computationally costly, especially for large datasets, which may restrict its feasibility for extensive crime data.

In summary, PCA is suitable for capturing global structure in crime data, ICA for distinguishing independent sources of crime, and t-SNE for visualizing localized patterns or clusters in a non-linear fashion. The choice should align with the objectives of the investigation and the characteristics of the crime data being analyzed.
4. Conclusion

Advancements in technology enable the detection of an individual's age, gender, and location through cameras, pictures, or videos using the Backpropagation technique. The paper highlights the importance of these technologies in daily life and their possible applications in criminal identification and face identification. Precision and competence tests must be addressed for accurate forecasts of the experimental result. t-SNE is a dimensionality reduction algorithm that reshapes gender crime classification data, capturing intrinsic crime data meaning, and is now available in scikit-learn and TensorFlow's embedding toolbox.

References
1. Anshul Saini: AdaBoost Algorithm: Understand, Implement and Master AdaBoost, Analytics Vidhya, September 21, 2023.
2. Azwad Tamir et al.: Crime Prediction and Forecasting using Machine Learning Algorithms, International Journal of Computer Science and Information Technologies, Vol. 12(2), 26–33, 2021.
3. Bandekar, S. R., & Vijayalakshmi, C.: Design and analysis of machine learning algorithms for the reduction of crime rates in India, Procedia Computer Science, 172, 122–127, https://doi.org/10.1016/j.procs.2020.05.018, 2020.
4. Bowen, D. A., Mercer Kollar, L. M., Wu, D. T., Fraser, D. A., Flood, C. E., Moore, J. C., Mays, E. W., & Sumner, S. A.: Ability of crime, demographic and business data to forecast areas of increased violence, International Journal of Injury Control and Safety Promotion, 25(4), 443–448, https://doi.org/10.1080/17457300.2018.1467461, 2018.
5. E. Ahishakiye, E. Opiyo, and I. Niyonzima: Crime Prediction Using Decision Tree (J48) Classification Algorithm, International Journal of Computer and Information Technology (ISSN: 2279-0764), 05/15, 2017.
6. Forradellas, R. F. R., Alonso, S. L. N., Rodriguez, M. L., & Jorge-Vazquez, J.: Applied machine learning in social sciences: Neural networks and crime prediction, Social Sciences, 10(1), 1–20, https://doi.org/10.3390/socsci10010004, 2021.
7. Gao, Y., Wang, X., Chen, Q., Guo, Y., Yang, Q., Yang, K., & Fang, T.: Suspects prediction towards terrorist attacks based on machine learning, in Proceedings – 2019 5th International Conference on Big Data and Information Analytics, BigDIA 2019, pp. 126–131, https://doi.org/10.1109/BigDIA.2019.8802726, 2019.
8. Jha, G., Ahuja, L., & Rana, A.: Criminal behavior analysis and segmentation using K-means clustering, ICRITO 2020 – IEEE 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions), 1356–1360, https://doi.org/10.1109/ICRITO48877.2020.9197791, 2019.
9. Kadar, C., Maculan, R., & Feuerriegel, S.: Public decision support for low population density areas: An imbalance-aware hyper-ensemble for spatio-temporal crime prediction, Decision Support Systems, 119, 107–117, https://doi.org/10.1016/j.dss.2019.03.001, 2019.
10. Kim, S., Joshi, P., Kalsi, P. S., & Taheri, P.: Crime analysis through machine learning, in 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference, IEMCON 2018, pp. 415–420, https://doi.org/10.1109/IEMCON.2018.8614828, 2019.
11. Li, Z., Zhang, T., Jing, X., & Wang, Y.: Facial expression-based analysis on emotion correlations, hotspots, and potential occurrence of urban crimes, Alexandria Engineering Journal, 60(1), 1411–1420, https://doi.org/10.1016/j.aej.2020.10.061, 2021.
12. Saravanan, P., Selvaprabu, J., Arun Raj, L., Abdul Azeez Khan, A., & Javubar Sathick, K.: Survey on crime analysis and prediction using data mining and machine learning techniques, Lecture Notes in Electrical Engineering, 688, 435–448, https://doi.org/10.1007/978-981-15-7241-8_3, 2021.
13. Shukla, S., Jain, P. K., Babu, C. R., & Pamula, R.: A multivariate regression model for identifying, analyzing and predicting crimes, Wireless Personal Communications, 113(4), 2447–2461, https://doi.org/10.1007/s11277-020-07335-w, 2020.
14. Soni Upadhyay: What is Backpropagation Algorithm? Types and Examples, Simplilearn, Aug 30, 2023.
15. Sarvakar Siva: Dimensionality Reduction for Data Visualization: PCA vs TSNE vs UMAP vs LDA, Towards Data Science, 2020.

Note: All the figures and tables in this chapter were designed by the authors.

22. Identifying Tremor Disease in Neurological Disorders Using Finger Gesture Images

P. Sumithabhashini*
Professor, Department of Electronics & Communication Engineering, Holy Mary Institute of Technology & Science, Hyderabad, India
M. V. Vijaya Saradhi
Professor, HOD CSE & CSE (IoT), Department of Computer Science & Engineering, ACE Engineering College, Hyderabad, India
Ramesh Alladi
Associate Professor, Department of Computer Science & Engineering, ACE Engineering College, Hyderabad, India
Swajan Reddy
Master's in Statistics and Data Science, University of Houston, Houston, Texas, USA

*Corresponding author: pokurisb81@gmail.com
DOI: 10.1201/9781003529231-22

Abstract: Neurological problems result from genetic disorders, congenital abnormalities, infections, lifestyle and environmental issues, malnutrition, and brain injury. Gesture recognition for image processing uses artificial intelligence (AI) and machine learning for classification, heat/motion detection, and movement analysis, and is based on advanced technologies and algorithms. The study covers the application of finger-image tremor quantification to the diagnosis of neurological disorders. This research presents a hybrid strategy to quantify and objectively assess tremor, specifically for cerebellar diseases and finger tremor, combining imaging technologies and machine learning approaches. The technique uses a gesture finger-image tremor disease detection process, combining AI with image processing, to differentiate between people who are healthy and those who have essential tremor.
Keywords: Artificial intelligence, CNN, Finger gesture images, Tremor quantification

1. Introduction

Image processing is a method for improving an image's characteristics by converting it to a two-dimensional numerical array. It can compress, sharpen, and detect edges using filters and operators. The creation of new output arrays with the intended results is a common practice in areas such as computer vision, artificial intelligence, and machine learning [2]. The term "gesture recognition" refers to a type of input/command technology that analyses facial expressions, hand movements, and finger movements to comprehend and react to user input [15]. Using sensors such as cameras or depth sensors, the process entails recording and understanding human movements in order to communicate with digital systems or devices; the collected data is then processed and analysed. Recent developments in image processing have opened up a world of new possibilities for innovation, such as systems for detecting objects, social distance monitoring, and fully immersive AR experiences. A technologically evolved world with enhanced visuals is the vision of this exciting possibility.
Tremors can develop from shaking, which is a typical indication of stress, rage, or sickness. People over the age of 40 more commonly suffer from essential tremor, a tremor disorder that can impact several regions of the body, including the hands and arms [1]. Toxins in the environment and older age are both risk factors. Deep brain stimulation or surgery may be necessary to treat essential tremors. Tremor is prevalent in MS patients and serves as an early warning signal of the disease [3–6]. Tremors at rest occur, for example, in Parkinson's disease and in medication-induced tremors.

Muscle relaxation and resistance to gravity are the root causes of several neurological disorders, including progressive supranuclear palsy, dystonia, rubral tremor, and Wilson's disease. Tremors lasting a few days, or longer after drinking excessively or for an extended period of time, are the first symptoms of alcohol withdrawal. Medications that inhibit dopamine, a lack of vitamin B12, caffeine, or anxiety can all lead to trembling hands. B12 insufficiency impacts the neurological system, whereas such medications assist with mood maintenance [7–10]. Drinks like coffee and tea might make your hands tremble. Several forms of stress, such as worries about money, work, relationships, or health, can exacerbate tremors. Additional causes of physiologic tremors include excessive rage, hunger, or lack of sleep. A low blood sugar level, or hypoglycemia, causes trembling by sending the body into a stress response. The symptoms of an overactive thyroid gland in the neck include an irregular heartbeat, tremors, and difficulty sleeping. Hand and foot tremors can also be brought on by a nerve injury. Talk to your doctor about your symptoms and medical history; treatments can differ [11–14]. Despite the fact that most videos contain small-amplitude tremors, Xini Wang et al. suggested in their 2021 article "Hand tremor detection in videos with a cluttered background using neural network-based approaches" that it is possible to detect hand tremors in videos with a cluttered background with a high degree of accuracy [16]. Possible real-world uses for this video-based tremor detection technology include healthcare facilities and home monitoring systems. Using this technology, researchers and physicians can use their own smartphones to identify tremor automatically, even in busy backgrounds. The design, which is based on neural networks, learns from various training datasets; thus, it may also be able to identify tremors when hands are moving. One possible drawback is that it may require a lot of memory, computing power, and hidden layers, and must cope with low-resolution videos. Evaluating the system in larger cohorts and assessing factors like skin colour, ambient lighting, and camera resolution requires additional research.

Here is the structure of the paper: Part 1 contains the introduction. In the proposed work, Section 2.1 offers gesture-image denoising using a diffusion model, and Section 2.2 details the gestural finger image processing for tremor disease detection. Section 3 details the experimental results. Finally, Section 4 presents the conclusion, followed by the references.

2. Proposed Work

The study explores the use of AI and machine learning in gesture recognition for image processing, specifically for diagnosing neurological disorders like cerebellar diseases. It uses advanced technologies to quantify and assess shaking, distinguishing between healthy individuals and those with essential tremor, and examines the use of finger gesture images for tremor quantification in diagnosing neurological disorders. The goal is to analyze and understand human movements while utilizing machine learning and Artificial Intelligence (AI) for classification and heat/motion disease detection.

Gesture Image Using a Diffusion Model: Diffusion is the movement of energy from higher to lower concentrations, driven by Gibbs free energy or chemical potential. It can be "uphill" and is a stochastic process, used in fields like statistics, probability theory, information theory, neural networks, finance, and marketing.

The process involves adding noise to a gesture image and learning to remove it, then training a machine learning model to produce a denoised gesture image [17].

The process of learning the mean matrix involves assuming a normal noise distribution and parametrizing the distribution mean and standard deviation matrix. This can be divided into a forward and a reverse process. Mathematicians often use physical processes to formalize mathematical concepts, such as Fick diffusion, heat diffusion, and Brownian motion, by defining the diffusion equation, which relates the first time derivative to the second space derivative.

The diffusion equation, in its stochastic formulation, is based on the Langevin equation, which is centered on the Wiener process, also known as Brownian motion. This process, also known as a Random Walk, is normally distributed, making diffusion models intertwined with white noise generation.

Fig. 22.1 Gesture image using diffusion models
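To make the forward (noising) half of this process concrete, here is a minimal NumPy sketch of a standard forward diffusion step; the linear variance schedule and image size are illustrative assumptions, not the chapter's settings.

import numpy as np

def forward_diffuse(x0, t, betas, rng):
    # x_t = sqrt(a_bar)*x0 + sqrt(1 - a_bar)*eps, where a_bar is the cumulative
    # product of (1 - beta) up to step t and eps is Gaussian (white) noise.
    alpha_bar = np.prod(1.0 - betas[: t + 1])
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

betas = np.linspace(1e-4, 0.02, 1000)       # linear noise schedule (assumed)
x0 = np.zeros((64, 64))                     # stand-in for a gesture image
x_t = forward_diffuse(x0, t=500, betas=betas, rng=np.random.default_rng(0))

A denoising model is then trained to invert this step, i.e., to predict the added noise eps from x_t and t.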
The diffusion models, which utilize a Gaussian prior to generate data, form the core of text-to-image generative models.

Diffusion models offer a wide range of degrees of freedom in implementation, allowing the choice of variances for the forward process, and of the model architecture and Gaussian distribution parameterization for the reverse process. An explicit connection between diffusion models and denoising score matching leads to a simplified, weighted variational bound objective for diffusion models. The model design is justified by simplicity: the forward process variances are fixed to constants, while the reverse process entropy is optimized. The study demonstrates that the reverse process mean function approximator can be trained with a noise-prediction parameterization, simplifying the diffusion model's variational bound and resembling denoising score matching.

Gesture Finger Images Process of Tremor Disease Detection: Machine search engines that work with gesture images rank relevant language and the most comparable photos and then display them to the user. Models learn to extract image characteristics and text features from inputs by learning vectorial representations of photos and text, respectively. In order to achieve semantic alignment [15], training vector representations in tandem with the picture and text models is necessary. By indexing them into a library of vector images of gesture fingers, we hope to quickly find relevant ones for the purpose of tremor disease identification [18] and store them for future vectorizing ideas. The tremor quantification method is supported by this image-to-text conversion. The goal at hand is to compile a set of finger gesture graphics based on the user's input query. While one provider creates an embedded encoding, another uses a superior model to re-rank the closest neighbours, making the ranking more personalised for the user.

Fig. 22.2 The machine finger gesture images search engine being developed to detect tremor diseases
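A minimal sketch of the retrieval step just described: embed a query, compare it against an indexed library of gesture-finger vectors by cosine similarity, and return the nearest neighbours for re-ranking. The embeddings here are random placeholders; in practice they would come from the trained image/text encoders.

import numpy as np

def cosine_top_k(query_vec, library, k=5):
    # Rank indexed gesture-finger embeddings by cosine similarity to the query.
    lib = library / np.linalg.norm(library, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    scores = lib @ q
    return np.argsort(scores)[::-1][:k]     # indices of the k best matches

library = np.random.default_rng(0).normal(size=(1000, 128))  # indexed vectors
query = np.random.default_rng(1).normal(size=128)            # embedded query
top = cosine_top_k(query, library, k=5)     # candidates for the re-ranking model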
3. Experimental Result

This study investigates the feasibility of quantifying tremor in neurological illness imaging using finger scans. Withdrawal from alcohol or narcotics, a lack of vitamin B12, caffeine, tension, rage, hunger, insomnia, low blood sugar, or an overactive thyroid gland are all potential causes of tremors. This research looks at the potential of using pictures of finger gestures to diagnose neurological illnesses by identifying tremors.

Fig. 22.3 Person with different kinds of gesture images data set

Accuracy: The "Rand index" or "Rand accuracy" is a statistical measure that quantifies the precision with which a binary classification test can identify or rule out a condition. It is a parameter for the test that compares the probability estimations before and after the test.

Accuracy = (TP + TN) / (TP + TN + FP + FN)   (1)

where FN = false negative, TN = true negative, FP = false positive, and TP = true positive.

On the basis of 91 accurate predictions out of 100 cases, the model's accuracy in identifying 100 tumours as benign or malignant is 91%. But the model detects just one malignant tumour out of nine, so eight of the nine cancers go undetected. This means the model is no better than one that consistently predicts benign outcomes. In a class-imbalanced dataset where the positive and negative labels differ significantly in frequency, accuracy alone is not enough.

Accuracy = (1 + 90) / (1 + 90 + 1 + 8) = 0.91   (2)

Precision: Precision is the percentage of predicted positive instances that are correctly classified, calculated using the formula

Precision = TP / (TP + FP)   (3)

where FP = false positive and TP = true positive.

Out of 160 samples from Dataset 1, 105 of the predictions made by the gesture image model are accurate, while the remaining 55 are not. Determine this model's precision value. True positives (TP) = 105 and false positives (FP) = 55 are the model's results. Using the formula, Precision = TP/(TP + FP) = 105/(105 + 55) = 105/160 ≈ 0.66. As a result, the model's precision is 0.66.

4. Conclusion and Future Perspective

Utilising state-of-the-art technologies for shaking and tremor quantification, the use of artificial intelligence and machine learning in the field of gesture recognition for image processing aims to aid in the diagnosis of neurological conditions, such as cerebellar ailments. This research looked at the use of imaging tremor quantification in the diagnosis of neurological diseases by analysing variables such as the effects of alcohol withdrawal, medications, stress, and thyroid gland activity. Although the model has a 91% success rate in classifying tumours as benign or malignant, it misses 8 out of 9 malignancies, since it detects only 1 malignant tumour out of 9. There are 160 samples in Dataset 1 that the gesture image model has predicted; 55 of these have been deemed wrong. With 105 true positives (TP) and 55 false positives (FP), the precision is computed as 0.65625.

References
1. Benito-León J., Serrano J.I., Louis E.D., Holobar A., Romero J.P., Povalej-Bržan P., Kranjec J., Bermejo-Pareja F., Del Castillo M.D., Posada I.J., et al. Essential tremor severity and anatomical changes in brain areas controlling movement sequencing. Ann. Clin. Transl. Neurol. 2019; 6: 83–97.
2. Bilge S., Jenq-Neng H., Su-In L., Linda S. Tremor Detection Using Motion Filtering and SVM. Proceedings of the 21st International Conference on Pattern Recognition (ICPR 2012); Tsukuba, Japan, 11–15 November 2012; pp. 178–181.
3. Buijink A.W., Contarino M.F., Koelman J.H., Speelman J.D., Van Rootselaar A.F. How to tackle tremor - systematic review of the literature and diagnostic work-up. Front. Neurol. 2012; 3: 146. doi: 10.3389/fneur.2012.00146.
4. Crawford P., Zimmerman E. Differentiation and diagnosis of tremor. Am. Fam. Phys. 2011; 83: 697–702.
5. Dogu O., Sevim S., Camdeviren H., Un S., Louis E.D. Prevalence of essential tremor: Door-to-door neurologic exams in Mersin Province, Turkey. Neurology. 2003; 61: 1804–1806. doi: 10.1212/01.WNL.0000099075.19951.8C.
6. Elble R., Comella C., Fahn S., Hallett M., Jankovic J., Juncos J.L., LeWitt P., Lyons K., Ondo W., Pahwa R., et al. Reliability of a new scale for essential tremor. Mov. Disord. 2012; 27: 1567–1569. doi: 10.1002/mds.25162.
7. Geraghty J.J., Jankovic J., Zetusky W.J. Association between essential tremor and Parkinson's disease. Ann. Neurol. 1985; 17: 329–333. doi: 10.1002/ana.410170404.
8. Handforth A., Parker G.A. Conditions associated with essential tremor in veterans: A potential role for chronic stress. Tremor Other Hyperkinetic Mov. 2018; 8: 517. doi: 10.5334/tohm.400.
9. Ishii N., Mochizuki Y., Shiomi K., Nakazato M., Mochizuki H. Spiral drawing: Quantitative analysis and artificial-intelligence-based diagnosis using a smartphone. J. Neurol. Sci. 2020; 411: 116723. doi: 10.1016/j.jns.2020.116723.
10. Kamble N., Pal P.K. Tremor syndromes: A review. Neurol. India. 2018; 66: 36–47. doi: 10.4103/0028-3886.226440.
11. Louis E.D., Faust P.L. Essential tremor: The most common form of cerebellar degeneration? Cerebellum Ataxias. 2020; 7: 1–10. doi: 10.1186/s40673-020-00121-1.
12. Mansur P.H.G., Cury L.K.P., Andrade A.O., Pereira A.A., Miotto G.A.A., Soares A.B., Naves E.L. A review on techniques for tremor recording and quantification. Crit. Rev. Biomed. Eng. 2007; 35: 343–362. doi: 10.1615/CritRevBiomedEng.v35.i5.10.
13. Mitsui Y., Ishii N., Mochizuki H., Zin T.T. A Study on Disease Diagnosis by Tremor Analysis. Int. Multi Conf. Eng. Comput. Sci. 2018; 1: 14–16.
14. Sharma S., Pandey S. Approach to a tremor patient. Ann. Indian Acad. Neurol. 2016; 19: 433–443. doi: 10.4103/0972-2327.194409.
15. Saurabh Adhikari et al.: A Novel Machine Learning-Based Hand Gesture Recognition Using HCI on IoT Assisted Cloud Platform, Computer Systems Science & Engineering, DOI: 10.32604/csse.2023.034431, Tech Science Press, CSSE, vol. 46, no. 2, 2023.
16. Reddy Navya, Ramisetty Upendra: Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms, Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3, 2023.
17. Xini Wang et al.: Hand tremor detection in videos with cluttered background using neural network based approaches, Health Information Science and Systems, Springer Nature, 2021.
18. Yutong Xie, Minne Yuan, Bin Dong, Quanzheng Li: Diffusion Model for Generative Image Denoising, arXiv: 2302.02398, https://doi.org/10.48550/arXiv.2302.02398, 2023.
19. Zdenka U., Otakar S., Martina H., Arnost K., Olga U., Václav H., Chris D.N., Evzen R. Validation of a new tool for automatic assessment of tremor frequency from video recordings. J. Neurosci. Methods. 2011; 198: 110–113. doi: 10.1016/j.jneumeth.2011.02.033.

Note: All the figures in this chapter were designed by the authors.
23. An Effective Machine Learning Technique that uses Emotive Faces in order to Study Crimes

C. Syamsundar Reddy
Research Scholar, Department of Computer Science, College of Commerce, Management & Computer Science, Sri Venkateswara University, Tirupathi, Andhra Pradesh (cssreddi@gmail.com)
G. Anjan Babu
Professor, Department of Computer Science, SVU College of Commerce, Management & Computer Science, Sri Venkateswara University, Tirupathi, Andhra Pradesh (gabsvu@gmail.com)

DOI: 10.1201/9781003529231-23

Abstract: Smart cameras in cities monitor potential suspects, enabling decision analysis and machine learning on visual data. Federated Learning (FL) is a new approach that removes data borders and protects privacy. Suspects can be predicted using emotional image data or suspicious-face labeling. Emotional categories, such as facial emotions, reveal negative, neutral, and pleasant feelings using Neutrosophic logic. Facial recognition measures emotional faces, and Neutrosophic logic regression analyses crime. Emotional knowledge can aid in more precise crime prediction. The experimental results show that the two attributes are not independent, i.e., they are dependent, and the metrics for CNN at the optimal number of iterations (5) have been thoroughly examined.
Keywords: Crime face emotions, Convolution neural networks, Federated learning, Neutrosophic logic, Logistic regression

1. Introduction

Federated learning reduces data storage in a centralized data centre, allowing parallel training across devices. It is used in the transportation, Industry 5.0, and Industry 4.0 sectors. Edge computing brings computing resources closer to data sources, focusing on learning. However, it restricts data analysts from viewing unprocessed user data. Face recognition, a biometric technology, records facial topographies as face prints and uses machine learning to match live-captured images to stored face prints. Face recognition is less invasive than other biometric traits and can be used in security-related applications like forensic surveillance systems and airport security.

Facial emotion detection, a part of face recognition (Adjabi, 2020 [1]), analyzes a person's facial expressions in both still photos and moving videos to ascertain their emotional state. Facial expressions play a significant role in everyday interpersonal interactions. Understanding human emotions requires an understanding of facial expressions [2]. It has long been a goal of research to create modern machine-vision systems that can compete with humans. Recognition of facial expressions is becoming increasingly important in applications for crime detection. Machine learning techniques have been proven efficient for computer vision applications like object detection and categorization. These techniques perform better than earlier machine learning approaches and facilitate feature selection. Today's machine learning algorithms have developed to the point where they can categorize photographs faster and more accurately than humans can. Learning techniques can also be used to classify facial expressions. There are numerous methods for identifying facial expressions, according to Boddepalli (2021) [3]. Prior to separating action units from the person image using Neutrosophic logic, it is important to use methods like facial feature point tracking or variations in grayscale. The facial expression recognition classifier then
sends the data gathered in this way. To make the forecast, these methods need a variety of building blocks and a lot of computation [4].

The study proposes a novel method for extracting facial emotions from revolving images or predefined visual-saliency photographs using Neutrosophic logic and metric learning algorithms in machine learning.

2. Machine Learning with Image Processing

Artificial intelligence can produce convincing images but often struggles with mechanical and anatomical accuracy [5]. Machine vision evaluates digital photographs and videos, requiring automation for tasks like identifying expressive faces [14]. To train algorithms, a dataset is used. Emotional face picture observations are collected, and trends are evaluated to determine whose emotional face is visible in a photograph [6].

The emotive face image may be reduced to a smaller size, causing unreliable assessments of height and width. However, proportions remain unchanged even after scaling. Facial emotions share common traits, but machine-learning algorithms can only understand numbers. A feature vector represents a numerical representation of a "face emotional," arranged in a specific order.

Fig. 23.1 Emotional faces

3. Convolution Neural Networks (CNNs) in Emotive Image Classification

CNNs have been employed for emotive image classification (H. M. Shahzad, 2023 [8]). CNNs, a form of deep learning, have proven crucial to the development of computer vision since they are specifically made to process pixel input (Nur Alia Syahirah, 2021 [10]). The process of classifying images with CNN involves multiple steps:
1. Face Expressions Image Input: A matrix of pixel values is used to represent the image, ranging from 0 (black) to 255 (white), with three values (red, green, and blue) assigned to each pixel.
2. Convolution Layers: These layers apply filters or kernels to the input and compute the kernel-input dot product to produce feature maps. Important details like borders, lines, and textures are identified through this procedure.
3. ReLU (Rectified Linear Unit): The non-linear function max(0, x) is applied to all inputs in this layer.
ReLU contributes to the model’s increased nonlinearity • 0 ≤ t + i + f < 3 when all three components are
because images naturally exhibit some nonlinearity. independent.
4. Pooling Layers: By pooling, the dimensionality of • The scenario is 0 ≤ t + i + f < 2 when two components
each feature map is reduced while the most important are interdependent and the third is independent of the
information is retained. An approach known as “max other two.
pooling” extracts the maximum value from the area of • 0 ≤ t + i + f ≤ 1 when all three factors are interrelated.
the image that the filter has filtered.
There is a possibility of incomplete information (sum < 1),
5. Fully Connected Layers: After a number of para-consistent and conflicting information (sum > 1), or
convolution and pooling layers, fully linked layers in complete information (sum = 1) when three or two of the
the neural network are used for high-level reasoning. components T, I, and F are independent. In a similar vein, in
As in conventional neural networks, neurons in a fully the event where T, F, and I are all reliant on one another, there
linked layer are connected to all activations in the is a chance for either whole knowledge (sum = 1) or partial
preceding layer. knowledge (sum < 1).
6. Output Layer: For example, if there are 10 classes
for 10 different sorts of objects, the result will be a
10-element vector in the final layer’s application of the
5. Federated Learning Supports
softmax function to output a probability for each class Cloud-Recognition Based Facial
of the problem. Technology
CNN’s final forecast for Tanoy, 2022 [15], indicates the class
Federated learning improves computers by using Neutrosophic
with the highest probability. The multilayered structure of
data to apply an iterative recognition model by Zhang in 2020
CNNs allows them to learn hierarchical features, which helps
[16], offering an answer to online privacy issues. It works well
them perform exceptionally well in image classification tasks.
with emotive faces due to on-device data and privacy. The
development of cloud-based facial recognition systems has
increased its potential [17], offering fast flexibility, resource
sharing, on-demand self-service and comprehensive network
connectivity. This paradigm is often used for security testing,
where a user takes a picture of a query face.

Fig. 23.3 Increasing distance of feature vectors of emotional


faces
Fig. 23.2 Image classification using CNN
Federated learning allows devices to build a shared prediction
4. Neutrosophic Logic and model using Neutrosophic logic data. The user interface
interacts with a cloud-based web API, which houses a facial
Appearance on the Criminal Face recognition engine and emotional face library. The API
Neutrosophic logic combines fuzzy logic, Intuitionistic processes images to improve on-device data and privacy.
fuzzy logic, para-consistent logic, and Intuitionistic logic. The facial recognition engine compares emotional images to
It describes logical assertions in a 3D Neutrosophic space, the user interface, and if a strong match is found, the face is
representing facial expressions of emotions. T, I, and F are labeled as belonging to a specific person. Cloud-based facial
standard or non-standard real subsets of [0, 1+], with no recognition systems offer real-time processing, on-demand
relation between them, according to R.N.V. Jagan Mohan, self-service, accessible communication, and outstanding
2021 [12]. For criminal face expressions, the standard unit scalability. They provide real-time processing, reliable
interval is used.
communication, and can accommodate a large user base, making them more accessible and adaptable.
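A minimal sketch of the shared-model update that federated learning performs, assuming simple NumPy weight vectors and equally weighted clients (plain federated averaging); raw emotive-face images never leave the devices.

import numpy as np

def federated_average(client_weights):
    # Average locally trained model weights into one shared global model.
    return np.mean(client_weights, axis=0)

# Three devices train locally and upload only their weight vectors.
local_models = [np.array([0.9, 1.1]), np.array([1.0, 0.8]), np.array([1.1, 1.0])]
global_model = federated_average(local_models)   # -> approximately [1.0, 0.967]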
6. Neutrosophic Logic Regression Analysis

Regression analysis is a method used to examine the relationships between a dependent variable and independent variables (R. N. V. Jagan Mohan, 2016 [11]). It is used here to identify criminals based on facial expressions, where emotional images indicate the same behavior for the same person (R. N. V. Jagan Mohan [13]).

Yi = β0 + β1Xi1 + ⋯ + βjXij + εi,  for i = 1, …, n (number of observations), j = 1, …, N (number of independent variables)   (1)

The discrepancy between a dependent variable's observed value and its estimated value is often referred to as an error, and it has a normal distribution with zero mean in the model above. An error, a form of uncertainty, can arise from various factors, including insufficient or incorrectly chosen independent variables, poor fit, and more. The parameters indicate the influence of the independent factors on the dependent variable, with the least squares (LS) approach being the most commonly used estimation method (M. Chandrasekaran, 2021 [9]). In Neutrosophic Linear Regression (NLR), the discrepancy between observed and estimated values is influenced by the logical structure of the Neutrosophic system:

Yi* = AiXij,  for i = 1, …, n (number of observations), j = 1, …, N (number of independent variables)   (3)

Regression analysis has been revived in recent years by the construction of numerous types of Neutrosophic regression models (R. N. V. Jagan Mohan, 2021 [12]).
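For the classical least-squares baseline the section builds on, here is a minimal NumPy sketch with synthetic data (illustrative only, not the chapter's crime data):

import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])  # intercept + 2 predictors
beta_true = np.array([0.5, 2.0, -1.0])
y = X @ beta_true + rng.normal(scale=0.1, size=50)   # zero-mean normal errors

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)     # least-squares estimates
residuals = y - X @ beta_hat                         # estimated errors (epsilon)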

7. Experimental Result

The experimental results use the following police records showing the type of crime in four regions of West Godavari district. The police have to identify criminals using facial expressions, where emotional images indicate the same behavior for the same person. The problem involves classifying crime samples based on two random attributes, with the null hypothesis being that the attributes are independent. Then

pij = (probability of a value belonging to the ith row) × (probability of a value belonging to the jth column)

The alternative hypothesis is that the two attributes are not independent, i.e., dependent.

Table 23.1 Police records showing the type of crime in four regions of West Godavari district

Region   Physical Assault   Murder   Rape   Homicide   Total
East     162                118      451    18         749
West     310                196      996    25         1527
North    258                193      458    10         919
South    280                175      390    19         864
Total    1010               682      2295   72         4059

The study aims to determine whether crime incidence is influenced by the region, at the 0.01 level of significance (L.O.S.). The expected frequencies are: e11 = 186.73, e12 = 125.85, e13 = 423.49, e14 = 13.29, e21 = 379.96, e22 = 256.57, e23 = 863.38, e24 = 27.09, e31 = 228.68, e32 = 154.41, e33 = 519.6, e34 = 16.30, e41 = 215, e42 = 145.17, e43 = 488.51, e44 = 15.33.

Reject the null hypothesis, i.e., the incidence of crime depends on the region, since χ2 = 124.5 > 21.66 = χ2(0.01) with (4-1)(4-1) = 9 degrees of freedom.
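The independence test above can be reproduced with SciPy; a minimal sketch using the Table 23.1 counts (the printed statistic may differ slightly from the hand-computed 124.5 because of rounding in the expected frequencies):

import numpy as np
from scipy.stats import chi2_contingency

# Observed counts from Table 23.1 (rows: East, West, North, South;
# columns: Physical Assault, Murder, Rape, Homicide)
observed = np.array([
    [162, 118, 451, 18],
    [310, 196, 996, 25],
    [258, 193, 458, 10],
    [280, 175, 390, 19],
])

chi2, p, dof, expected = chi2_contingency(observed)
print(chi2, dof, p)   # chi2 around 124 with dof = 9; p << 0.01, so reject independence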
(5) have been thoroughly examined
An Effective Machine Learning Technique that uses Emotive Faces in order to Study Crimes 149

Emotion Recognition Using CNN-Based Features, Appl. Sci.


8. Conclusion

The suggested method uses machine learning's metric learning algorithm and Neutrosophic logic to extract facial emotions from rotating images or from photos of faces with predetermined visual saliency. Edge-federated learning is a machine learning architecture that distributes data among numerous edge devices while preserving privacy when resources are restricted. The approach recognizes faces and identifies criminals using emotional faces based on Neutrosophic logic.

References
1. Adjabi, I., Ouahabi, A., Benzaoui, A., and Taleb-Ahmed, A.: Past, present, and future of face recognition: a review, Electronics, 9:1188, DOI: 10.3390/electronics9081188, 2020.
2. Ajili, I., Mallem, M., and Didier, J. Y.: Human motions and emotions recognition inspired by LMA qualities, Vis. Comput. 35, 1411–1426, doi: 10.1007/s00371-018-01619-w, 2019.
3. Boddepalli Kiran Kumar: Facial Emotion Recognition and Detection Using CNN, Turkish Journal of Computer and Mathematics Education, Vol. 12, No. 14, 5960–5968, 2021.
4. Chu, William Wei-Jen Tsai, Hui-Chuan, Yuh-Min Chen and Min-Ju Liao: Facial expression recognition with transition detection for students with high-functioning autism in adaptive e-learning, Soft Computing: 1–27, 2017.
5. Dominguez-Jimenez, J. A., Campo-Landines, K. C., Martinez-Santos, J. C., Delahoz, E. J., and Contreras-Ortiz, S. H.: A machine learning model for emotion recognition from physiological signals, Biomed. Signal Process. Control 55: 101646, doi: 10.1016/j.bspc.2019.101646, 2020.
6. Dubuisson, S., Davoine, F., Masson, M.: A solution for facial expression representation and recognition, Signal Processing: Image Communication, 17, 657–673, DOI: 10.1016/S0923-5965(02)00076-0, 2002.
7. Gampala, V., Kumar, M. S., Sushama, C. and Raj: Deep learning based image-processing approaches for image deblurring, Materials Today: Proceedings, 2020.
8. H. M. Shahzad, Sohail Masood Bhatti, Arfan Jaffar, Sheeraz Akram, Mousa Alhajlah, Awais Mahmood: Hybrid Facial Emotion Recognition Using CNN-Based Features, Appl. Sci. 2023, 13(9), 5572; https://doi.org/10.3390/app13095572, 2023.
9. Maran Chandrasekaran: Logistic Regression for Machine Learning, Capital One, November 8, 2021.
10. Nur Alia Syahirah Badrulhisham and Nur Nabilah Abu Mangshor: Emotion Recognition Using Convolutional Neural Network (CNN), Journal of Physics: Conference Series, Volume 1962, DOI: 10.1088/1742-6596/1962/1/012040, 2021.
11. R. N. V. Jagan Mohan: Enhancement of Big Image Processing Using Naïve based Logistic Regression, MAYFEB Journal of Electrical and Computer Engineering, Canada, Vol. 2, pp. 1–7, 2016.
12. R. N. V. Jagan Mohan: Crime Data Optimization Using Neutrosophic Logic, Concurrency and Computation: Practice and Experience, Wiley Online Library, https://doi.org/10.1002/cpe.6973, 2022.
13. R. N. V. Jagan Mohan: Empirical Analysis on Uncertain Crime Data Using Hybrid Approaches, Computer Integrated Manufacturing Systems, Vol. 28, No. 12, 2022.
14. Sarker, I. H.: Deep learning: A comprehensive overview on techniques, taxonomy, applications and research directions, SN Computer Science, 2, 420, 2021.
15. Tanoy Debnath, Md. Mahfouz Reza, Anichur Rahman, Amin Beheshti, Shahab S. Band & Hamid Alinejad-Rokny: Four-layer ConvNet to facial emotion recognition with minimal epochs and the significance of data diversity, Scientific Reports, 2022.
16. Zhang, J., Yin, Z., Chen, P., and Nichele, S.: Emotion recognition using multi-modal data and machine learning techniques: a tutorial and review, Inf. Fusion 59, 103–126, DOI: 10.1016/j.inffus.2020.01.011, 2020.
17. Zhenjie Song: Facial Expression Emotion Recognition Model Integrating Philosophy and Machine Learning Theory, Frontiers in Psychology, Volume 12, 2021.

Note: All the figures and tables in this chapter were designed by the authors.
24. Increasing the Reliability of Intercropping in Agriculture Using Machine Learning

M. Srikanth
Research Scholar, Department of Computer Science and Engineering, GIET University, Odisha (srikanth.mandela@giet.edu)
R. N. V. Jagan Mohan
Associate Professor, Department of Computer Science and Engineering, GIET University, Odisha (mohanrnvj@gmail.com)
M. Chandra Naik
Professor, Department of Computer Science and Engineering, GIET University, Odisha (srichandra2007@gmail.com)

DOI: 10.1201/9781003529231-24

Abstract: Machine learning has the potential to revolutionize agriculture by helping farmers optimize crop yields, reduce costs, and improve sustainability. One way to use machine learning in agriculture is to optimize multi-cropping, which involves growing multiple crops simultaneously on the same piece of land. This paper proposes a new approach to optimizing multi-cropping using reinforcement learning. Reinforcement learning is a type of machine learning that allows agents to learn to behave in an environment by trial and error. In the context of multi-cropping, the agent is a machine learning model that is trying to learn to select the best crops to grow together and how to manage them in order to maximize yield. The proposed approach uses reinforcement learning to optimize the hyperparameters of the machine learning model. Hyperparameters are the settings of the machine learning model, such as the number of trees in a random forest model or the learning rate of a neural network. By optimizing the hyperparameters, the machine learning model can be trained to better predict crop yields and make better decisions about crop management. The proposed approach was evaluated on a real-world dataset of crop yields from India. The results showed that the proposed approach was able to significantly improve crop yields compared to traditional methods of multi-cropping.
Keywords: Machine learning, Agriculture, Multi-cropping, Reinforcement learning, Hyperparameter optimization, Crop yield

1. Introduction

Intercropping is a method of cultivating multiple crops simultaneously on a single field, aiming to increase yield by utilizing resources not typically used by a single crop (Abhishek, 2020 [1]). Planning is crucial, considering soil, climate, crops, and varieties, and avoiding competition for space, nutrients, water, or sunlight. Intercropping strategies, such as planting deep-rooted crops with shallow-rooted ones or tall crops with shorter ones, are proposed as an eco-friendly alternative to slash-and-burn farming. Planting two crops in close proximity can enhance their fitness and yield when they interact in a way that enhances their overall growth. The tropical multi-tier system, consisting of coconut at the top, banana at the middle, and pineapple, ginger, or leguminous fodder at the bottom, is an example. Intercropping requires both spatial and temporal overlap between crops, with various types varying in temporal and spatial mixtures (Harrell, 2023 [2]). Mixed intercropping is a fundamental technique where multiple crops are freely mixed within the available space.

1srikanth.mandela@giet.edu, 2mohanrnvj@gmail.com, 3srichandra2007@gmail.com

DOI: 10.1201/9781003529231-24
Row crops are crops planted in wide rows, suitable for tilling or cultivation using agricultural machinery, and sown through drilling or transplanting instead of broadcasting. Temporal intercropping is a method where a fast-growing crop is sown alongside a slow-growing crop, ensuring that the fast-growing crop is harvested before the slow-growing crop matures. Relay cropping involves sowing a second crop during the first crop's growth, allowing the first crop to be harvested first. Sequential cropping, by contrast, grows different crops in a sequence of seasons. Multicropping, also known as intercropping, is the practice of growing multiple crops simultaneously on the same land, doubling crop productivity and income; the choice of crops depends on their mutual benefits. Multiple cropping enhances crop yield through the use of advanced technologies such as improved seeds, fertilizers, and pesticides.

Intercropping Using Decision Tree Classification: A Decision Tree is a supervised learning technique used for classification and regression problems. Its tree-structured classifier consists of internal nodes representing dataset features, branches representing decision rules, and leaf nodes representing outcomes. Decision nodes make decisions, while leaf nodes represent outcomes, as proposed by R. N. V. Jagan Mohan, 2012 [3]. A decision tree is thus a flowchart-like structure in which each node represents a test on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label. Figure 24.1 gives an overview of the various types of multiple crops in agriculture.

Fig. 24.1 Types of multiple crops in agriculture

Attribute Selection Measures (ASM) are techniques used in Decision Tree implementation to select the best attribute for the root and sub-nodes; the most popular techniques are Information Gain and the Gini Index. Information gain measures the change in entropy after the dataset is segmented on an attribute; it quantifies how much information a feature provides about a class. Decision trees maximize information gain, splitting nodes on the attribute with the highest gain:

Information Gain = Entropy(S) – [(Weighted Avg) × Entropy(each feature)] (1)

Entropy is a metric that measures the impurity of an attribute and indicates randomness in the data:

Entropy(S) = –P(yes) log2 P(yes) – P(no) log2 P(no) (2)

where S = total number of samples, P(yes) = probability of yes, and P(no) = probability of no.

The Gini index measures impurity in the CART algorithm, favoring attributes with a low Gini index and ensuring binary splits in decision trees:

Gini Index = 1 – Σj Pj² (3)
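As a worked illustration of Eqs. (1)-(3), the short sketch below computes entropy, Gini index, and information gain for a toy binary crop-suitability label; the data and function names are our own, not from the paper.

```python
# Illustrative sketch (not the paper's code): the impurity measures of
# Eqs. (1)-(3) for a binary label such as "good yield" / "poor yield".
import numpy as np

def entropy(labels):
    # Entropy(S) = -P(yes)*log2 P(yes) - P(no)*log2 P(no)
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gini(labels):
    # Gini = 1 - sum_j P_j^2
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def information_gain(parent, subsets):
    # IG = Entropy(parent) - weighted average entropy of the subsets
    n = len(parent)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent) - weighted

# Toy split: does a candidate attribute separate the plots well?
parent = np.array([1, 1, 1, 0, 0, 1, 0, 1])
left, right = parent[:4], parent[4:]
print(entropy(parent), gini(parent), information_gain(parent, [left, right]))
```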
2. Multi-Crops Hyperparameters Optimize with Reinforcement Learning

Reinforcement Learning (RL) is utilized to optimize multi-crops hyperparameters: it is a method for optimizing multicrops in agriculture using machine learning processes, based on a Markov chain of improved model performance. A significant challenge is predicting multi-crops yield from profit/loss harvesting in order to reach the next state. The model is trained to predict the probability of the next crop case using the most recent hyperparameter values and their predictive performance. The model uses a multicrops harvest loss function to train on a discretized hyperparameter space, aiming to optimize future crop rewards by tracking past agricultural states H: r = M(H).

A Reinforcement Learning model R predicts a value q using H and r, with q = R(H, r). The optimal action maximizes q, and q can be predicted for past H and r using the formula q′ = R(past H, past r), where r and q represent future values. The model minimizes the mean square error (q′ – (r + g × max q))². The policy gradient is preferred for multicrops classification due to its ease of management and compatibility with high-dimensional hyperparameter spaces. To indicate a model's preference for certain hyperparameters, cross entropy is used to increase the probability of generating them: L = –1 × log P(next H | current H, current r). The policy gradient weighs the sample with the reward value, L = –(next reward) × log P(next H | current H, current r), with next reward = M(next H). The multicrops model is designed to optimize for hyperparameters with high profit rewards and minimal impact on the loss reward, each with its own multicrops classification option.
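As a hedged illustration of the policy-gradient idea above, the sketch below (entirely our own construction; the hyperparameter grid, reward function, and learning rate are invented for the example) samples settings H from a categorical policy and reinforces those with high reward r = M(H):

```python
# Minimal REINFORCE-style sketch: a categorical policy over a discretized
# hyperparameter grid H is nudged toward settings with high reward r = M(H).
import numpy as np

rng = np.random.default_rng(0)
grid = [(n_trees, depth) for n_trees in (50, 100, 200) for depth in (4, 8, 16)]
logits = np.zeros(len(grid))          # policy parameters over the grid

def reward(h):
    # Stand-in for M(H): e.g., cross-validated yield-prediction score.
    n_trees, depth = h
    return 0.7 + 0.001 * n_trees - 0.005 * abs(depth - 8) + rng.normal(0, 0.01)

lr = 0.5
for step in range(200):
    p = np.exp(logits - logits.max()); p /= p.sum()
    i = rng.choice(len(grid), p=p)    # sample the next H from the policy
    r = reward(grid[i])
    # L = -(reward) * log P(H): gradient ascent on the expected reward
    grad = -p; grad[i] += 1.0
    logits += lr * r * grad

print("preferred hyperparameters:", grid[int(np.argmax(logits))])
```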

3. Multi-Crops Optimizing Hyperparameters

This section discusses the significant role of grid search in model tuning.

The Problem: Finding the best multi-crops hyperparameters for an agriculture machine learning model can be challenging due to manual tuning and potentially missed configurations.

Where it Shows: Suboptimal multi-crops hyperparameters can cause underfitting or overfitting in a model, impacting its ability to accurately predict new crop data.

The Effect: Poor multi-crop hyperparameter choices can lead to ineffective models, resulting in missed insights and inaccurate predictions.

The Solution: Grid Search is an automated method that systematically explores various multi-crops hyperparameter combinations to find the best configuration (see the code sketch after Fig. 24.3).

Practical Steps:
1. Define the Multi-Crops Hyperparameter Grid: Specify the desired values and range of exploration, such as the number of trees and the maximum depth in a Random Forest model.
2. Set up Cross-Validation: K-fold cross-validation is a robust technique that divides multi-crops data into subsets for training and testing, preventing overfitting during hyperparameter tuning.
3. Run the Grid Search: Grid Search thoroughly tests all possible combinations of multi-crops hyperparameters, training and evaluating the model for each of them.
4. Select the Best Configuration: Grid Search provides the most suitable multi-crop hyperparameter set based on the chosen evaluation metric, such as accuracy, F1 score, or mean squared error.
5. Fine-Tune Further: The multi-crop hyperparameters can be refined further using techniques like Random Search or Bayesian Optimization for enhanced performance. Grid Search simplifies multi-crop hyperparameter tuning, allowing for more efficient model building and extraction of insights from multi-crops data.

Fig. 24.2 Optimizing multicrops hyperparameters

Optimizing a model: The procedure of selecting the best model for a specific training multi-crops dataset is referred to as "model selection." If there are features X and a target Y, the best transformation F can be determined from the data by Y = F(X). The word "optimal" denotes the existence of a model performance metric, and the "optimal" model is the one that maximizes that metric. It is important to consider a number of axes in order to improve the model:
1. The model parameter space: Use statistical learning to "train" a model and optimize this space. The parameters are learned using an optimization approach, such as the maximum likelihood estimation principle.
2. The model paradigm space: A variety of supervised learning algorithms can be employed to address the same issue. Depending on the particular dataset, algorithms like Naive Bayes, XGBoost, or a Neural Network may perform significantly differently.
3. The hyperparameter space: These choices must be made to set up the training run, even though statistical learning cannot improve these settings.
4. The model architecture space: This applies mostly to neural networks. A set of hyperparameters can be used to describe the model architecture, but the search is typically more involved than with ordinary hyperparameters; the size of the search space can reach 10^40.
5. The feature space: The proper features must be chosen to feed the model, and different models respond differently depending on the features used. With too many features the model may overfit; with too few it may underfit.
6. The feature transformation space: To enhance the performance of the model, consider transformations such as the Box-Cox transformation and feature encoding.

Fig. 24.3 Optimizing machine learning model: the different axes
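The short sketch below illustrates Practical Steps 1-4 with scikit-learn's GridSearchCV; it is our own hedged example rather than the paper's code, and the synthetic features stand in for real crop, soil, and climate data.

```python
# Grid search with k-fold cross-validation over a Random Forest yield model.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))             # stand-in crop/soil/climate features
y = X[:, 0] * 2 + rng.normal(size=120)    # stand-in yields

param_grid = {                            # Step 1: the hyperparameter grid
    "n_estimators": [50, 100, 200],
    "max_depth": [4, 8, 16],
}
search = GridSearchCV(                    # Steps 2-3: 5-fold CV over all combos
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)                          # Step 4: pick the best configuration
print(search.best_params_, search.best_score_)
```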
4. Experimental Result

To effectively predict crop yields in intercropping in agriculture, gather relevant data on crop types, soil types, climate conditions, temperature, rainfall, and yields. Prepare the collected data for machine learning by cleaning it, handling missing values, and encoding categorical variables. Identify the most relevant features and choose appropriate machine learning models. Split the dataset into training and testing sets, train the models, and evaluate their performance. Optimize the models by adjusting hyperparameters, perform cross-validation, and create visualizations. Once a reliable machine learning model is developed, deploy it as part of an agricultural decision support system to provide recommendations. Continuously collect data and update the models to improve reliability over time. Integrate the system into daily operations, provide training, and implement a monitoring system for tracking results and user feedback. The integration of machine learning and data-driven decision-making in agriculture can lead to a more reliable, sustainable, and efficient system, benefiting both farmers and the environment.
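A compact end-to-end sketch of this workflow is given below; it is illustrative only, and the file name and column names (e.g., intercropping_yields.csv, yield_t_per_ha) are hypothetical.

```python
# Illustrative end-to-end pipeline for the workflow described above.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("intercropping_yields.csv")          # hypothetical file
df = df.dropna()                                      # handle missing values
X = pd.get_dummies(df[["crop_a", "crop_b", "soil_type",
                       "rainfall_mm", "temperature_c"]])  # encode categoricals
y = df["yield_t_per_ha"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=200, max_depth=8, random_state=0)
model.fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))
```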
5. Conclusion

This study has shown that reinforcement learning can be used to effectively optimize multi-crops hyperparameters in agriculture. The proposed approach was able to significantly improve crop yields compared to traditional methods of multi-cropping. The significance of this research is that it provides a new and promising approach to using machine learning to improve the efficiency and productivity of agriculture. The proposed approach is particularly well-suited for multi-cropping, which is a complex problem that involves a large number of factors. The proposed approach is also scalable and can be applied to a wide range of crops and growing conditions. This means that it has the potential to make a significant impact on the global food system. In the future, we plan to further evaluate the proposed approach on a larger scale and to develop new reinforcement learning algorithms that are specifically designed for multi-crops optimization.
We also plan to work with farmers to implement the proposed approach in real-world settings.

References
1. Abhishek, Aditya: Multiple Cropping - Definition, Benefits and Selection of Crops, Agriculture Review, retrieved 2020-12-14, 2020.
2. Harrell, Stevan: An Ecological History of Modern China, Seattle: University of Washington Press, ISBN 978-0-295-75171-9, 2023.
3. R. N. V. Jagan Mohan, R. Subbarao and Raja Sekhara Rao K.: Efficient K-Means Cluster Reliability on Ternary Face Recognition using Angle Oriented Approach, in Proceedings of the International Conference on Advances in Communication, Navigation and Signal Processing, technically co-sponsored by IEEE Hyderabad Section, Dept. of ECE, Andhra University College of Engineering (A), March 17-18, 2012.
4. X. Chen et al.: Optimizing Crop Planning with Reinforcement Learning, in International Conference on Machine Learning (ICML), 2017.
5. R. Singh and B. R. Borah: Agricultural Data Analysis Using Machine Learning: A Review, in 2017 2nd International Conference on Computational Systems and Information Technology for Sustainable Solutions (CSITSS), 2017.
6. S. Wang et al.: Agricultural Production Prediction Using Machine Learning Algorithms, in 2019 IEEE International Conference on Big Data (Big Data), 2019.
7. A. Sinha and N. Kumar: Machine Learning Applications in Agriculture: An Overview, Journal of Crop Science and Biotechnology, 2019.
8. R. Salim et al.: Application of Machine Learning Techniques in Agriculture: A Review, Information Processing in Agriculture, 2020.
9. H. Nguyen and S. Shekhar: Reinforcement Learning for Precision Agriculture: A Review, Computers and Electronics in Agriculture, 2019.
10. B. Basso et al.: Crop Yield Prediction: A Machine Learning Approach, Computers and Electronics in Agriculture, 2017.
11. A. Singh et al.: Machine Learning Techniques for Agriculture Crop Yield Prediction, in 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT), 2018.
12. S. Patel et al.: Machine Learning-Based Techniques for Agriculture and Crop Yield Prediction: A Review, in 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU), 2019.
13. A. Singh et al.: Intelligent Agriculture: A Survey on the Impact of Artificial Intelligence on Agriculture, in 2018 International Conference on Computing, Power and Communication Technologies (GUCON), 2018.
14. Kumar, J. M. S. V., et al.: System Testability Assessment and Testing with Micro Architectures, International Journal of Advanced Research in Computer Science, 2.6, 2011.
15. Kumar, J. M. S. V., et al.: Reverse Engineering: A Generic Software Exploration Environment Is Made of Object-Oriented Framework and Set of Customizable Tools, International Journal of Advanced Research in Computer Science, 2.5, 2011.
16. Kumar, J. M. S. V., et al.: Analyzing the Modern Tool-Supported UML-Based Static Reverse Engineering, International Journal of Advanced Research in Computer Science, 3.4, 2012.
17. Kumar, J. M. S. V., et al.: Active Scrutiny Techniques for the Reconstruction of Architectural Views, International Journal of Advanced Research in Computer Science, 3.1, 2012.
18. N. Santha Raju, J. M. S. V. Kumar and B. Sujatha: Time Series Analysis of Stock Price Movements: Insights from Data Mining Using Machine Learning, AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
19. Prayaga Atchyut Pavan, Sattibabu Sattibabu and J. M. S. V. Kumar: A Deep Learning Approach to Detect Malaria, AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
20. Ch. Bhanu Revathi, J. M. S. V. Kumar and B. Sujatha: Intracranial Hemorrhage Detection in Human Brain Using Deep Learning, AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
21. J. M. S. V. Ravi Kumar: Human Activity Recognition Using Machine Learning, AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
22. J. Kumar, A. Shahi, R. Aytha, G. Varri and D. Brundavanam: Vehicle Theft Prevention System Using IoT, AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
23. J. Kumar, T. D. Nagendra, M. Harshitha and A. B. Prakash: Fake Image Detection Using CNN, AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
24. J. Kumar, M. N. Kumar, N. V. Narendra and P. Pradeep: Driver Drowsiness Monitoring System Using Machine Learning SVM Algorithm, AIP Conference Proceedings, vol. 2492, issue 1, AIP Publishing, 2023.
25. J. M. S. V. Ravi Kumar: A Symmetric Searchable Encryption Identification of Data on Probabilistic Trapdoors, International Journal of Engineering and Advanced Technology (IJEAT), ISSN 2249-8958, vol. 9, issue 3, Blue Eyes Intelligence Engineering & Sciences Publication, 2020.
26. J. M. S. V. Ravi Kumar: Artificial Bee Colony Algorithm: A Survey and Recent Applications, International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
27. J. M. S. V. Ravi Kumar: Authentication for Cloud Services Using Steganography, International Journal of Engineering and Technology (UAE) - IJET, ISSN 2227-524X, vol. 7, issue 3.49, Jul. 2018.
28. J. M. S. V. Ravi Kumar: A Review on Task Scheduling Algorithms in Cloud Computing and Their Approaches, International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
29. J. M. S. V. Ravi Kumar: Review of Data Mining Techniques Using SaaS on the Cloud, International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
30. J. M. S. V. Ravi Kumar: Smart Controlling, Monitoring and Automation of Street Light System Using Raspberry Pi, International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
31. J. M. S. V. Ravi Kumar: A Survey on Internet of Things for Healthcare and Medication Management, International Journal of Pure and Applied Mathematics, ISSN 1314-3395, vol. 118, issue 24, Jul. 2018.
32. J. M. S. V. Ravi Kumar: SECRBAC: Secure Data in the Clouds, International Journal of Research, ISSN 2348-6848, vol. 5, issue 15, Jul. 2018.
33. J. M. S. V. Ravi Kumar: EBPH MAC: Emergency Based Priority Hybrid Medium Access Control for Mobility Aware Cooperative WSNs in Indoor Industrial Monitoring, International Journal of Research, ISSN 2348-6848, vol. 5, issue 12, Jul. 2018.
34. J. M. S. V. Ravi Kumar: Prioritizing Software Components for Realistic Reuse, International Journal of Sciences & Applied Research, ISSN 2394-2401, vol. 4, issue 24, Jul. 2017.
35. J. M. S. V. Ravi Kumar: Cloud Storage Services and Privacy Protection, International Conference on Research Advancements in Computer Science and Communication, ISBN 978-93-85100-64-2, vol. 5, issue 3.49, Dec. 2016.
36. J. M. S. V. Ravi Kumar: Analyzing the Modern Tool-Supported UML-Based Static Reverse Engineering, International Journal of Advanced Scientific Research and Technology, ISSN 0976-5697, vol. 3, issue 4, Jul. 2012.
37. J. M. S. V. Ravi Kumar: Active Scrutiny Techniques for the Reconstruction of Architectural Views, International Journal of Advanced Scientific Research and Technology, ISSN 0976-5697, vol. 3, issue 1, Jan. 2012.
38. J. M. S. V. Ravi Kumar: System Testability Assessment and Testing with Micro Architectures, International Journal of Advanced Scientific Research and Technology, ISSN 0976-5697, vol. 2, issue 6, Dec. 2011.
39. J. M. S. V. Ravi Kumar: Reverse Engineering: A Generic Software Exploration Environment Is Made of Object-Oriented Framework and Set of Customizable Tools, International Journal of Advanced Scientific Research and Technology, ISSN 0976-5697, vol. 2, issue 5, Sep. 2011.
40. M. Srikanth: Integrated Technologies for Proactive Bridge-Related Suicide Prevention, Journal of Namibian Studies, vol. 1, issue 33, pp. 2117-2136, ISSN 1863-5954, Sep. 2023. [Scopus]
41. M. Srikanth: Deep Learning Approaches for Predictive Modeling and Optimization of Metabolic Fluxes in Engineered Microorganisms, International Journal of Research in Science & Engineering (IJRISE), ISSN 2394-8299, 3(05), pp. 1-11, doi: 10.55529/ijrise.35.1.11, Jul. 2023.
42. M. Srikanth: Tackling Outliers for Predictive Smallholder Farming Analysis, in Proceedings of the 2023 3rd International Conference on Smart Data Intelligence (ICSMDI), pp. 93-98, IEEE Xplore, March 26, 2023. [Scopus]
43. M. Srikanth: Blockchain-Based Consensus for a Secure Smart Agriculture Supply Chain, European Chemical Bulletin, vol. 12, special issue 4, pp. 8669-8678, doi: 10.48047/ecb/2023.12.si4.776, ISSN 2063-5346, 2023. [Scopus]
44. M. Srikanth: Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms, Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 3, issue 03, pp. 14-26, ISSN 2799-1172, Apr. 2023.
45. M. Srikanth, R. N. V. Jagan Mohan and M. Chandra Naik: A New Way to Improve Crop Quality and Protect the Supply Chain is to Use a Trajectory Network and Game Theory, Mathematical Statistician and Engineering Applications, 71(4), pp. 10600-10610, doi: 10.17762/msea.v71i4.1952, ISSN 2094-0343, 2023. [Scopus]
46. M. Srikanth: Auction Algorithm: Peer-to-Peer System Based on Hybrid Technologies for Smallholder Farmers to Control Demand and Supply, International Journal of Research in Science & Engineering (IJRISE), vol. 3, issue 1, pp. 9-23, 2023.
47. M. Srikanth: Smallholder Farmers Crop Registering Privacy-Preserving Query Processing over Ethereum Blockchain, Journal of Pharmaceutical Negative Results, vol. 13, issue 7, pp. 5609-5617, Dec. 2022. [Scopus]
48. M. Srikanth: The Early Detection of Alzheimer's Illness Using Machine Learning and Deep Learning Algorithms, Journal of Pharmaceutical Negative Results, vol. 13, issue 9, pp. 4852-4859, Nov. 2022. [Scopus]
49. M. Srikanth: Small Holders Farming Predictive Analysis Using Peer-to-Peer Approach, International Journal of Agriculture and Animal Production, vol. 2, issue 05, pp. 26-37, Sep. 2022.
50. M. Srikanth: Using Machine Learning and Neural Network Technologies, a Bottom-Up Water Process Is Being Used to Reduce All Water Pollution Diseases, Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 2, Oct. 2022.
51. M. Srikanth: Blockchain Enable for Smallholder's Farmers Crop Transaction Using Peer-to-Peer, Indo-American Journal of Agricultural and Veterinary Sciences, vol. 10, issue 3, pp. 33-43, Sep. 2022.
52. M. Srikanth: Protecting Tribal Peoples Nearby Patient Care Centres Using a Hybrid Technique Based on a Distribution Network, International Journal of Health Sciences, Jun. 2022. [Scopus]
53. M. Srikanth: Blockchain-Based Crop Farming Application Using Peer-to-Peer, Journal of Xidian University, Apr. 2022.
54. M. Srikanth: Stop Spread Corona Based on Voice, Face and Emotional Recognition Using Machine Learning, Query Optimization and Blockchain Technology, Solid State Technology, vol. 63, no. 6, 2020. [Scopus]
55. M. Srikanth: Machine Learning for Query Processing System and Query Response Time Using Hadoop, IJMTST, Aug. 2020.
56. M. Srikanth: Block-Level Based Query Data Access Service Availability for Query Process System, IEEE, pp. 1-9, Jul. 2020. [Scopus]
57. M. Srikanth: Query Response Time in Blockchain Using Big Query Optimization, in The Role of IoT and Blockchain: Techniques and Applications from Computer Science and Information Management, Apple Academic Press, exclusive worldwide distribution by CRC Press, Taylor & Francis Group, Jan. 2022. [Scopus]
58. M. Srikanth: A New Approach for Authorship Verification Using Information Retrieval Features, Springer-ICSE, vol. 74, pp. 23-29. [Scopus]
59. M. Srikanth: An Enhanced and Naive Clustering Algorithm for Text Classification Based on Weight, International Journal & Magazine of Engineering, Technology, Management and Research, Dec. 2012.

25. Retrieval Augmented Generation Classification Algorithm for Fake News Detection

Ravisankar Malladi1
Dept. of Computer Science and Engineering, Koneru Lakshmaiah Education, Vaddeswaram, Guntur
V. T. Ram Pavankumar2
Dept. of Master of Computer Applications, K. B. N. College, Vijayawada
M. Arulselvi3
Dept. of Computer Science and Engineering, Annamalai University, Tamilnadu
Konatham Sumalatha4
Dept. of DBS, Vellore Institute of Technology, Vellore, Tamilnadu

Abstract: Fake news is false information that is reported as news and is frequently intended to harm reputations or make money. The term has been used to refer to all types of misleading information since it was first introduced in the 1890s, and such content is frequently produced by adversarial foreign actors. Due to the variety of fake news kinds, experts increasingly prefer the term "information disorder" as a neutral and descriptive label. The study addresses information overload and filtering issues on social media by introducing a new approach to assessing news reliability. It uses a Cosine Similarity based Retriever Augmented Generation (RAG) technique and a deep learning classification algorithm to classify news as fake or real, with the best-performing feature predicting its authenticity. The system achieved 91% accuracy in testing.
Keywords: Cosine similarity, Deep learning, Fake news, Genuine news, Retriever augmented generation

1. Introduction

With the recent growth of social media, particularly the Facebook News Feed, the incidence of fake news has skyrocketed, and this false information is progressively making its way into the mainstream media. A number of elements, including political polarization, post-truth politics, motivated reasoning, confirmation bias, and social media algorithms, have been linked to the propagation of fake news. Fake news, or misleading information, is a prevalent issue in the digital age [1], spreading false ideas quickly and easily. It can damage an individual's or an organization's reputation, and it poses a significant threat in the digital age.

The machine learning component of artificial intelligence is crucial for systems [2] that learn and perform tasks. Various algorithms, including supervised, unsupervised, and reinforcement learning, are used across industries. This project proposes using Naive Bayes, Logistic Regression, Random Forest, SVM, and KNN to identify fake news [3]. Earlier works on these lines by various authors are as follows; this section discusses the background work for proving the performance of our proposed method, focusing on the literature survey, which is crucial in software development.

1mravisankar@kluniversity.in, 2mrpphd2018@gmail.com, 3marulcse.au@gmail.com, 4konatham.sumalatha@vit.ac.in

DOI: 10.1201/9781003529231-25
Yafra Khan and Chai Soo See's article "Predicting and assessing bogus news" highlights social media as a powerful medium for self-expression, allowing discussions on identity, society, religion, and customs. Social media significantly influences daily lives and society, allowing individuals to share newsworthy information and stay informed about global events [4-8]. K. Corradini et al.'s paper combines machine learning and knowledge engineering to detect bogus news on social networks. Fake news is a growing concern due to its rapid dissemination on social and news media, making it crucial to detect and prevent the spread of false information [9]. The article by Conroy, Rubin, and Chen proposes automatic deception detection methods for fake news, emphasizing the importance of modern technologies in identifying and classifying news based on veracity and certainty. Yafra Khan and Chai Soo See [10] proposed a method for predicting and analyzing fake news on social media using a religion and politics dataset and ML algorithms, aiming to prevent false information from spreading.

2. Related Work

Current systems struggle to identify fake news on social media, leading to rapid propagation of false information and unreliable papers [11]. This has resulted in many honest users drawing incorrect conclusions due to the limitations of the current system. The existing system has the following limits: it lacks a classification algorithm capable of automatically detecting fake news items in published content; it is time-consuming, since distinguishing between bogus and legitimate news requires more time; and false information is a significant waste of storage space.

3. Proposed Methodology

Re-using passwords on multiple websites causes serious security problems. If a hacker can log into one account using stolen login credentials, they can access all accounts that use that password [12]. Not only individuals are subject to the threat: when employees use the same password at work and at home, the security of the entire company is at risk. We avoided using the same passwords for two or more accounts on the same or other websites in an effort to lessen shadow attacks, which rely on password reuse [10]. The benefits of the proposed system include the following: the suggested classification technique can be made less temporally complex; with the RAG technique, similarities between legitimate news and false news are straightforward to classify; and any dataset gathered from the real world can be used. This makes the approach less time-consuming and delivers the highest accuracy in comparison to the present system.

Fig. 25.1 The system architecture

Figure 25.1 outlines several necessary steps to achieve the current scope. Here, we take an unstructured news dataset as input and then extract text data from it. As soon as the text data is gathered, text pre-processing is applied to the resulting file. After extracting the key features, we apply classifiers to the data. The effectiveness of machine learning classification algorithms in determining news authenticity can be assessed by applying various methods. For example, if a fake news question requires scanning all documents, similarity searches may not be helpful. Misinformation, also known as fake news or hoaxes, is the accidental sharing of false information without intent to cause harm, while disinformation is deliberately shared to mislead and cause harm. For instance, Russia's invasion of Ukraine in February 2022 led to a massive disinformation campaign, with NewsGuard identifying 311 websites publishing pro-Russian propaganda to justify Moscow's aggression.

3.1 Fake News Detection Using Cosine Similarity

The cosine similarity index calculates how similar two vectors in an inner product space are to one another. It establishes whether two vectors are roughly pointing in the same direction by calculating the cosine of the angle between them. In text analysis, it is frequently used to gauge document similarity [13]. For instance, the cosine similarity between two proportional vectors is 1, between two orthogonal vectors it is 0, and between two opposite vectors it is -1. In some situations, the vectors' component values cannot be negative, in which case the cosine similarity is constrained to the range [0, 1].
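To make Section 3.1 concrete, the hedged sketch below computes cosine similarity between TF-IDF vectors of a query and candidate articles; the texts are toy placeholders of ours, not the authors' pipeline.

```python
# Cosine similarity over TF-IDF vectors, as in Section 3.1.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "government confirms new vaccination schedule",   # trusted article
    "aliens endorse miracle cure, officials silent",  # dubious article
]
query = "officials announce updated vaccination schedule"

vec = TfidfVectorizer()
m = vec.fit_transform(docs + [query]).toarray()

def cosine(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||); TF-IDF entries are
    # non-negative, so the score falls in [0, 1].
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

for doc, row in zip(docs, m[:-1]):
    print(f"{cosine(row, m[-1]):.3f}", doc)
```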
3.2 Fake News Identification Using Retriever Augmented Generation (RAG)

Retrieval Augmented Generation (RAG), a technology that is constantly advancing in the field of artificial intelligence, is creating waves. This novel method combines factual data retrieval with the strength of large language models; we will go into the nuances of RAG [14]. The use of a fake news database in augmenting LLMs is a valuable method, but it has several significant flaws. The debate between fine-tuning and Retriever Augmented Generation (RAG) with LLMs is ongoing, with RAG being better for enhancing LLMs with small amounts of additional data. RAG encodes data into embeddings and indexes it into a vector fake news database. Users ask fake news questions, which are converted into embeddings and used to search for similar embeddings. Prompts provide context for LLM answers, usually using the cosine similarity metric. The problem lies in the search's tendency to retrieve documents with similar words or context without providing relevant information, leading to an excess of irrelevant documents showing higher cosine similarity than the actual answer. High cosine similarity in Transformers does not necessarily imply semantic similarity; it can also indicate the high co-occurrence of two terms within the same training data. The data's indexing can cause issues if it is broken down into large chunks, potentially containing unrelated information. To avoid diluted information and irrelevant documents, break down the data into a few paragraphs per chunk, ensuring uniqueness. The RAG approach emphasizes limiting the type of questions asked of the LLM. Aggregating data across the database may lead to incorrect answers, while similarity searches may find only local information.

4. Experimental Result and Discussion

This work utilizes Python as the programming language and Google Colab as the working platform for developing and executing the fake news application. The following two steps are carried out.

4.1 Load the Dataset and Categorize

To detect fake news in Python, preprocess the input text, obtain numerical features, and train machine learning models like RAG to predict news reliability.

Fig. 25.2 Fake news with RAG
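The following minimal sketch (our own, with a hypothetical embed() placeholder rather than a real sentence encoder) illustrates the retrieve-then-answer loop of Fig. 25.2, where question embeddings are matched against indexed chunks before being passed to an LLM.

```python
# Hedged sketch of embedding-based retrieval over a small chunk index.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder embedding: a real system would call a sentence-encoder
    # model here. A hashing trick keeps the sketch self-contained.
    v = np.zeros(64)
    for tok in text.lower().split():
        v[hash(tok) % 64] += 1.0
    return v / (np.linalg.norm(v) + 1e-12)

index = [(chunk, embed(chunk)) for chunk in [
    "Reuters: officials confirmed the policy on Monday.",
    "Blog post: secret cabal controls all weather satellites.",
]]

def retrieve(question: str, k: int = 1):
    q = embed(question)
    scored = sorted(index, key=lambda it: -float(it[1] @ q))
    return [chunk for chunk, _ in scored[:k]]   # context passed to the LLM

print(retrieve("did officials confirm the policy?"))
```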


4.2 Apply RAG Classification Algorithms

Machine learning is a subset of AI that uses algorithms to analyze vast amounts of data holistically and recognize important information boundaries. The dataset is loaded, revealing the number of unique categories, and the RAG classification algorithm is applied. Cosine similarity outperforms the other machine learning algorithms in ML classification, making it the optimal choice for identifying fake news from social media. Measurement is essential for comprehending the outside world, but it also brings inaccuracy, or uncertainty. When collecting measurements, accuracy is an important feature to take into account because it indicates how closely a measurement resembles a known or accepted value. As an indicator of the closeness of a series of observations, accuracy is a measure of observational error.

Accuracy: Accuracy is a statistical measure of how well a binary classification test identifies or excludes a condition, often referred to as the "Rand accuracy" or "Rand index." It compares pre- and post-test probability estimates and is a test parameter. Based on 91 accurate predictions out of 100 cases, the RAG algorithm classified 100 pieces of news with a 91% accuracy rate. However, only one of the nine instances of bogus news is correctly identified by the model, leaving 8 out of 9 undetected. This implies that the model is less efficient than one that consistently makes good predictions. When working with a class-imbalanced data set where the discrepancy between positive and negative labels is substantial, accuracy alone is insufficient.

Accuracy = (TP + TN) / (TP + TN + FP + FN) (1)

where TP = True positive, FP = False positive, TN = True negative, and FN = False negative.

Accuracy = (1 + 90) / (1 + 90 + 1 + 8) = 0.91
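Equation (1) can be checked directly with the confusion-matrix counts reported above (TP = 1, TN = 90, FP = 1, FN = 8):

```python
# Quick check of Eq. (1) with the reported confusion-matrix counts.
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    return (tp + tn) / (tp + tn + fp + fn)

print(accuracy(tp=1, tn=90, fp=1, fn=8))   # 0.91
```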
5. Conclusion

An approach to detect misleading information is presented: the RAG technique divides user input into true and false categories as a defense against false information. Combining the advantages of classification algorithms for discernment, generation models for content synthesis, and retriever models for information retrieval, this novel approach provides a solid foundation for identifying and halting the spread of false information. Promising results from extensive testing and review suggest that it can effectively identify deceptive material. This algorithm offers a ray of hope in the face of the ongoing threat of disinformation, opening the door for more advanced and trustworthy techniques to protect the accuracy of information distribution. Even though there are still difficulties, the development and improvement of such sophisticated algorithms represent a critical advancement in strengthening our defenses against the damaging impacts of fake news, eventually supporting the basis of a society that is better informed and more resilient.

References
1. A. Douglas: News consumption and the new electronic media, The International Journal of Press/Politics, vol. 11, no. 1, pp. 29-52, 2006.
2. J. Wong: Almost all the traffic to fake news sites is from Facebook, new data show, 2016.
3. M. J. Lazer, M. A. Baum, Y. Benkler et al.: The science of fake news, Science, vol. 359, no. 6380, pp. 1094-1096, 2018.
4. S. A. García, G. G. García, M. S. Prieto, A. J. M. Guerrero and C. R. Jimenez: The impact of the term fake news on the scientific community: scientific performance and mapping in Web of Science, Social Sciences, vol. 9, no. 5, 2020.
5. Holan: 2016 Lie of the Year: Fake News, Politifact, Washington, DC, USA, 2016.
6. Robb: Anatomy of a fake news scandal, Rolling Stone, vol. 1301, pp. 28-33, 2017.
7. J. Soll: The long and brutal history of fake news, Politico Magazine, vol. 18, no. 12, 2016.
8. J. Hua and R. Shaw: Corona virus (COVID-19) "infodemic" and emerging issues through a data lens: the case of China, International Journal of Environmental Research and Public Health, vol. 17, no. 7, p. 2309, 2020.
9. N. K. Conroy, V. L. Rubin and Y. Chen: Automatic deception detection: methods for finding fake news, Proceedings of the Association for Information Science and Technology, vol. 52, no. 1, pp. 1-4, 2015.
10. F. T. Asr and M. Taboada: Misinfotext: a collection of news articles, with false and true labels, 2019.
11. Shu, A. Sliva, S. Wang, J. Tang and H. Liu: Fake news detection on social media, ACM SIGKDD Explorations Newsletter, vol. 19, no. 1, pp. 22-36, 2017.
12. S. Vosoughi, D. Roy and S. Aral: The spread of true and false news online, Science, vol. 359, no. 6380, pp. 1146-1151, 2018.
13. Schubert, Erich; Lang, Andreas; Feher, Gloria: Accelerating Spherical k-Means, in Reyes, Nora; Connor, Richard; Kriege, Nils; Kazempour, Daniyal; Bartolini, Ilaria; Schubert, Erich; Chen, Jian-Jia (eds.), Similarity Search and Applications, Lecture Notes in Computer Science 13058, pp. 217-231, Cham: Springer International Publishing, arXiv:2107.04074, doi: 10.1007/978-3-030-89657-7_17, ISBN 978-3-030-89657-7, S2CID 235790358.
14. Soumyadarshan Dash: Retrieval Augmented Generation (RAG), published on September 28, 2023, last modified on September 29, 2023.

Note: All the figures in this chapter were designed by the author.

26. Predictive AI Treatment for Kidney Tumors with Privacy Protection

K. V. Nageswari1
Research Scholar, GIET University, Gunupur, Odisha State
R. N. V. Jagan Mohan2
Associate Professor, SRKR Engineering College, Bhimavaram
Bhramara Bar Biswal3
Associate Professor, GIET University, Gunupur, Odisha State

Abstract: Urologic cancers, particularly renal pelvis cancer, affect the urinary system and the male reproductive system. Clinical trials are underway to develop personalized, evidence-based treatments for advanced urologic cancer. Personalized urologic cancer treatment includes chemotherapy, radiation therapy, surgery, and supportive care. Patients collaborate with doctors to select the optimal plan, which includes behavioral medicine, nutrition, pain management, and social support. AI integration in urologic cancer detection enhances diagnosis, early detection, and personalized treatment, potentially improving patient survival rates and overall well-being. The study explores the use of predictive AI treatment for kidney tumors while ensuring privacy protection. It aims to select a guideline-directed therapy for advanced kidney cancer using patient characteristics and clinical data, and to implement a collaborative approach using machine learning.
Keywords: Artificial intelligence, Clinical trial, Reinforcement learning, Renal (kidney) urologic cancer

1. Introduction

Urologic cancers affect the organs and structures of the male and female urinary system and the male reproductive system [1]. These cancers are fairly common. Renal (kidney) cancer forms in the small tubes that clean the blood in the kidneys [2]. Renal pelvis cancer is a rare form of this disease. Cancer is the result of fast and abnormal cell growth. Clinical trials are being conducted to develop personalized, evidence-based treatments for advanced urologic cancer [3]. The treatment plan for urologic cancer is influenced by the type and stage of the cancer, diagnostic test results, and overall health [4]. The patient collaborates with their doctor to select the optimal treatment plan, which may involve chemotherapy, radiation therapy, surgery, and supportive care [5]. The hospital provides personalized urologic cancer treatment and support services, including behavioral medicine, nutrition, pain management, palliative care, social support, and interpreter services, to help patients manage treatment, improve quality of life, and manage practical aspects. AI has significantly improved cancer detection and treatment by leveraging computer science, machine learning, and deep learning principles, revolutionizing the field [14]. Computer science forms the foundation of AI algorithms, processing and analyzing vast datasets like medical images and patient records. Machine learning techniques, including deep learning, create predictive models, identifying patterns and anomalies [10]. AI integration in urologic cancer detection improves diagnosis, early detection, and personalized treatment, potentially leading to better survival rates and overall well-being for patients. Machine Learning (ML) in AI uses massive data collection to build predictive models.

1ranikondaveti2011@gmail.com, 2mohan.rnvj@srkrec.edu.in, 3bhramarabarbiswal@giet.edu

DOI: 10.1201/9781003529231-26
However, such models can also pose privacy threats. Privacy-preserving machine learning aims to balance privacy with ML benefits, ensuring data protection rules and privatizing data acquisition [11]. Urologic malignancies, affecting both male and female urinary systems and reproductive organs, are common [7]. Treatment for bladder and renal cell carcinoma has improved with innovative checkpoint inhibitors, targeted therapy combinations, oral tyrosine kinase inhibitors, and antibody-drug conjugates [8]. Machine learning is increasingly used in treating urologic cancer, which emphasizes the need for effective coordination and communication between patients and specialists [12]. A CT scan utilizes X-rays and a computer to create three-dimensional images of the kidneys to detect urologic cancer invasion or spread to other organs or lymph nodes [9].

2. Literature Review

Earlier works on these lines by various authors are shown in Table 26.1.

Table 26.1 Authors' published works

Authors | Model Used | Discussion
DONG LIU | Recurrence warning system integrated with the Internet of Things | They collected the data of more than 700 renal cancer patients and analyzed seven indicators of renal cancer from seven aspects: tumor module, basic module, microenvironment module, immune module, nutrition module, psychological module, and exercise module. They constructed five learning algorithms for renal cancer recurrence prediction models to predict the time of renal cancer recurrence. Through model evaluation and comparison, it was found that the prediction accuracy of the convolutional neural network is 92.35%, which is significantly higher than the other models, and its stability is higher.
PEDRO A. MORENO-SANCHEZ | Clinical prediction model | Their work presents the development and evaluation of an explainable prediction model for CKD early diagnosis. The main goal is to show how XAI contributes to improving prediction models used in the medical field.
MD. RASHED-AL-MAHFUZ | XAI, Machine learning (ML) | The primary objective of this study was to identify important clinical test attributes, not only to enable efficient computer-aided CKD screening but also to help reduce the costs of CKD diagnosis. Results obtained using their framework indicate that the ML models showed better CKD and non-CKD classification with a considerably reduced number of attributes, 13 out of the 24 that were employed.
BILAL KHAN | ML techniques | They mainly focus on empirical comparisons of seven ML algorithms: NB, LR, MLP, J48, SVM, NBTree and CHIRP. The CKD prophecy results of the experiments show better performance for CHIRP on average using different evaluation metrics. This study prescribes CHIRP as the best technique that can be utilized by practitioners so as to eradicate diagnostic and treatment errors.
GUOZHEN CHEN | CNN | They presented the Adaptive Hybridized Deep Convolutional Neural Network (AHDCNN) for the early prediction and diagnosis of Chronic Kidney Disease (CKD). A deep learning system is used for identifying the distinctive subtypes of lesions from CT images in renal cancer.

3. Proposed Work

The proposed work aims, at the end of the activity, to select a guideline-directed therapy for advanced kidney cancer based on patient characteristics and clinical data; to implement a collaborative approach to manage the side effects of advanced cancer treatment with patients; to compare and select the most suitable guideline-directed therapy for an advanced treatment plan based on patient characteristics and clinical data; and to implement a joint approach with patients to effectively manage the side effects of treatment.

Renal (Kidney) Treatment Hyperparameter Optimization with Reinforcement Learning (RL): Kidney-cancer urologic hyperparameters control the machine-learning process. Reinforcement Learning (RL) is a challenging framework for hyperparameter optimization, following a Markov chain of improved model performance. RL predicts treatment plan actions based on a series of "Cure/Abnormal State" transitions to reach the next state. The model is trained to predict the probability of the next urologic renal (kidney) hyperparameter, using the up-to-date hyperparameter values and predictive performance. The renal urologic treatment plan model trains on discretized hyperparameters, such as the behavioral medicine, nutrition, pain management, and palliative care space, optimizing future treatment plans using predictive performance [15]:

H: r = M(H) (1)

A Reinforcement Learning model R predicts a value q using H and r, with q = R(H, r). The optimal action maximizes q, and q can be predicted for past treatment H and r using the formula q′ = R(past H, past r), where r and q represent future treatment plan values. The model minimizes the mean square error (q′ – (r + g × max q))². To indicate a treatment plan model's preference for certain hyperparameters, cross entropy is used to increase the probability of generating them: L = –1 × log P(next H | current H, current r). The policy gradient weighs the sample with the reward value, L = –(next action plan) × log P(next H | current H, current r), with next action plan = M(next H). The model optimizes for hyperparameters with a prominent treatment plan and minimal impact on the optimized hyperparameters, each with its own classification option. The policy gradient is preferred for treatment plan classification due to its ease of management and compatibility with the dimensionality of the treatment plan hyperparameter space [15].
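As a hedged illustration of this value-update rule, the sketch below (entirely our own; the toy environment stands in for treatment-plan states and rewards) regresses a tabular q toward the bootstrapped target r + g × max q:

```python
# Q-learning sketch: minimize (q' - (r + g * max q))^2 over a toy MDP.
import numpy as np

n_states, n_actions, g = 4, 3, 0.9            # g is the discount factor
Q = np.zeros((n_states, n_actions))

def step(state, action):
    # Stand-in environment: reward M(H) for a treatment-plan choice and
    # the resulting "cure/abnormal" state transition (toy dynamics).
    reward = 1.0 if action == state % n_actions else -0.1
    return reward, (state + 1) % n_states

rng = np.random.default_rng(0)
alpha = 0.1
for episode in range(500):
    s = rng.integers(n_states)
    for _ in range(10):
        a = rng.integers(n_actions) if rng.random() < 0.2 else int(np.argmax(Q[s]))
        r, s2 = step(s, a)
        target = r + g * np.max(Q[s2])         # r + g * max q
        Q[s, a] += alpha * (target - Q[s, a])  # squared-error gradient step
        s = s2

print(np.argmax(Q, axis=1))   # preferred plan per state
```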
A clinical trial involving 70 participants with kidney tumors found no recurrence or cancer-related deaths during the treatment. At 5 years post-treatment, only 15 had evidence of cancer. Most patients had decreased kidney function, but only with limited further reductions. The trial results were already beating surgical numbers for those with tumors 9 centimeters or less.

Fig. 26.1 Treatment study of the clinical trial process of renal urologic cancer
The average age was 77, with many participants being clinically obese and having other medical conditions. Participants were treated at Bollineni hospital, Rajmundry, ensuring effective radiation doses and quality treatment. Localized kidney cancer is classified into three stages, T1a (up to 4 cm), T1b (4-7 cm), and T2a (7-10 cm), to aid oncologists in determining treatment options.
Fig. 26.2 T2a: 9 cm size of kidney cancer test image

Table 26.2 Different patients' sizes of kidney cancer tumor

S. No. | Age | T1a (up to 4 cm) | T1b (4-7 cm) | T2a (7-10 cm)
1 | 32 | 3.2 | 0 | 0
2 | 29 | 0 | 7 | 0
3 | 35 | 0 | 6.5 | 0
4 | 33 | 3.5 | 0 | 0
5 | 36 | 4 | 0 | 0
6 | 38 | 0 | 0 | 9
7 | 41 | 0 | 0 | 8.2
8 | 31 | 0 | 6.5 | 0
9 | 32 | 0 | 5.5 | 7.3
10 | 30 | 0 | 0 | 8.2
11 | 28 | 0 | 0 | 8.5
12 | 34 | 0 | 0 | 9
13 | 37 | 0 | 6.4 | 0
14 | 27 | 4 | 0 | 0
15 | 42 | 0 | 6.5 | 0

4. Multiple Linear Regressions

Multiple regression is a statistical technique that, in renal urologic cancer treatment, predicts a response variable's outcome by combining multiple explanatory variables. Linear regression models the relationship between a dependent variable and independent features, assuming a linear relationship, to make forecasts based on new or unseen data:

y = β0 + β1x1 + β2x2 + β3x3 + … + βnxn (2)

Here, y is the dependent variable; x1, x2, x3, … are the independent variables; β0 is the intercept of the line; and β1, β2, … are the coefficients.

5. Experimental Result

The renal urologic cancer data set from a hospital in Karaikudi, India, contains 400 instances with 25 attributes for classification problems.

Fig. 26.3 Different patients' sizes of kidney cancer tumor
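Following up on Eq. (2), the sketch below fits an ordinary least-squares model on a toy subset of Table 26.2-style records; the feature encoding and values are illustrative assumptions of ours, not the paper's experiment.

```python
# Fitting Eq. (2) with ordinary least squares on toy patient records.
import numpy as np
from sklearn.linear_model import LinearRegression

# [age, is_T1a, is_T1b, is_T2a] per patient (toy subset of Table 26.2)
X = np.array([[32, 1, 0, 0], [29, 0, 1, 0], [35, 0, 1, 0],
              [38, 0, 0, 1], [41, 0, 0, 1], [27, 1, 0, 0]])
y = np.array([3.2, 7.0, 6.5, 9.0, 8.2, 4.0])   # tumor size in cm

model = LinearRegression().fit(X, y)
print("beta_0 =", model.intercept_)             # intercept
print("beta_1..beta_n =", model.coef_)          # coefficients
print("prediction:", model.predict([[33, 1, 0, 0]]))
```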


Table 26.3 Reinforcement learning predictive treatment of renal urologic cancer

Methods | Accuracy | Sensitivity | Specificity | Precision | F1-score
Reinforcement Learning | 0.98 | 0.98 | 0.98 | 0.98 | 0.99

It includes medically relevant variables associated with kidney disease, some of which may be correlated. The dataset helps train supervised algorithms, with only a small percentage of missing values, allowing learning without noisy data. The study assesses the reinforcement learning model's accuracy, sensitivity, specificity, precision, and F1-score on renal urologic cancer datasets, focusing on positive sample recognition and true-negative prediction.

Fig. 26.4 Reinforcement learning predictive treatment of renal urologic cancer

Collect patient data, including demographics, medical history, diagnostic test results, and treatment outcomes. Ensure compliance with data protection regulations (e.g., HIPAA) and anonymize or pseudonymize patient data to protect privacy. Clean and preprocess the data, handling missing values and outliers. Split the dataset into training and testing sets. Choose appropriate machine learning models for predictive analysis; in this case, regression models might be used for predicting treatment outcomes. Train the models using the training dataset. Implement privacy-preserving machine learning techniques to protect sensitive patient information, considering techniques like federated learning, homomorphic encryption, or differential privacy to ensure data privacy during model training. Use interpretable models or techniques to make the AI system more understandable to healthcare professionals, and provide insights into the factors influencing treatment recommendations to build trust. Implement a collaborative decision-making system where AI provides treatment suggestions, but the final decision involves collaboration between AI systems and healthcare professionals, including features for doctors to input additional clinical expertise that the AI model may not capture. Regularly update the model using new data to improve prediction accuracy. Continuously monitor and refine the privacy protection mechanisms to adapt to evolving threats. Evaluate the AI model's performance using metrics like accuracy, precision, recall, and F1 score, and assess the effectiveness of the privacy protection mechanisms. Develop a user-friendly interface for healthcare professionals to interact with the AI system, ensuring proper authentication and authorization to control access to sensitive patient data. Deploy the AI system in a healthcare environment, ensuring integration with existing systems, and validate the system's performance in a real-world clinical setting. Ensure compliance with healthcare regulations and standards, and keep abreast of evolving privacy and security standards in the healthcare sector. Provide training to healthcare professionals on using the AI system and interpreting its recommendations.

The method outperforms the other methods in terms of accuracy measurements for reinforcement learning in the renal urologic cancer graph (Fig. 26.4). Implementation of AI in healthcare requires careful consideration of ethical, legal, and regulatory aspects. Regular evaluation, validation in real-world clinical settings, and continuous collaboration between AI developers, healthcare professionals, and patients are critical for achieving positive outcomes.
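Of the privacy-preserving options mentioned above, the following hedged sketch illustrates one: differential-privacy-style training with per-record gradient clipping and Gaussian noise. All parameters and data are synthetic placeholders of ours.

```python
# DP-SGD-style sketch: clip each record's gradient, then add noise.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                 # stand-in patient features
w_true = np.array([0.5, -1.0, 0.0, 2.0, 0.3])
y = X @ w_true + rng.normal(0, 0.1, 200)      # stand-in outcomes

w = np.zeros(5)
clip, sigma, lr = 1.0, 0.5, 0.05
for epoch in range(200):
    grads = (X @ w - y)[:, None] * X          # per-record gradients
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads / np.maximum(1.0, norms / clip)   # clip each record
    g = grads.mean(axis=0) + rng.normal(0, sigma * clip / len(X), 5)
    w -= lr * g                               # noisy gradient step

print("learned coefficients:", np.round(w, 2))
```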
Fig. 26.5 (a) Pair plot for various features and the target variable, (b) residual plot for linear regression, (c) feature importances from the Random Forest model

6. Conclusion

This study used reinforcement machine learning with the aim of improving patient survival rates in urologic cancers, particularly renal pelvis cancer, through AI integration in early detection and personalized treatment.

References
1. A. N. Muiru et al.: The epidemiology of chronic kidney disease (CKD) in rural east Africa: A population-based study, PLoS ONE, vol. 15, no. 3, Mar. 2020, Art. no. e0229649; C. P. Kovesdy: Epidemiology of chronic kidney disease: An update 2022, Kidney Int. Supplements, vol. 12, no. 1, pp. 7-11, Apr. 2022, doi: 10.1016/j.kisu.2021.11.003.
2. E. Tjoa and C. Guan: A survey on explainable artificial intelligence (XAI): Toward medical XAI, IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 11, pp. 1-21, Nov. 2021, doi: 10.1109/TNNLS.2020.3027314.
3. G. Murshid, T. Parvez, N. Fezal, L. Azaz and M. Asif: Data mining techniques to predict chronic kidney disease, International Journal of Scientific Research in Computer Science, Engineering and Information Technology, vol. 5, no. 2, pp. 1220-1226, Apr. 2019.
4. I. Ameer, G. Sidorov and R. M. A. Nawab: Author profiling for age and gender using combinations of features of various types, J. Intell. Fuzzy Syst., vol. 36, no. 5, pp. 4833-4843, May 2019.
5. L. J. Rubini and E. Perumal: Efficient classification of chronic kidney disease by using multi-kernel support vector machine and fruit fly optimization algorithm, Int. J. Imag. Syst. Technol., vol. 30, no. 3, pp. 660-673, Sep. 2020.
6. M. A. Hossain, T. A. Asa, M. R. Rahman and M. A. Moni: Network-based approach to identify key candidate genes and pathways shared by thyroid cancer and chronic kidney disease, Informat. Med. Unlocked, vol. 16, Jan. 2019, Art. no. 100240, 2019.
7. N. Lei, X. Zhang, M. Wei, B. Lao, X. Xu, M. Zhang, H. Chen, Y. Xu, B. Xia, D. Zhang, C. Dong, L. Fu, F. Tang and Y. Wu: Machine learning algorithms' accuracy in predicting kidney disease progression: A systematic review and meta-analysis, BMC Med. Inform. Decis. Making, vol. 22, no. 1, p. 205, Aug. 2022, doi: 10.1186/s12911-022-01951-1, 2022.
8. P. Cockwell and L.-A. Fisher: The global burden of chronic kidney disease, Lancet, vol. 395, no. 10225, pp. 662-664, Feb. 2020, doi: 10.1016/S0140-6736(19)32977-0, 2020.
9. J. Qezelbash-Chamak, S. Badamchizadeh, K. Eshghi and Y. Asadi: A survey of machine learning in kidney disease diagnosis, Mach. Learn. Appl., vol. 10, Dec. 2022, Art. no. 100418, doi: 10.1016/j.mlwa.2022.100418, 2022.
10. R. Gupta, N. Koli, N. Mahor and N. Tejashri: Performance analysis of machine learning classifiers for predicting chronic kidney disease, in Proc. Int. Conf. Emerg. Technol. (INCET), Jun. 2020, pp. 1-4, doi: 10.1109/INCET49848.2020.9154147.
11. S. Bashir, U. Qamar, F. H. Khan and L. Naseem: HMV: A medical decision support framework using multi-layer classifiers for disease prediction, Journal of Computer Science, vol. 13, pp. 10-25, Mar. 2016, doi: 10.1016/j.jocs.2016.01.001.
12. S. A. Ebiaredoh-Mienye, T. G. Swart, E. Esenogho and I. D. Mienye: A machine learning method with filter-based feature selection for improved prediction of chronic kidney disease, Bioengineering, vol. 9, no. 8, p. 350, Jul. 2022, doi: 10.3390/bioengineering9080350, 2022.
13. World Health Organization: World Health Statistics 2019: Monitoring Health for the SDGs, Sustainable Development Goals, accessed Feb. 7, 2023, available: https://apps.who.int/iris/handle/10665/324835, 2019.
14. Wang Qiang and Zhan Zhongli: Reinforcement learning model, algorithms and its application, IEEE Xplore, 22 September 2011, doi: 10.1109/MEC.2011.6025669, 2011.

Note: All the figures and tables in this chapter were designed by the author.

27. Developing a Hybrid Approach to Assess Changes in Pomegranate Quality

Sai Prapulla Seshank Adivi1


M.S. Data Science, State University of New York, University at Buffalo, Buffalo, NY
V. M. N. S. S. V. K. R. Gupta2
Associate Professor, Dept. of Computer Science and Engineering, SRKR Engineering College, Bhimavaram
A. Bala Krishna3
Professor, Dept. of Mechanical Engineering, SRKR Engineering College, Bhimavaram

Abstract: Pomegranate quality is an issue in the food supply chain, and as a result, a lot of fruit goes to waste. Through
continuous monitoring and prediction of the condition of fresh fruit, a digital twin—the virtual twin of a crop—can assist in
the reduction of wasted pomegranates. The introduction of the thermal camera as a data-gathering instrument was due to its
capacity to identify surface and physical changes in stored fruits. Using SAP’s smart technologies, we trained a model with four
distinct sets of temperature data for this experiment. We tracked the fruits’ condition by training a deep convolutional neural
network (DCNN) with temperature data. The technology’s achievement of 0.99 prediction accuracy shows that it has great
potential for creating digital twins of fruits. It is possible to decrease food waste in the supply chain by making a digital copy
of fruit using thermal photography and machine learning algorithms.
Keywords: Convolutional neural network, Digital twin, Machine learning, Pomegranate fruit etc.

1. Introduction

Agriculture is the backbone of the economies of most developing nations in Southeast Asia. Fifteen percent of India's gross domestic product comes from the agricultural sector, which provides sustenance for about half of the population. The agro-ecological practices in India are varied and extensive [1]. The pomegranate stands out among fruits due to its numerous health benefits, vivid colour, and rich flavour. The increasing demand for high-quality fruit in global markets has intensified the need to ensure the quality of pomegranates. In addition to being delicious, visually appealing, healthy, and safe, pomegranates offer many more great attributes. Though practical, conventional methods of determining fruit quality often lack the precision and thoroughness needed to meet modern benchmarks. To address this issue, researchers have been looking into state-of-the-art approaches like spectroscopy, imaging, and sensor technologies to enhance the accuracy and reliability of quality assessments. Although they have certain limitations, non-destructive methods that integrate chemometrics with near-infrared spectroscopy (NIRS) could be useful for evaluating various agricultural goods. However, the true power lies in integrating all of these methods into one cohesive framework to create a hybrid approach that accurately evaluates pomegranate quality variances. A more practical alternative to destructive measurements, NIRS now incorporates machine learning, ANNs, regression models, greater processing capacity, and other prediction techniques. Researchers have also demonstrated the practical feasibility of NIRS by employing appropriate preprocessing techniques and wavelength selection approaches.

1 saiprapu@buffalo.edu, 2 guptavkrao@gmail.com, 3 prof.adavi@gmail.com

DOI: 10.1201/9781003529231-27

A more comprehensive understanding of the fruit's characteristics is just one benefit of this integration; others include improved supply chain management, happier consumers, and more efficient agricultural techniques. This study explores the establishment of a hybrid strategy with the goal of improving our knowledge of, and ability to ensure, the quality of this fruit. By integrating traditional wisdom with cutting-edge technology, this study creates a holistic method for pomegranate quality evaluation that will shape future practices in the agriculture industry.

2. Literature

Manufacturers in the agro-based processing business are currently focusing on producing fresh and lightly processed items, and the market has introduced new techniques and products in response to customer demand. The primary raw resource for many food operations is agricultural produce [2, 3]. Perishable fruits and vegetables have a limited shelf life in ambient conditions, though it can be extended in refrigerated storage; they should not be stored for long periods because of their susceptibility to spoilage by bacteria and chemicals. Consequently, it is critical to complete post-harvest processes like sorting and grading without delay. Countless situations call for smart reactions and complicated solutions, where artificial intelligence (AI) plays a significant role [4]. Modern image processing methods and machine learning enable the detection of fruit diseases, and several related findings have recently come to light. Because bananas are susceptible to so many different diseases, Song et al. used image processing to detect when problems arose so that measures could be taken to stop the spread of disease and restore normal production [5]. Chen et al. [6] laid out a plan and method for supplying fresh agricultural products through the use of Internet of Things technology. Before agricultural products may be sold abroad, they usually need improvements to their manufacturing process, marketing, agricultural quality control systems, and auxiliary measures; this allows for the integration of agricultural production with IT and objective technology services provided over the Internet [7]. One study used image processing to identify correlations between pomegranate size, colour, and appearance. In most cases, removing the skin of a pomegranate will reveal its size and colour, and workers traditionally sort pomegranates by hand; this led to the application of image processing and AI to grade pomegranates based on their size and colour [8]. Ensuring the quality and safety of the end goods requires the use of analytical techniques across the post-harvest supply chain and prior to processing [9]. Assessing the quality of raw fruit, maintaining cultivar validity, and identifying product damage are all obstacles that traditional methods of fruit and vegetable fault detection encounter. It is challenging to get premium inputs from distributors or producers, and picking high-quality products is risky because of unseen damage. Traditional testing methods take a lot of time and money [10–13], and conventional sampling and testing processes often result in significant product waste. According to research [14–16], there are four main criteria for evaluating produce quality: texture, taste, colour, and nutritional content. These characteristics significantly influence how consumers perceive fruits and vegetables, how much they eat, and what they use them for [17], and, thanks to the general improvement in living conditions, people eat far more fruit now than they did even a decade ago [18]. To guarantee the quality, safety, and return on investment of popular fruits such as pomegranate, banana, mango, jujube, apple, orange, kiwifruit, peach, grape, and strawberry, it is essential to sort, examine, and control the fruit's quality [19–21]. Having a top-notch product is still essential, especially when targeting the export market [22]. The pomegranate fruit, depicted in Fig. 27.1, is widely consumed both fresh and processed into various forms such as seed oil, dried arils, and juice [23].

Fig. 27.1 Evaluating the quality of pomegranate using non-destructive approaches, focusing on its (a) complete fruit, (b) fresh-cut fruit, (c) aril, (d) seed, (e) oil, and (f) juice

The pomegranate grows as a tree or shrub, and the fruit is spherical. Each edible aril contains a seed enclosed in a see-through sac that holds juice [24, 25] and is protected from the elements by a thick, tough skin known

as the peel [23]. The many health and nutritional benefits of pomegranate fruit have contributed to its rising popularity in recent years [26–28]. Because of this newfound awareness around the world, pomegranate fruit has become much more popular for commercial production [29, 30]. More recently, pomegranate fruit has been utilised for animal feed, metabolomic peel extract, and as a powerful antioxidant [31, 32], among other value-added products [33, 34]. The industry is working on better ways to grade pomegranate fruit according to size, weight, and the appearance of the outside rind [38], which help determine the fruit's freshness and worth [35–37]. Pomegranate fruits are categorised according to their exterior, the thickness of their peel [17, 22], and the fragility of their arils. To guarantee safe handling, quality detection requires efficient, non-destructive technologies that are quick to respond [36].

One study developed a colour-based method to identify pomegranates on trees (Fig. 27.2) using close-up images, considering the costs of artificial intelligence in agriculture, together with a model for pomegranate supply chains.

Fig. 27.2 Image processing used to identify pomegranates on a tree, as part of a case study on sustainable closed-loop supply chains in Iran: A. source image (pomegranate tree); B. black-and-white picture (segmentation, colour indexing); C. picture after the applied threshold (figure analysis); D. location of geometric centers (estimated count of the pomegranates)

The literature highlights a shift in pomegranate quality assessment, incorporating spectroscopy, imaging, sensor technologies, chemical analysis, data fusion, IoT devices, and blockchain technology. This synthesis improves understanding, promotes sustainable practices, informed consumer choices, and economic growth.

3. Theoretical Framework

Utilising a digital twin (DT) is a very efficient method for reducing food supply chain waste. The term "digital twin" refers to a digital replica of a physical object that is identical to the original in every way, including the dimensions, shape, and composition of the product. It also needs to be able to faithfully recreate any major change that happens during the product's lifetime. In addition, sensors that can continuously update data in real time should be employed in conjunction with the virtual-twin architecture shown in Fig. 27.3.

As shown in Fig. 27.4, the platform will store and process measurement data fed into the DT in accordance with the recommended design. The steps involved in putting this system into practice include adding smart components to both new and old products; connecting them to a cloud-based location with streaming, enumerated data, and analytics capabilities to gather data from sensors; continuously analyzing this data; and finally leveraging the digital insights to transform the retail industry. The IoT edge, which links the thermal camera to IoT cloud services, is the first of the two layers that make up the DT solution. It is impossible to connect more than one thermal camera over a Wi-Fi gateway

Fig. 27.3 Harvest of pomegranate fruit’s virtual twin



Fig. 27.4 Proposed architecture for the solution

without an MQTT message broker. The second tier of the suggested solution includes an Internet of Things (IoT) cloud service that offers enterprise-level capabilities and services pertaining to supply chain, product, and service management. Following this approach, all IoT devices can be connected to the cloud using SAP Edge Services, Cloud Edition, and all thermal camera photographs can be saved in SAP Document Services. SAP Intelligent Service is able to predict the fruit's quality throughout storage based on a snapshot. Similarly, SAP Asset Intelligent Network keeps an eye on the digital fruit, gathers data for forecasts, and alerts users. Customers can access a custom responsive web application, the FoodDT client (Retailer Front-End), which allows them to quickly view the SAP Asset Intelligent Network notification and use it as a decision-making tool. With this information in hand, the consumer can take measures to avoid losing their purchase.

Figure 27.5 suggests the fruit status categories based on the traditional procedures and trends used by retailers. "Fresh" fruits are those that can fully meet customer demand, whereas "good" fruits with a high market value necessitate targeted discounts from retailers. "Bad" fruit that would otherwise go to waste can be donated to a charity or other group; failing that, the fruit is discarded and labeled "damaged." Warehouses and food delivery services are only two examples of logistical infrastructures that can substantially benefit from this technology's ability to increase transparency. As part of inventory management processes, the technology enables continuous DT updates with data from food products. Quick decisions in near-real time or real time will be possible thanks to the cloud, which will serve as the principal platform for managing the implementation. After adopting this solution, stores will have a clearer picture of their inventory and will know exactly how much food is still usable before it spoils. Additionally, this approach deals with the issue of unoccupied storage areas.

4. Proposed Method

We created a DT of pomegranate fruit and evaluated its quality based on temperature changes, using thermal imaging as the data-collection instrument. The thermal camera's photographs of the targets were used as a training set for a CNN. With the help of machine learning, these images reveal physiological data regarding the fruit's state, allowing for an accurate prognosis. Deep learning has greatly improved the performance of classification tasks, and images taken by infrared thermal cameras are analysed for patterns that can be used for picture classification. This training made use of supervised machine learning, whose two essential components are the training inputs and the desired outputs (also called "labels"). The application of image processing and classification technology is growing in tandem with its effectiveness.

Pomegranate Fruit Data Collection: One FLIR thermal camera was used to create the data set, with images captured at several points over the course of the storage period. Prior to the start of the training procedure, the fruit photo collection was divided into four groups: Fresh, Excellent, Poor, and Damaged (see Fig. 27.5).

The prediction model was trained using SAP Intelligent Service, which leverages TensorFlow, an efficient and well-known deep learning framework, and includes deep neural networks and highly developed predictive modeling capabilities. While the training dataset included large photos for preservation purposes, the validation and test datasets were created using several images with four labels for each category.
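As an illustration of the IoT edge layer just described — a thermal camera node publishing through an MQTT message broker into the cloud tier — here is a minimal, hypothetical Python sketch using the paho-mqtt client. The broker host, topic layout, and payload fields are assumptions for illustration, not details from the chapter's SAP deployment.

```python
import json
import time
import paho.mqtt.client as mqtt

# Connect the edge node to the (hypothetical) MQTT broker that feeds the
# digital-twin cloud services.
client = mqtt.Client()
client.connect("broker.example.com", 1883)

def publish_snapshot(camera_id, temps):
    """Publish one thermal snapshot so the cloud tier can update the DT."""
    payload = json.dumps({"camera": camera_id,
                          "ts": time.time(),
                          "temperatures": temps})
    client.publish(f"fooddt/pomegranate/{camera_id}", payload, qos=1)

publish_snapshot("cam-01", [21.4, 22.0, 21.8])   # stand-in temperature readings
```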

Fig. 27.5 Fruit image classification: the four classes (Fresh, Excellent, Poor, Damaged) shown in normal vision and thermal vision

Eighty percent of the original training set's data are included in the training dataset; the remaining twenty percent are divided across the test and validation datasets.

Delivery Model: Both the training and inference phases of the algorithm were evaluated. The learning stage is used to fit the data and produce a trained model. The image must be converted into a vector representation for training purposes; the learning process uses this representation when selecting a model and searching for the model's parameters.

During the inference stage, predictions about new data are made using the trained model. Feature vectors are the real data in the approach's induction and training phases. Inference is a method of assessing a model's performance on fresh data to determine how well it can predict the future and to gauge its efficacy; the model is used to draw informed conclusions about new data, just as it would be in real-world situations. In the DT scenario, the process will leverage real-time thermal camera data. After training the neural network depicted in Fig. 27.6, the DT concept can be implemented by feeding real-time data into it. As a result, the end users will be informed about the product's status, and the forecast will be produced using the historical data kept in SAP cloud storage.

An assessment of the model using convolutional neural networks (CNNs) follows. A convolutional neural network (CNN) is one type of multi-layer neural network, as shown in Fig. 27.6. This well-known feed-forward network extracts the structure of an image; by applying the back-propagation technique, it can be trained to recognize patterns in pictures. All of the neurons that comprise a feature are given identical weights, disregarding their biases, to extract features. The weight primarily determines the steepness of the activation function, while the bias affects both the steepness and the triggering rate of the activation function, contributing to the model's ability to fit the data as closely as possible. If this holds, each neuron reliably detects the same feature in the input picture. The most important component of a neural network for feature extraction is the convolutional layer, which performs the CNN operations. It searches the picture for patterns using a collection of trainable filters called kernels; by sliding a filter over different parts of the input picture, convolution computes a dot product between the filter and those parts. The pooling (sub-sampling) layer then reduces the size of the feature maps by condensing sub-regions using various approaches; it lessens overfitting by moving a window over the input and passing on the content of the window. Using pooling, we can reduce the overall number of network parameters while simultaneously making the learned features more resistant to changes in size and orientation.
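The convolution and pooling behaviour just described can be seen directly in a small sketch; the input size, filter count, and kernel size below are illustrative, not the chapter's architecture.

```python
import tensorflow as tf

# One stand-in thermal frame: batch of 1, 64x64 pixels, 3 channels.
x = tf.random.normal((1, 64, 64, 3))

conv = tf.keras.layers.Conv2D(16, 3, activation="relu")  # 16 trainable kernels
pool = tf.keras.layers.MaxPooling2D(pool_size=2)         # 2x2 sub-sampling

features = conv(x)        # -> (1, 62, 62, 16): dot products of kernels and patches
reduced = pool(features)  # -> (1, 31, 31, 16): halved spatial size, fewer parameters
print(features.shape, reduced.shape)
```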

The layer above connects fully to the output of the layer below, establishing connections with all of the neurons in the preceding layer; adding such a fully connected layer is a way to learn nonlinear combinations of the extracted features. This method works according to the rule of error-correction learning. Using a two-stage propagate-adapt cycle, the network learns a collection of pre-defined input-output sample pairs. Each continuous non-linear node adds the weighted inputs together and then sends the result through the sigmoid non-linearity of equation (1). A flow diagram and a synopsis of the back-propagation algorithm may be found in Appendix A. The first layer of network units receives an input pattern, and that pattern is passed on through each layer until an output is formed; the network's synaptic weights remain fixed throughout this forward pass. By comparing the output to the intended output, the network determines an error signal for each output unit. Next, the network transmits the error signals from the output layer back to each node in the intermediate layer that directly contributes to the output (the reverse pass). The network adjusts each node's connection weights in response to the received error signal, bringing it one step closer to the point where all training patterns are encoded. It is customary to set the starting weights to small, arbitrary values; the approach will not work well with multilayer networks if the initial weights are zero or are poorly chosen non-zero values [39]. All of the error terms in the output layer determine the weight updates of the hidden layer: the hidden layer receives the known output-layer errors and uses them to determine how to modify its weights. However, there is no guarantee that the learning law will converge towards the goal solution for any given pattern-mapping assignment [40]. A learning rate parameter (η) is added to the output-layer weight update to address this problem; η is usually a modest number between 0 and 1, and it must be within that range for the network to arrive at a solution. According to reference [41], typical values of η for convolutional neural networks (CNNs) during training are 0.01, 0.1, 0.5, and 0.9. Raising η as the step size grows can accelerate convergence as the network error decreases; on the other hand, the network may deviate excessively from the actual minimum if η increases too much. Optionally, a parameter called momentum (α) aids the network's convergence; when α is set to a positive value smaller than 1 [39], the weight change equations on both the output and hidden layers are altered accordingly.

Performance Model: In addition to the particular pattern recognition job, the effectiveness of this rule depends on the initial weight settings, the learning rate parameter, the output functions of the units, and the presentation of the training data. It is crucial to correctly set the weight values before applying the learning law to a particular training set; initial weights should reflect prior understanding. When such knowledge is available and appropriately encoded in the starting weights, the trained network's overall performance, both in learning speed and in generalization, will be greatly enhanced. The algorithm minimizes the average error between the calculated outputs and the presented targets using gradient descent, computing the outputs O_j from the inputs I_i using the logistic sigmoid function:

O_j = f(x) = \frac{1}{1 + e^{-x}}, \qquad \text{where the activation } x = \sum_{i=1}^{n} w_{ij} I_i \qquad (1)

The connection weights are the adjustable parameters of the back-propagation algorithm, and the numbers of input and output neurons determine the number of hidden neurons. We assessed the model's performance by measuring loss and accuracy. While training accuracy shows the percentage of the current dataset that was properly classified, validation accuracy shows the accuracy on randomly chosen held-out photos. The loss demonstrates how well the model performed on both the training and validation datasets; it is calculated by summing the errors for each example in the validation or training set. To determine how much the predicted values differ from the actual values, accuracy compares the classified picture with independently obtained ground truth data. The learning rate used during training can also affect the convergence of neural networks, and convergence is profoundly impacted by the number of epochs: training converges slowly if the learning rate is too low and may fail to converge if it is too high. This training made use of an ideal learning rate.
Fig. 27.6 Three layer feed forward neural network
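The sigmoid activation of equation (1) and the weight-update rule with learning rate η and momentum α can be illustrated with a small NumPy sketch of a single output unit; the weights, input pattern, and target below are illustrative stand-ins, not the chapter's trained network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
w = rng.uniform(-0.1, 0.1, 4)   # small random initial weights, as the text advises
prev_dw = np.zeros_like(w)
eta, alpha = 0.1, 0.9           # learning rate and momentum, typical values per [41]

I = rng.random(4)               # input pattern I_i
target = 1.0
for _ in range(100):
    O = sigmoid(w @ I)                       # forward pass: O_j = f(sum_i w_ij I_i)
    delta = (target - O) * O * (1.0 - O)     # error signal times sigmoid derivative
    dw = eta * delta * I + alpha * prev_dw   # gradient step plus momentum term
    w += dw
    prev_dw = dw
```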

The model's post-training summary included the batch size, learning rate, total number of training epochs, best accuracy, final test accuracy, predicted top classes, and training duration, as listed in Table 27.1. Despite 149 training epochs, the model did not achieve any further improvement after epoch 21.

Table 27.1 Training summary

Property | Value
Training batch size | 65
Learning rate | 0.001
Total training epochs | 149
Epoch with best accuracy | 6
Best validation accuracy | 0.99
Predicted top classes | 4
Final test accuracy | 0.99

The image classifier trained for loss reduction achieved an average accuracy of 0.99, reducing the training and validation cross-entropy losses to 0.005 and 0.08, respectively. The model accurately predicted fruit status, indicating machine learning's potential for fruit digital twins (DT) and reducing costs and labor-intensive tasks in the fruit supply chain. This innovative approach could significantly improve decision-making.
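For concreteness, a minimal Keras sketch using the hyperparameters of Table 27.1 (batch size 65, learning rate 0.001, up to 149 epochs with early stopping) might look as follows; the directory layout, image size, and simple backbone are assumptions for illustration, not the chapter's SAP-based pipeline.

```python
import tensorflow as tf

# Load the four-class thermal image folders; paths are hypothetical.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "thermal_images/train", image_size=(224, 224), batch_size=65)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "thermal_images/val", image_size=(224, 224), batch_size=65)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation="softmax"),  # Fresh/Excellent/Poor/Damaged
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Up to 149 epochs, stopping early once validation accuracy plateaus,
# mirroring the "no improvement after epoch 21" observation.
model.fit(train_ds, validation_data=val_ds, epochs=149,
          callbacks=[tf.keras.callbacks.EarlyStopping(
              patience=10, restore_best_weights=True)])
```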
5. Conclusions

To evaluate the quality of pomegranates, the study suggests a mixed method that combines chemical analysis, sensory evaluation, visual inspection, the Internet of Things (IoT), and blockchain technology. Colour, flavour, aroma, nutritional value, and physical attributes are all captured by this model as indicators of pomegranate quality. The decision support system provides insights that enable better-informed supply chain decisions, and the iterative approach guarantees flexibility and continued applicability in a dynamic agricultural setting. This method raises the bar for the agricultural industry by improving the quality assessment of fruits and agricultural products, which in turn increases consumer trust and bolsters sustainable farm-to-table operations.

References

1. Government of India, 2011. Faster, Sustainable and More Inclusive Growth: An Approach to the 12th Five Year Plan (Draft), Planning Commission, Government of India, New Delhi.
2. Madanayake NH, Hossain A, Adassooriya NM. Nanobiotechnology for agricultural sustainability, and food and environmental safety. Q Assurance Safety Crops Foods. (2021) 13: 20–36. doi: 10.15586/qas.v13i1.838.
3. Yadav A, Kumar N, Upadhyay A, Singh A, Anurag RK, Pandiselvam R. Effect of mango kernel seed starch-based active edible coating functionalized with lemongrass essential oil on the shelf-life of guava fruit. Q Assurance Safety Crops Foods. (2022) 14: 103–115. doi: 10.15586/qas.v14i3.1094.
4. Jiang, Y., Li, K., Chen, S., Fu, X., Feng, S., & Zhuang, Z. (2022). A sustainable agricultural supply chain considering substituting organic manure for chemical fertilizer. Sustainable Production and Consumption, 29, 432–446.
5. Song, L., Luo, Y., Chang, Z., Jin, C., & Nicolas, M. (2022). Blockchain adoption in agricultural supply chain for better sustainability: A game theory perspective. Sustainability, 14(3), 1470.
6. Chen, X., Chen, R., & Yang, C. (2021). Research and design of fresh agricultural product distribution service model and framework using IoT technology. Journal of Ambient Intelligence and Humanized Computing.
7. Bohle, C., Maturana, S., & Vera, J. (2010). A robust optimization approach to wine grape harvesting scheduling. European Journal of Operational Research, 200(1), 245–252.
8. Hajiaghaei-Keshteli, M., & Aminnayeri, M. (2014). Solving the integrated scheduling of production and rail transportation problem by Keshtel algorithm. Applied Soft Computing, 25, 184–203.
9. Beegum PS, Pandiselvam R, Ramesh SV, Thube SH, Pandian TP, Khanashyam AC, et al. A critical appraisal on the antimicrobial, oral protective, and anti-diabetic functions of coconut and its derivatives. Q Assurance Safety Crops Foods. (2022) 14: 86–100. doi: 10.15586/qas.v14i2.1040.
10. Kaavya R, Pandiselvam R, Mohammed M, Dakshayani R, Kothakota A, Ramesh SV, et al. Application of infrared spectroscopy techniques for the assessment of quality and safety in spices: a review. Appl Spectrosc Rev. (2020) 55: 593–611. doi: 10.1080/05704928.2020.1713801.
11. Munawar AA, von Hörsten D, Wegener JK, Pawelzik E, Mörlein D. Rapid and non-destructive prediction of mango quality attributes using Fourier transform near infrared spectroscopy and chemometrics. Eng Agric Environ Food. (2016) 9: 208–15. doi: 10.1016/j.eaef.2015.12.004.
12. Niu C, Guo H, Wei J, Sajid M, Yuan Y, Yue T. Fourier transform near-infrared spectroscopy and chemometrics to predict zygosacchromyces rouxii in apple and kiwi fruit juices. J Food Prot. (2018) 81: 1379–85. doi: 10.4315/0362-028X.JFP-17-512.
13. Yan H, Xu YC, Siesler HW, Han BX, Zhang GZ. Hand-held near-infrared spectroscopy for authentication of fengdous and quantitative analysis of mulberry fruits. Front Plant Sci. (2019) 10: 1–15. doi: 10.3389/fpls.2019.01548.
14. Pandey, C.; Sethy, P.K.; Biswas, P.; Behera, S.K.; Khan, M.R. Quality evaluation of pomegranate fruit using image processing techniques. In Proceedings of the 2020 International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 28–30 July 2020; p. 19914067. [CrossRef]

15. Opara, U.L.; Pathare, P.B. Bruise damage measurement and 29. Holland, D.; Hatib, K.; Barya, I. Pomegranate: Botany,
analysis of fresh horticultural produce—A review. Postharvest horticulture, breeding. In Horticultural Reviews; John Wiley
Biol. Technol. 2014, 91, 9–24. [CrossRef] Sons, Inc.: Hoboken, NJ, USA, 2009; Volume 35, pp. 127–
16. Spielmanns, R.; Spielmanns, J.; Damerow, L.; Blanke, 192.
M.M. Non-destructive determination of surface features 30. Dhinesh, K.; Ramasamy, D. Pomegranate processing and
of pomegranate fruit. Acta Hortic. 2016, 1137, 247–250. value addition: Review. J. Food Process. Technol. 2016, 7,
[CrossRef] 565. [CrossRef]
17. Khoshroo, A.; Keyhani, A.; Rafiee, S.; Zoroofi, R.A.; Zamani, 31. Akuru, E.A.; Oyeagu, C.E.; Mpendulo, T.C.; Rautenbach, F.;
Z. Pomegranate quality evaluation using machine vision. Acta Oguntibeju, O.O. Effect of pomegranate (Punica granatum
Hortic. 2009, 818, 347–352. [CrossRef] L.) peel powder meal dietary supplementation on antioxidant
18. Wang, H.; Peng, J.; Xie, C.; Bao, Y.; He, Y. Fruit quality status and quality of breast meat in broilers. Heliyon 2020, 6,
evaluation using spectroscopy technology: A review. Sensors e05709. [CrossRef]
2015, 15, 11889–11927. [CrossRef] [PubMed] 32. Akuru, E.A.; Mpendulo, C.T.; Oyeagu, C.E.; Nantapo,
19. Czieczor, L.; Bentkamp, C.; Damerow, L.; Blanke, M. Non- C.W.T. Pomegranate (Punica granatum L.) peel powder meal
invasive determination of the quality of pomegranate fruit. supplementation in broilers: Effect on growth performance,
Postharvest Biol. Technol. 2018, 136, 74–79. [CrossRef] digestibility, carcase and organ weights, serum and some meat
20. Elmasry, G.; Kamruzzaman, M.; Sun, D.-W.; Allen, P. antioxidant enzyme biomarkers. Ital. J. Anim. Sci. 2021.
Principles and applications of hyperspectral imaging in 33. Magangana, T.P.; Makunga, N.P.; la Grange, C.; Stander,
quality evaluation of agro-food products: A review. Crit. Rev. M.A.; Fawole, O.A.; Opara, U.L. Blanching pre-treatment
Food Sci. Nutr. 2012, 52, 8398. [CrossRef] promotes high yields, bioactive compounds, antioxidants,
21. Matityahu, I.; Marciano, P.; Holland, D.; Ben-Arie, R.; Amir, enzyme inactivation and antibacterial activity of ‘wonderful’
R. Differential effects of regular and controlled atmosphere pomegranate peel extracts at three different harvest maturities.
storage on the quality of three cultivars of pomegranate Antioxidants 2021, 10, 1119. [CrossRef]
(Punica granatum L.). Postharvest Biol. Technol. 2016, 115, 34. Magangana, T.P.; Makunga, N.P.; Fawole, O.A.; Stander, M.A.;
132–141. [CrossRef] Opara, U.L. Antioxidant, antimicrobial, and metabolomic
22. Khoshroo, A.; Keyhani, A.; Zoroofi, R.A.; Rafiee, S.; Zamani, characterization of blanched pomegranate peel extracts: Effect
Z.; Alsharif, M.R. Classification of pomegranate fruit using of cultivar. Molecules 2022, 27, 2979. [CrossRef]
texture analysis of MR images. Agric. Eng. Int. CIGR J. 2009, 35. Khodabakhshian, R.; Emadi, B.; Khojastehpour, M.; Golzarian,
11, 1182. M.R.; Sazgarnia, A. Development of a multispectral imaging
23. Okere, E.E. Non-Invasive Measurement of Quality Attributes system for online quality assessment of pomegranate fruit. Int.
of Processed Pomegranate Products. Master’s Thesis, J. Food Prop. 2017, 20, 107–118. [CrossRef]
Stellenbosch University, Stellenbosch, South Africa, 2020. 36. Munera, S.; Hernández, F.; Aleixos, N.; Cubero, S.; Blasco, J.
24. Pareek, S.; Valero, D.; Serrano, M. Postharvest biology and Maturity monitoring of intact fruit and arils of pomegranate
technology of pomegranate. J. Sci. Food Agric. 2015, 95, cv. ‘Mollar de Elche’ using machine vision and chemometrics.
2360–2379. [CrossRef] [PubMed] Agriculture 2022, 12, 2034 Postharvest Biol. Technol. 2019, 156, 110936. [CrossRef]
21 of 25. 37. Zhang, L.; McCarthy, M.J. Assessment of pomegranate
25. Karimi, M.; Sadeghi, R.; Kokini, J. Pomegranate as a postharvest quality using nuclear magnetic resonance.
promising opportunity in medicine and nanotechnology. Postharvest Biol. Technol. 2013, 77, 59–66. [CrossRef]
Trends Food Sci. Technol. 2017, 69, 59–73. [CrossRef] 38. Kumar, A.; Rajpurohit, V.S.; Bidari, K.Y. Multi class grading
26. Fawole, O.A.; Opara, U.L. Developmental changes in and quality assessment of pomegranate fruits based on
maturity indices of pomegranate fruit: A descriptive review. physical and visual parameters. Int. J. Fruit Sci. 2019, 19,
Sci. Hortic. 2013, 159, 152–161. [CrossRef] 372–396. [CrossRef]
27. Lansky, E.P.; Newman, R.A. Punica granatum (pomegranate) 39. Widrow, B., and Lehr, M.A., “30 Years of Adaptive Neural
and its potential for prevention and treatment of inflammation Networks: Perceptron, Madaline and Back Propagation”,
and cancer. J. Ethnopharmacol. 2007, 109, 177–206. Proceedings of the IEEE, Vol. 78, N0. 4, pp. 1415–1442,
[CrossRef] [PubMed] September 1990
28. Opara, L.U.; Al-Ani, M.R.; Al-Shuaibi, Y.S. Physico-chemical 40. Yegnanarayana, B., “Artificial Neural Networks”, Prentice
properties, vitamin c content, and antimicrobial properties of Hall of India, 2001.
pomegranate fruit (Punica granatum L.). Food Bioprocess 41. Haykin,S., “Neural Networks A Comprehensive Foundation”,
Technol. 2009, 2, 315–321. [CrossRef] Prentice Hall, New Jersey, 1999.
Note: All the figures and table in this chapter were designed by the
author.

28. Artificial Intelligence-Based Communication through Cat Facial Expressions

K. Bhargavi1
Professor, Malla Reddy Engineering College (Autonomous)
Ch. V. Phani Krishna2
Professor, Teegala Krishna Reddy Engineering College, Hyderabad
Bandla Srinivasa Rao3
Prof. in CSE, Teegala Krishna Reddy Engineering College, Telangana

Abstract: Recent artificial intelligence experiments show that domestic cats, known for their complex facial expressions, communicate more proficiently than assumed, both defensively and with people, as revealed by a face recognition application. This discovery could strengthen the link between cats and people [1] and may lead to the development of instruments that help adopters select suitable feline companions and owners interpret cues from their pets. However, the mystery of cat communication remains unsolved. The experiments on cat facial expressions use sophisticated artificial intelligence technology: a machine learning model is trained to remove noise from an image of a cat's facial expression through the diffusion model procedure, which requires adjustments for various data types. The study investigates the application of convolutional neural networks in analyzing cat facial expressions and compares the results with a Vision Transformer.
Keywords: Artificial intelligence, Cat facial expressions, Diffusion model, Vision transformer etc.

1. Introduction

Cats are gregarious animals, despite their image as solitary creatures. In certain feral cat colonies there can be thousands of them, and they make friends with other cats in their houses or on the streets [4]. Previous studies have predominantly focused on animosity to understand the mystery of cats' social interactions. In one study, carried out at a venue where people could meet adoptable cats, researchers painstakingly captured 194 minutes of cats making faces at other cats, finding an astounding 276 different expressions. This figure is comparable to the 357 facial expressions displayed by chimpanzees, which disproves stereotypes about how expressive cats can be [2]. These expressions are communicated by a range of facial movements, such as opened lips, dropped jaws, dilated or constricting pupils, blinks, and ear positions. Eighteen percent of these expressions were categorized as aggressive and forty-five percent as friendly; the remainder were ambiguous, which shows how unclear some of these expressions can be [3].

There are patterns, even if it is still unclear exactly what these expressions mean. Cats typically move their ears and whiskers towards each other when they are friendly, and the opposite occurs when they are not [15]. Aggressive moments are occasionally accompanied by lip-licking and narrowed pupils. Interestingly, some friendly facial expressions in cats resemble those of dogs, people, primates, and other animals, indicating that various creatures may have a "play face" in common [10].

The study assumes that domestic cats were originally solitary animals, even though it does not specifically compare these findings to those of wild cats. Although domestic cats probably
1 bhargavi.mtech@gmail.com, 2 phanik16@gmail.com, 3 sreenibandla@gmail.com

DOI: 10.1201/9781003529231-28

still communicated in a defensive way, they probably started to show friendlier facial expressions when they gathered near humans, perhaps in anticipation of meals [5]. The findings, which could improve the bond between cats and their human companions, have been commended by experts in the field [11]. They may eventually lead to the development of apps that help cat owners interpret their pets' ambiguous cues, improving understanding and communication, and might also assist prospective cat adopters in selecting feline companions who will get along better with their existing pets. Even though the deepening understanding of feline facial expressions has illuminated the intricate world of cat communication, this fascinating subject is far from settled; the enigma has yet to be resolved [17].

The rest of this paper is organized in seven parts, as follows: Section 1 gives a broad introduction. Section 2 covers cat facial expressions using diffusion in machine learning. Section 3 covers cat facial expressions using convolutional neural networks. Section 4 deals with the Vision Transformer for cat facial expressions. Section 5 presents the experimental results. The conclusion is included in Section 6, and the references in Section 7.

2. Cat Facial Expressions Using Diffusion in Machine Learning

The diffusion model process involves adding noise to an image of a cat's facial expression and learning to remove it, training a machine learning model to produce a denoised image [7]. Learning the mean matrix involves assuming a normal noise distribution and parametrizing the distribution mean and standard deviation matrix [8]. This can be divided into forward and reverse processes. Mathematicians often use the jargon of physical processes to formalize mathematical concepts, such as diffusion equations for Fick diffusion, heat diffusion, and Brownian motion [9]. The Langevin equation, based on the Wiener process, is a stochastic formulation with infinitely small steps and a normal distribution of time increments. This intertwines diffusion with white noise generation, giving rise to the machine learning models called diffusion models. Diffusion models, which utilize a Gaussian prior to generate data, form the core of text-to-image generative models [3].

3. Cat Facial Expressions Using Convolutional Neural Networks

Classification is crucial for identifying cat facial expression objects, with deep learning being more effective than traditional techniques [13]. Convolutional neural network techniques are often used to refine facial features, driving advances in image-based expression analysis. CNN is a powerful neural network technique for image categorization and recognition. It comprises layers such as convolution, activation, pooling, and a classifier [18]. The activation function decides whether to excite neurons and transforms information nonlinearly. The convolution layer extracts feature maps by passing the input through learned filters [20]. Activation functions like sigmoid, tanh, and ReLU can be applied to the feature maps, and the classifier layer selects the class with the highest probability. CNN performance depends on the ability to handle large datasets and on transfer learning, with models pre-trained on the ImageNet dataset allowing designers to adapt parameters for new tasks [19].

Fig. 28.1 Cat facial expressions using diffusion in machine learning
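A minimal sketch of the forward (noising) half of this process, assuming a standard linear beta schedule, is shown below; the step count, image size, and schedule values are illustrative, not the chapter's actual configuration.

```python
import numpy as np

def forward_diffuse(image, t, betas):
    """Sample x_t ~ q(x_t | x_0) in closed form for diffusion step t."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]          # product of alphas up to step t
    noise = np.random.randn(*image.shape)      # Gaussian noise, same shape as image
    x_t = np.sqrt(alpha_bar) * image + np.sqrt(1.0 - alpha_bar) * noise
    return x_t, noise                          # the model learns to predict `noise`

betas = np.linspace(1e-4, 0.02, 1000)          # linear schedule, 1000 steps
x0 = np.random.uniform(-1, 1, (64, 64, 3))     # stand-in for a cat-face image in [-1, 1]
x_t, eps = forward_diffuse(x0, t=500, betas=betas)
```

Training the reverse (denoising) network then amounts to regressing the sampled noise `eps` from `x_t` and `t`, which is the piece the chapter describes as learning to remove the noise.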



The facial expression algorithm utilizing convolutional neural networks for image classification is as follows. Image input: a collection of [height, width, channel] pixel values.

Feature Extraction:
1. A feature map can be obtained by employing a convolutional neural network.
(a) Convolution (ReLU):
(i) Select a kernel of size 4x4 with depth matching the input array.
(ii) Convolutional processing is utilized to extract features from facial expressions.
(b) Pooling (max pooling):
(i) Reduce the spatial size of the feature map through dimensionality reduction, extracting a 2x2 window.
2. Repeat these steps up to the fourth layer, increasing the channel size through 16, 32, 64, and 128, to extract low-level features from the image.

Classification:
1. In each training iteration, the flattened output is sent to a feed-forward neural network trained with back propagation.
2. The SoftMax classification approach is used to categorize facial-expression photos by identifying their dominant characteristics using the trained model [12].

4. Vision Transformer for Cat Facial Expressions

The application of transformers to cat facial expression image processing is influenced by the image data type, requiring adaptation of the preprocessing and input format to fully utilize their power on various kinds of data. Transformers, as a machine learning tool, can be used on any data type, including sequences or time series of data points, because they feed on vectors [16]. The Vision Transformer (ViT) takes cat facial expression image data as patches, flattened through linear transformations into vector format. This process reveals that transformers outperform typical CNNs at large data scales, and cat facial expression image data follows the same principle [14]. Time series are ideal for transformers, as demonstrated by the Temporal Fusion Transformer; this model uses LSTM layers to transform a time series into a vector of the right size, capturing both short-term and long-term correlations, and is compatible with PyTorch. Reinforcement learning is a Markov chain approach that encodes states, actions, and rewards as vectors. For video pictures, latent features are extracted using a CNN, actions are encoded using embeddings, and rewards are represented as one-dimensional vectors [12].
Fig. 28.2 How to provide for cat expressions images data to a transformer
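The patch-flattening step in Fig. 28.2 can be sketched in a few lines of Python; the patch size, embedding dimension, and random projection below are illustrative stand-ins for the learned components of a real ViT.

```python
import numpy as np

def patchify(image, patch=16):
    """Split an image into non-overlapping patches and flatten each one."""
    h, w, c = image.shape
    patches = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            patches.append(image[i:i+patch, j:j+patch, :].reshape(-1))
    return np.stack(patches)                       # (num_patches, patch*patch*c)

image = np.random.rand(224, 224, 3)                # stand-in for a cat-face image
tokens = patchify(image)                           # 196 patches, each of length 768
W = np.random.randn(tokens.shape[1], 384) * 0.02   # learned linear projection in a real ViT
embeddings = tokens @ W                            # (196, 384) token sequence for the transformer
```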

5. Experiment Result

Artificial intelligence experiments reveal that domestic cats with complex facial expressions, as identified in the Animal Web Dataset, the AFHQ dataset, and the Pet Database, are more proficient in communication when analysed with the face recognition application. The study compares the CNN and ViT models on accuracy, sensitivity, specificity, precision, and F1-score. Accuracy measures model performance across the whole dataset, sensitivity measures the recognition of positive samples, and precision measures the fraction of predicted positives that are truly positive. CNN serves as the baseline machine learning technology for cat facial expressions.

Table 28.1 Performance comparison of the CNN and ViT algorithms

Method | Accuracy | Sensitivity | Specificity | Precision | F1-score
CNN | 0.92 | 0.91 | 0.92 | 0.93 | 0.92
ViT | 0.99 | 0.99 | 0.99 | 0.99 | 0.98

Fig. 28.3 Performance comparison of the CNN and ViT algorithms

As the table and graph show, the Vision Transformer outperforms the convolutional neural network on every metric.
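The metrics compared in Table 28.1 can be computed from binary confusion-matrix counts, as in the following sketch; the counts used in the example call are hypothetical.

```python
def metrics(tp, fp, tn, fn):
    """Compute the five metrics of Table 28.1 from confusion-matrix counts."""
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)               # recall on positive samples
    specificity = tn / (tn + fp)               # recall on negative samples
    precision   = tp / (tp + fp)               # true positives among predicted positives
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, precision, f1

print(metrics(tp=92, fp=7, tn=93, fn=8))       # illustrative counts
```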
References

1. Andresen, N. et al.: Towards a fully automated surveillance of well-being status in laboratory mice using deep learning: Starting with facial expression analysis, PLoS ONE, 15, e0228059, 2020.
2. Apeksha Khopkar, Ashish Adholiya: Facial Expression Recognition Using CNN with Keras, Bioscience Biotechnology Research Communications, 14(05): 47–50, doi: 10.21786/bbrc/14.5/10, 2021.
3. F. Yao: Facial Expression Recognition Based on Convolutional Neural Networks, Hindawi, https://www.hindawi.com, 2021.
4. Finka, L. R., Luna, S. P., Mills, D. S. & Farnsworth, M. J.: The application of geometric morphometrics to explore potential impacts of anthropocentric selection on animals' ability to communicate via the face: The domestic cat as a case study, Frontiers in Veterinary Science, 1070, 2020.
5. J. C. Kim: Hybrid Approach for Facial Expression Recognition Using CNN, MDPI, https://www.mdpi.com, 2022.
6. Joe Mauricio et al.: Comparing Vision Transformers and Convolutional Neural Networks for Image Classification: A Literature Review, MDPI, Appl. Sci. 2023, 13(9), 5521, https://doi.org/10.3390/app13095521, 2023.
7. K. Sarvakar: Facial emotion recognition using convolutional neural networks, ScienceDirect, https://www.sciencedirect.com, 2023.
8. M. Wang: Facial expression recognition based on CNN, IOPscience, https://iopscience.iop.org, 2020.
9. M. Pantic and L. J. Rothkrantz: Automatic analysis of facial expressions: The state of the art, IEEE Transactions on Pattern Analysis & Machine Intelligence, no. 12, pp. 1424–1445, 2000.
10. M. F. Hansen: Towards on-farm pig face recognition using convolutional neural networks, ScienceDirect, https://www.sciencedirect.com, 2018.
11. Marcelo Feighelstein, Ilan Shimshoni, Lauren R. Finka, Stelio P. L. Luna, Daniel S. Mills & Anna Zamansky: Automated recognition of pain in cats, Springer, Scientific Reports, 12, 2022.
12. N. Christou and N. Kanojiya: Human facial expression recognition with convolution neural networks, Third International Congress on Information and Communication Technology, pp. 539–545, 2019.
13. P. Purini: Real-Time Facial Expression Recognition using CNN, Taylor & Francis, https://www.taylorfrancis.com, 2022.
14. R. N. V. Jagan Mohan and K. Raja Sekhara Rao: Target inference on evaluation of angle oriented cluster, Computer Science and Information Technology, 2(3): 121–125, doi: 10.13189/csit.2014.020301, Horizon Research Publishing, http://www.hrpub.org, 2014.
15. R. N. V. Jagan Mohan: Angle oriented based image analysis using L-axial semi-circular model, Asian Journal of Mathematics and Computer Research, ISSN 2395-4205 (Print), 2395-4213 (Online), vol. 10, issue 4, pp. 320–331, 2016.
16. Ruben Winastwa: Image Classification with Vision Transformer, Towards Data Science, 2017.
17. Reddy Navya, Ramisetty Upendra: "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms", Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3, 2023.
18. S. Binta Islam: Animal Species Recognition with Deep Convolutional Neural Networks, MDPI, https://www.mdpi.com, 2023.
19. Tawsin Uddin Ahmed, Sazzad Hossain, Mohammad Shahadat Hossain, Raihan ul Islam, Karl Anders: Facial Expression Recognition using Convolutional Neural Network with Data Augmentation, IEEE Xplore, doi: 10.1109/ICIEV.2019.8858529, 07 October 2019.
20. Waller, B., Julle-Daniere, E. & Micheletta, J.: Measuring the evolution of facial expression using multi-species FACS, Neurosci. Biobehav. Rev. 113, 1–11, 2020.
21. Y. Mao: Pet dog facial expression recognition based on convolutional neural network (CNN), Nature, https://www.nature.com, 2023.

Note: All the figures and table in this chapter were designed by the author.

29. Convolutional Neural Networks for the Identification of Skin Disorders

A. Aswini Priyanka*
Assistant Professor, Dept of Computer Science and Design,
Sagi Rama Krishnam Raju Engineering College, Bhimavaram

Abstract: Skin diseases, causing rashes, inflammation, and itchiness, can be genetic or lifestyle-related and can be treated with medications, creams, ointments, or lifestyle changes. The case considered here involves rhabdomyosarcoma, scoliosis, cardiac fibroma, and several skin tumors; the clinical instruction is to identify 2- to 4-mm erythematous macules and treat them twice daily with fluorouracil cream until they disappear. This study investigated AI-based technologies for the diagnosis and classification of skin cancer using convolutional neural networks, evaluating their reliability through data set size, diagnostic classifications, performance metrics, and experimental results involving a large dataset.
Keywords: Artificial intelligence, Convolutional neural networks, Erythematous, Fluorouracil, Performance metrics, Skin
disease

1. Introduction

Artificial intelligence is revolutionizing medical diagnosis, prognosis, and therapy. Machine learning and deep learning models are increasingly used in skin cancer screening; these models use algorithms to perform tasks such as patient diagnosis, prognosis, and treatment status prediction [1]. AI has advanced to detect cancer earlier than traditional methods, ensuring better treatment and outcomes, which makes machine learning and deep learning models essential for skin cancer screening [2]. Skin cancer is primarily found on sun-exposed skin but can also occur in areas not typically exposed [3]. The three major types are basal cell carcinoma, squamous cell carcinoma, and melanoma, and early detection increases the chances of successful treatment. Although the disease primarily affects sun-exposed areas, it can also form on areas rarely exposed to sunlight, such as the palms, under the fingernails, and the genital area [6]. It affects people of all skin tones, and in people with darker complexions it is more likely to develop in these rarely exposed areas. Basal cell carcinoma, a common skin cancer, typically occurs in sun-exposed areas like the neck or face and can manifest as a pearly or waxy bump or a flat, brown, scar-like lesion [7, 8].

Squamous cell carcinoma, another common skin cancer, typically develops on sun-exposed areas like the face, ears, and hands, with darker-skinned individuals more susceptible on rarely exposed areas [10]. Melanoma is a cancerous growth that can develop anywhere on the body, in otherwise normal skin or in an existing mole that becomes cancerous [12]. It occurs in both men and women and can appear on skin that has not been exposed to the sun. It affects people of any skin tone; in those with darker skin tones it often appears on the palms or soles [14]. Signs include large brownish spots, changing moles, small lesions, painful or itching lesions, and dark lesions. Other skin cancers include Kaposi sarcoma, Merkel cell carcinoma, and sebaceous gland

*Aswini.areti@gmail.com

DOI: 10.1201/9781003529231-29

carcinoma [12]. Kaposi sarcoma is rare and mainly affects people with weakened immune systems or those taking immunosuppressive medications. Merkel cell carcinoma causes shiny nodules on or near hair follicles, while sebaceous gland carcinoma is aggressive and originates from oil glands in the skin. These rare forms can be mistaken for other eyelid problems [13].

2. Proposed Work

A 16-year-old intellectually disabled teen with multiple macules on his back is evaluated. His history includes cardiac fibroma, rhabdomyosarcoma, scoliosis, and skin neoplasms, and he has been taught to recognize them. Gorlin syndrome, also known as Gorlin-Goltz syndrome or nevoid basal cell carcinoma syndrome, is an autosomal dominant familial cancer caused by a mutation in the patched 1 (PTCH1) gene [15]. Symptoms include multiple basal cell carcinomas, keratocystic odontogenic tumors, and dyskeratotic palmar and plantar pitting. The condition is prevalent in White populations and affects both men and women. Diagnostic criteria and treatment protocols have been introduced for Gorlin syndrome. Patients present with multiple BCCs before the age of 20, or in excessive numbers. Treatment may include photodynamic therapy, surgical excision, and Mohs micrographic surgery [9].

Evaluations of AI-based skin cancer detection models reveal that their reliability is uncertain due to varying evaluation metrics, image types, and dataset sizes.

Skin Cancer Disease Detection by Convolutional Neural Network: The classification approach plays a crucial role, even though the size of the disease-object database has a substantial impact on how well skin disease objects are identified. Machine learning includes deep learning [4]. Since features are extracted automatically, deep learning is more effective than traditional machine learning techniques. Additionally, "end-to-end learning" with deep learning involves giving the network both the task and the unprocessed image data. Most advances in skin disease detection are made on skin image features, including face and eye expressions, using the convolutional neural network technique [5].

Image input: a collection of [height, width, channel] pixel values.

Feature Extraction
1. To obtain a feature map of the skin image, use a convolutional neural network.
   (a) Convolution (ReLU):
       (i) Choose a kernel whose size is 4x4 and whose depth matches that of the input array.
       (ii) Convolutional processing is used to extract features of the disease-object image.
   (b) Pooling (Max Pooling):
       (i) Reduce the spatial size of the feature map using the dimensionality-reduction process, and then extract the 2x2 disease image.
2. To extract low-level features from the skin disease image, carry out the steps stated earlier until the fourth layer, altering the channel size to a value of 16, 32, 64, or 128.

Classification:
1. In each iteration of the training phase, the flattened output is sent to a feed-forward neural network with back propagation.
2. Using the SoftMax classification approach, the trained model is utilized to categorize images such as the disease-object image by identifying their dominant characteristics.

CNN is a powerful neural network technique for image categorization and skin disease recognition [13]. It comprises layers such as convolution, activation, pooling, and a classifier. The convolution layer extracts feature maps, while activation functions like sigmoid, tanh, and ReLU are used. The classifier layer selects the class with the highest probability. CNNs must handle large datasets and transfer-learning models, which can be adapted to new tasks. The skin disease detection algorithm utilizes CNN for image classification.
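The layer stack described above can be assembled along the following lines. This is a minimal sketch using the Keras API, not the authors' exact implementation: the input size (128x128 RGB) and the number of disease classes are illustrative assumptions, while the 4x4 kernels, 2x2 max pooling, channel sizes of 16/32/64/128, and the SoftMax classifier follow the steps listed above.

from tensorflow import keras
from tensorflow.keras import layers

num_classes = 7  # illustrative assumption: seven skin-disease categories

model = keras.Sequential()
model.add(keras.Input(shape=(128, 128, 3)))   # assumed [height, width, channel] input
for channels in (16, 32, 64, 128):            # channel size per stage, as in step 2
    model.add(layers.Conv2D(channels, kernel_size=4, padding="same",
                            activation="relu"))   # 4x4 kernel with ReLU (step 1a)
    model.add(layers.MaxPooling2D(pool_size=2))   # 2x2 max pooling (step 1b)
model.add(layers.Flatten())                       # flattened output fed forward
model.add(layers.Dense(128, activation="relu"))   # feed-forward classification layer
model.add(layers.Dense(num_classes, activation="softmax"))  # SoftMax classifier
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])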

Fig. 29.1 Skin disease patient image detection using CNN



3. Experimental Result

The reliability of the AI-based techniques is evaluated using performance metrics, diagnostic classifications, and experimental results from a 12000-record dataset (e.g., accessdata.fda.gov). The study compares the RGB, YCbCr, and CNN models for accuracy, sensitivity, specificity, precision, and F1-score [17]. Accuracy measures model performance across all samples, sensitivity measures how well positive samples are recognized, and precision measures how many of the predicted positives are truly positive. CNN, a deep learning approach, is trained on massive image datasets for skin disease diagnosis [17].

Table 29.1 Performance of the RGB, YCbCr, and CNN algorithms

Comparative Methods   Accuracy   Sensitivity   Specificity   Precision   F1-score
RGB                   0.80       0.81          0.80          0.81        0.81
YCbCr                 0.88       0.87          0.86          0.85        0.86
CNN                   0.98       0.98          0.98          0.98        0.99

Fig. 29.2 The graph compares the performance of the RGB, YCbCr, and CNN algorithms

Among the accuracy measurements for RGB, YCbCr, and CNN shown in the graphs, the convolutional neural network performs best.

4. Conclusion

The study assesses the effectiveness of AI-based technologies in detecting and classifying skin cancers using a 12000-record dataset, focusing on rhabdomyosarcoma, scoliosis, cardiac fibroma, and other skin tumors.

References
1. Borghesi A, Nardi C, Giannitto C, et al: Odontogenic keratocyst: imaging features of a benign lesion with an aggressive behavior, Insights Imaging, 2018; 9(5): 883–897. doi: 10.1007/s13244-018-0644-z, 2018.
2. Bree AF, Shah MR; BCNS Colloquium Group: Consensus statement from the first international colloquium on basal cell nevus syndrome (BCNS), Am J Med Genet A. 2011; 155A(9): 2091–2097. doi: 10.1002/ajmg.a.34128, 2011.
3. Bresler SC, Padwa BL, Granter SR: Nevoid basal cell carcinoma syndrome (Gorlin syndrome), Head Neck Pathology, 2016; 10(2): 119–124. doi: 10.1007/s12105-016-0706-9, 2016.
4. Erickson BJ: Magician's corner: how to start learning about deep learning, Radiol Artificial Intelligence, 1: e190072. doi: 10.1148/ryai.2019190072, 2019.
5. Jainesh Rathod, Vishal Waghmode, Aniruddh Sodha, and Prasenjit Bhavathankar: Diagnosis of skin diseases using Convolutional Neural Networks, IEEE Xplore. doi: 10.1109/ICECA.2018.8474593, 2018.
6. Kimonis VE, Goldstein AM, Pastakia B, et al: Clinical manifestations in 105 persons with nevoid basal cell carcinoma syndrome, Am J Med Genet. 1997; 69(3): 299–308, 1997.
7. Komal Chughtai, Rahul Gupta, Sunil Upadhaya, Samer Al Hadidi: Topical 5-Fluorouracil associated skin reaction, Oxf Med Case Reports, 2017 Aug; 2017(8): omx043. Published online 2017 Aug 10. doi: 10.1093/omcr/omx043, 2017.
8. Lev S, Furst K, Chern W: A pharmacokinetic evaluation of 0.5% and 5% fluorouracil topical cream in patients with actinic keratosis, Clinical Therapy, 2001; 23: 908, 2001.
9. Maytin EV, Kaw U, Ilyas M, Mack JA, Hu B: Blue light versus red light for photodynamic therapy of basal cell carcinoma in patients with Gorlin syndrome: a bilaterally controlled comparison study, Photodiagnosis Photodynamic Ther. 2018; 22: 7–13. doi: 10.1016/j.pdpdt.2018.02.009, 2018.
10. Micali G, Lacarrubba F, Nasca MR, De Pasquale R: The use of imiquimod 5% cream for the treatment of basal cell carcinoma as observed in Gorlin's syndrome, Clinical and Experimental Dermatology, 28 Suppl 1: 19–23. doi: 10.1046/j.1365-2230.28.s1.7.x, 2003.
11. Mitu Pal, Bristi Rani Roy: Evaluating and Enhancing the Performance of Skin Disease Classification Based on Ensemble Methods, IEEE Xplore. doi: 10.1109/ICAICT51780.2020.9333529, 2021.
12. Ortega García de Amezaga A, García Arregui O, Zepeda Nuño S, Acha Sagredo A, Aguirre Urizar JM: Gorlin-Goltz syndrome: Clinicopathologic aspects, Med Oral Patol Oral Cir Bucal, 13(6): E338–E343, 2008.
13. Pomerantz H, Hogan D, Eilers D, Swetter SM, Chen SC, Jacob SE, et al: Long-term efficacy of topical fluorouracil cream, 5%, for treating actinic keratosis: a randomized clinical trial, JAMA Dermatology, 151: 952, 2015.
14. Ribeiro PL, Souza Filho JB, Abreu KD, Brzezinski MS, Pignaton CC: Syndrome in question: Gorlin-Goltz syndrome, An Bras Dermatology, 91(4): 541–543. doi: 10.1590/abd1806-4841.20164428, 2016.
15. Spiker AM, Troxell T, Ramsey ML: Gorlin syndrome. In: StatPearls [Internet], StatPearls Publishing; Aug 8, 2023.
16. Spadari F, Pulicari F, Pellegrini M, Scribante A, Garagiola U: Multidisciplinary approach to Gorlin-Goltz syndrome: from diagnosis to surgical treatment of jawbones, Maxillofacial Plastic and Reconstructive Surgery, 2022; 44(1): 25. doi: 10.1186/s40902-022-00355-5, 2022.
17. YN Fu'adah: Convolutional Neural Network (CNN) for Automatic Skin, IOPscience, https://iopscience.iop.org, 2020.
18. Witmanowski H, Szychta P, Błochowiak K, Jundziłł A, Czajkowski R: Basal cell nevus syndrome (Gorlin-Goltz syndrome): genetic predisposition, clinical picture and treatment, Postepy Dermatol Alergol, 34(4): 381–387. doi: 10.5114/ada.2017.69323, 2017.

Note: All the figures and table in this chapter were designed by the author.

Machine Learning-Based Approach for


Detecting Online Payment Fraud 30

V. S. Naresh, G. Venkata Sridevi1, P. Srinivasarao2,


N. Hema Kiran3, CH. Sai Babu4, P. Lazar Dan5
Department of Computer Science and Engineering,
Sri Vasavi Engineering College, Tadepalligudem

Abstract: Online payment systems have become an integral part of the modern digital economy, facilitating convenient and efficient transactions. However, they are also susceptible to various types of fraudulent activities. The potential for substantial financial losses to both businesses and consumers underscores the urgency of addressing this escalating threat. Consequently, there is a crucial need to develop resilient fraud detection systems. The main goal is to construct an effective and efficient online payment fraud detection system that can promptly identify and thwart fraudulent transactions in real-time, thus enhancing security and preserving the integrity of digital payment platforms. Our methodology begins by collecting a comprehensive dataset containing transaction information, including transaction amount, location, time, and various other relevant features. This dataset forms the basis for training and evaluating our machine learning models, such as the Random Forest Classifier and Logistic Regression models, which can help detect fraudulent activities.
Keywords: Fraud detection, Random forest classifier, Logistic regression, Feature engineering

1. Introduction

In today's digital world, online payment systems offer unparalleled convenience but are susceptible to various fraudulent activities. To fortify the security of these systems, we're developing a sophisticated fraud detection system leveraging computer programs. Specifically, we employ two distinct algorithms: Random Forest and Logistic Regression.

Digital payment methods, while simplifying life, have unfortunately facilitated fraudulent activities such as fake credit card usage, identity theft, and unauthorized account access. To counter these threats and ensure the integrity of online transactions, swift identification and prevention of fraudulent activities are imperative. The paper explores leveraging machine learning algorithms such as Random Forest and Logistic Regression to analyze payment data, distinguishing legitimate from fraudulent transactions using a real-world dataset from European cardholders.

Furthermore, the study titled 'A Fraud Detection System Using Machine Learning' introduces a machine learning model designed to effectively identify both 'fraudulent' and 'genuine' transactions in real-time. Its broad applicability across sectors with financial associations holds promise for significantly enhancing transaction security. Moreover, the study entitled 'Machine Learning Approaches for Detecting Fraud in Online Payment Transactions' introduces a machine learning-based model that incorporates feature engineering to identify transaction fraud effectively. This model enhances its performance and stability by processing large volumes of data, allowing it to accumulate valuable experience within the realm of fraud detection.

Moreover, the paper titled 'Machine Learning-Based Online Transaction Fraud Detection System' tackles the challenge of identifying fraud in online transactions marked by low cost, extensive coverage, and high frequency. This study introduces two fraud detection algorithms, namely, the Fully Connected Neural Network and XGBoost.
1
gvsridevi04@gmail.com, 2pindisrinivas1@gmail.com, 3narayanahemakiran@gmail.com, 4sai.chinni112@gmail.com, 5lazardanpitta@gmail.com

DOI: 10.1201/9781003529231-30

This work is focused on the development and implementation of an Online Payment Fraud Detection System, leveraging machine learning techniques for the purpose of identifying and preventing fraudulent activities within online payment transactions. The primary aim is to create a robust and adaptable system that can efficiently differentiate between legitimate and fraudulent transactions in real-time while minimizing the occurrence of false positives.

The initiative encompasses several pivotal components, including the process of gathering data, conducting pre-processing tasks, implementing machine learning models, undertaking feature engineering, and evaluating model performance. Real-time monitoring and alert systems have been integrated to ensure rapid responses to potential fraudulent activities. The deployment onto a scalable cloud-based infrastructure stands as a crucial element, allowing seamless integration into existing online payment systems, prioritizing accessibility and scalability. Upholding data privacy and compliance with regulations remains a central focus to protect sensitive information and adhere to legal standards. Comprehensive documentation and reporting of methodologies, discoveries, and system implementations are integral aspects of this endeavour. The extent of this application's scope holds significant potential for advancing security measures.

3. Literature Review

The study titled "Real-Time Fraud Anomaly Detection in E-banking Using Data Mining Algorithm" aims to present an optimized model for the detection of financial fraud. This model leverages a combination of feature selection and machine learning classification, ultimately enhancing its fraud detection capabilities [1].

The research paper titled "Enhanced Credit Card Fraud Detection Model Using Machine Learning" explores machine learning models within a two-stage evaluation framework. These models are applied to a real-world dataset comprising credit card transactions from European cardholders. The research employs a stratified K-fold cross-validation technique in the evaluation process [2].

The paper titled "A Fraud Detection System Using Machine Learning" introduces a machine learning model designed to effectively identify both "fraudulent" and "genuine" transactions in real-time. The applicability of this solution extends to a broad spectrum of sectors with financial associations, offering significant benefits in the realm of transaction security [3].

The study titled "Fraud Detection in Online Payments using Machine Learning Techniques" has implemented a model based on machine learning that incorporates feature engineering to identify transaction fraud effectively. This model enhances its performance and stability by processing large volumes of data, allowing it to accumulate valuable experience within the domain of fraud detection [4].

The paper titled "Online Transaction Fraud Detection System Based on Machine Learning" addresses the challenge of detecting fraud in online transactions, where, in contrast to credit card transactions, fraud detection is complicated by factors such as low cost, extensive coverage, and high frequency. Addressing this issue, the study presents two fraud detection algorithms rooted in the Fully Connected Neural Network and XGBoost [5]. The paper titled "A Survey of Deep Learning-Based Online Transactions Fraud Detection Systems" offers an extensive examination of the utilization of deep learning methods within the domain of online transaction fraud detection, providing valuable insights into the application of these techniques to fraud detection for online transactions [6].

4. Existing System

The existing systems and technologies used for online payment fraud detection vary depending on the industry, the size of the organization, and specific requirements. However, here is an overview of the components commonly found in existing systems for online payment fraud detection.

4.1 Transaction Monitoring and Analysis
Conventional systems commonly utilize a blend of rule-based and machine learning algorithms to actively monitor and analyze payment transactions in real-time. These systems look for suspicious patterns and anomalies that may indicate fraudulent activity.

4.2 Data Sources
Data sources may include transaction data, user profiles, device information, IP addresses, and historical transaction records. These data sources help in identifying patterns and trends associated with fraudulent transactions.

4.3 User Authentication
Secure user authentication is crucial to prevent unauthorized access to the system. Multi-factor authentication (MFA) is commonly used to enhance security.

4.4 Dashboard and Reporting
Most systems include a dashboard for fraud analysts to monitor transactions and generate reports. Real-time dashboards display transaction data and alerts for immediate action.

4.5 Alerting and Notifications
Systems generate alerts and notifications when suspicious activities are detected. These can be sent to fraud analysts via email or SMS, or integrated with other communication tools.

4.6 Machine Learning Models
Machine learning models, including supervised and unsupervised learning, are used to discern patterns within transaction data that could potentially signify fraudulent activity. These models adapt and improve over time.

5. Proposed System

To propose a system for enhancing the existing online payment fraud detection system, the aim is to improve its features, security, and user-friendliness. Here is a proposed system that builds upon the existing one:

5.1 Improved User Interface
Develop a modern and user-friendly web-based interface using responsive design techniques. Create an intuitive and visually appealing dashboard for fraud analysts, displaying transaction data, alerts, and key performance indicators.

5.2 Real-time Data Visualization
Implement interactive data visualizations such as charts, graphs, and heat maps to provide a clear overview of transaction patterns and anomalies. Use real-time updates to display transaction data and alerts as they occur.

5.3 User Authentication and Authorization
Enhance the user authentication process by implementing multi-factor authentication (MFA) to strengthen security. Implementing role-based access control is essential to guarantee that only authorized personnel can access sensitive information.

5.4 Streamlined Workflow
Design an efficient workflow for fraud analysts, including case management, notes, and task assignment capabilities. Automate routine tasks to improve efficiency and reduce manual work.

5.5 Advanced Alerting and Notification System
Develop a robust alerting system that allows for customizable alert thresholds and notification preferences. Integrate with various communication channels, including email, SMS, and push notifications, to ensure prompt responses to potential fraud.

5.6 Machine Learning Enhancements
Continuously train and refine machine learning models for fraud detection to adapt to evolving fraud tactics. Implement anomaly detection algorithms to detect previously unknown fraud patterns.

5.7 Random Forest Classifier
The Random Forest algorithm stands as a foundational element in the field of machine learning, playing a crucial role in the development of online payment fraud detection systems. In the context of identifying and preventing fraudulent activities in online transactions, it provides a robust and flexible approach. Fundamentally, Random Forest is an ensemble learning technique that amalgamates multiple decision trees to collectively make predictions. Each decision tree processes transaction data, considering attributes such as transaction amount, location, and timestamp. However, what distinguishes Random Forest is its ensemble nature. By constructing a "forest" of decision trees trained on different subsets of the data, it excels in bolstering the reliability of fraud detection. This ensemble strategy not only safeguards against overfitting but also empowers the system to effectively address imbalanced data, a prevalent challenge in fraud detection.

5.8 Logistic Regression
Logistic Regression, despite its name suggesting regression, is a fundamental and extensively employed classification algorithm in the realm of machine learning. Its application extends to binary and multi-class classification problems, aiming to predict the probability of an input data point belonging to a particular class. To achieve this, Logistic Regression applies a logistic function to a linear combination of input features. The logistic function transforms the linear output into a probability score, offering an interpretable measure of the likelihood that the data point belongs to a specific class. In the context of online payment fraud detection, Logistic Regression proves valuable in assessing the probability of a given transaction being fraudulent or legitimate based on diverse features like transaction amount, location, and timestamp. Logistic Regression's output can be used to make binary decisions, such as flagging a transaction as fraudulent if the predicted probability exceeds a certain threshold, thus contributing to the enhancement of online payment security.
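As a concrete illustration of the two classifiers described above, the sketch below trains a Random Forest and a Logistic Regression model on transaction-style features. The feature columns follow the schema of Table 30.1, but the file name and the label column "isFraud" (named as in the public Kaggle dataset) are assumptions, not the authors' exact pipeline.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("online_payments.csv")   # hypothetical local copy of the dataset

X = pd.get_dummies(
    df[["type", "amount", "oldbalanceOrg", "newbalanceOrig",
        "oldbalanceDest", "newbalanceDest"]],
    columns=["type"])                      # one-hot encode the transaction type
y = df["isFraud"]                          # 1 = fraudulent, 0 = legitimate

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)                   # ensemble of decision trees (Section 5.7)

lr = LogisticRegression(max_iter=1000)
lr.fit(X_train, y_train)                   # probabilistic linear classifier (Section 5.8)

print("Random Forest accuracy:", rf.score(X_test, y_test))
print("Logistic Regression accuracy:", lr.score(X_test, y_test))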

6. Experiment Result

6.1 Dataset
In the context of machine learning for online payment fraud detection, the dataset is a critical element. This dataset typically comprises transaction data, with each entry containing various attributes and a binary label indicating whether the transaction is legitimate (0) or fraudulent (1). Essential components of such datasets include transaction attributes, such as the transaction amount, location, timestamp, and other relevant features that provide context for the machine learning model. Historical records are a crucial component of such datasets, enabling the machine learning model to learn patterns and trends in both legitimate and fraudulent transactions [1]. The parameters of this dataset are transaction type, amount, nameOrig, oldbalanceOrg, newbalanceOrig, nameDest, oldbalanceDest, and newbalanceDest; details are provided in Table 30.1.

Table 30.1 Kaggle dataset of online payments

Variables          Description                            Type
Transaction type   It states the type of the transaction  Categorical
Amount             Transaction amount                     Numerical
Name-origin        Sender's unique id                     ID
Dest-Origin        Receiver's unique id                   ID
Old-balance-org    Sender's balance before transaction    Numerical
New-balance-org    Sender's balance after transaction     Numerical
Old-balance-dest   Receiver's balance before transaction  Numerical
New-balance-dest   Receiver's balance after transaction   Numerical

6.2 Model Training and Tuning
Model training and tuning are crucial components of establishing an efficient online payment fraud detection system. Here are the steps and recommended practices for training and fine-tuning machine learning models for our work:

6.3 Data Collection and Pre-processing
Gather a diverse and comprehensive dataset of past payment transactions, including both legitimate and fraudulent cases. Prepare the data through cleaning, normalizing, and transforming processes. This may encompass tasks such as addressing missing values, encoding categorical variables, and scaling numerical features.

In Fig. 30.1, the pie chart employed in online payment fraud detection serves as a visual representation that effectively communicates the distribution of various transaction types within the dataset. It provides a clear and concise overview of the proportion of different transaction categories, each denoted by labels such as "Cash Out," "Cash In," "Debit," "Transfer," and "Payment."

Fig. 30.1 Distribution of transaction type

In Fig. 30.2, the graph displays the counts associated with different steps, providing valuable insights into the workflow. The "step" labels on the x-axis and the count on the y-axis enable a clear understanding of the frequency of each step within the process.

Fig. 30.2 Distribution of the step column using a plot
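To make the pre-processing of Section 6.3 concrete, the following hedged sketch shows imputation, categorical encoding, and feature scaling with scikit-learn. The column names mirror Table 30.1; the imputation strategies and pipeline structure are illustrative choices, not the authors' exact configuration.

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_cols = ["amount", "oldbalanceOrg", "newbalanceOrig",
                "oldbalanceDest", "newbalanceDest"]
categorical_cols = ["type"]

preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),   # address missing values
        ("scale", StandardScaler()),                    # scale numerical features
    ]), numeric_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),  # encode categoricals
    ]), categorical_cols),
])
# preprocess.fit_transform(df) yields a model-ready feature matrix.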

6.4 Feature Selection
Identify relevant features (predictors) that are likely to influence the model's ability to detect fraud. Feature selection can help reduce dimensionality and improve model performance.

6.5 Split Data into Training and Testing Sets
Partition the dataset into training and testing sets, and optionally, a validation set. The training set is employed to train the model, whereas the testing set is utilized to assess its performance.

6.6 Choose Appropriate Machine Learning Algorithms
Choose machine learning algorithms suitable for fraud detection, including but not limited to logistic regression, decision trees, random forests, support vector machines, or neural networks. Conduct experiments with various algorithms to ascertain which one yields the best performance for your specific dataset.

6.7 Model Training
Train the selected models on the training data. Ensure that you use appropriate hyperparameters for each model, such as learning rates, regularization strength, and batch sizes.

7. Result and Discussion

The Random Forest Classifier achieved an accuracy score of 0.9997367436684887.

Table 30.2 Obtained values for various metrics

              Precision   Recall   F1-Score   Support
Fraud         0.91        0.89     0.90       1700
No Fraud      1.00        1.00     1.00       1270824
Accuracy                           1.00       1272524
Macro avg     0.95        0.95     0.95       1272524
Weighted avg  1.00        1.00     1.00       1272524

In Fig. 30.3, the values in the confusion matrix for the Random Forest Classifier represent different metrics used to evaluate the model's performance in distinguishing between classes (Fraud and No Fraud) based on the predictions made.

Fig. 30.3 Random forest confusion matrix

Accuracy Score (Overall Performance): This value indicates the model's overall correctness in its predictions. An accuracy score of 0.9997 means the model is incredibly accurate, correctly predicting almost all cases.

Precision: Precision gauges the accuracy of positive predictions made by the model. In the context of the "Fraud" class, it signifies the proportion of predicted fraud cases that were genuinely fraudulent. A precision score of 0.91 implies that 91% of the cases predicted as fraud by the model were indeed fraudulent.

Recall (Sensitivity): Recall assesses the model's capability to correctly identify all positive instances. Specifically for "Fraud," it indicates the percentage of actual fraud cases that the model successfully recognized. With a recall score of 0.89, the model identified 89% of the actual fraudulent cases.

F1-Score: The F1-score serves as the harmonic mean of precision and recall, offering a balanced metric between the two. A higher F1-score, such as 0.90 in this case for fraud, signifies improved overall performance by striking a balance between precision and recall.

Support: These are the actual counts of instances in each class. For instance, there were 1700 instances of fraud and 1270824 instances of no fraud in the dataset.

Macro Average and Weighted Average: These values are the average scores across all classes, considering either their equal weight (macro average) or their support, i.e., the number of instances (weighted average).

In Fig. 30.4, the ROC (Receiver Operating Characteristic) curve's vertical axis signifies the True Positive Rate, indicating the proportion of correctly identified positive cases among all actual positives. Simultaneously, the horizontal axis represents the False Positive Rate, depicting the ratio of incorrectly identified negative cases among all actual negatives. The ROC curve illustrates the trade-off between true positives and false positives across different thresholds, providing insights into the model's performance under varying classification scenarios.
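The metrics reported in Table 30.2 and Fig. 30.3 can be reproduced from a fitted model's test-set predictions along the following lines. This is a hedged continuation of the earlier sketch, assuming rf, X_test, and y_test are still in scope from there.

from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix)

y_pred = rf.predict(X_test)   # predictions of the fitted classifier

print("Accuracy:", accuracy_score(y_test, y_pred))
# Per-class precision, recall, F1-score, and support, as in Table 30.2
# (label 0 = "No Fraud", label 1 = "Fraud").
print(classification_report(y_test, y_pred, target_names=["No Fraud", "Fraud"]))
# TP/FP/TN/FN counts, as visualized in the confusion matrix of Fig. 30.3.
print(confusion_matrix(y_test, y_pred))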

Fig. 30.4 ROC curve (true positive rate vs. false positive rate)
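Curves like those in Fig. 30.4 and Fig. 30.5 can be generated from the model's predicted probabilities; a hedged continuation of the earlier sketches, again assuming rf, X_test, and y_test from there.

import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve, roc_curve

proba = rf.predict_proba(X_test)[:, 1]               # predicted probability of fraud

fpr, tpr, _ = roc_curve(y_test, proba)               # Fig. 30.4-style ROC curve
prec, rec, _ = precision_recall_curve(y_test, proba) # Fig. 30.5-style PR curve

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(fpr, tpr)
ax1.set_xlabel("False Positive Rate")
ax1.set_ylabel("True Positive Rate")
ax2.plot(rec, prec)
ax2.set_xlabel("Recall")
ax2.set_ylabel("Precision")
plt.show()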

In Fig. 30.5, the precision-recall graph depicts how the model's performance varies with the different thresholds set for classification.

Threshold on X-axis: This represents the varying decision thresholds applied by the model to classify instances. As the threshold changes, the model adjusts its classification, affecting the balance between precision and recall.

Score on Y-axis: The score denotes the harmonic mean of precision and recall at each threshold, offering insights into the trade-off between precision (accuracy of positive predictions) and recall (sensitivity to true positive instances) as the threshold undergoes changes. This metric provides a balanced assessment of the model's performance, considering both precision and recall simultaneously.

Fig. 30.5 Precision recall curve

8. Conclusion

In conclusion, the endeavour to develop the front-end of an online payment fraud detection system represents a significant step towards enhancing the security and efficiency of online payment processing. The work has addressed critical aspects related to user interface design, user authentication, real-time monitoring, data visualization, and the overall user experience, contributing to combating fraudulent activities in the digital economy.

Through the systematic methodology followed in this work, we've effectively designed, developed, and deployed a user-friendly front-end, providing fraud analysts with essential tools and insights to identify and address potential online payment fraud. The work has been guided by a profound understanding of existing systems and a dedication to enhancing them to address the evolving challenges presented by fraudulent activities.

In conclusion, this project represents a significant step forward in the ongoing battle against online payment fraud, and it stands as a testament to the importance of user-centric design, innovation, and adaptability in the realm of online security and fraud prevention.

References
1. Kamesh, V., Karthick, M., Kavin, K., Velusamy, M. and Vidhya, R. (2019) "Real-Time Fraud Anomaly Detection in E-banking Using Data Mining Algorithm," South Asian Journal of Engineering and Technology.
2. Noor Saleh Alfaiz and Suliman Mohamed Fati (2022) "Enhanced Credit Card Fraud Detection Model Using Machine Learning." Available: https://www.researchgate.net/publication/358780473_Enhanced_Credit_Card_Fraud_Detection_Model_Using_Machine_Learning
3. Kalande, D. and Prabhu, P. (2021) "A Fraud Detection System Using Machine Learning," in International Conference on Computing Communication and Networking Technologies (ICCCNT), Kharagpur, India, doi: 10.1109/ICCCNT51525.2021.9580102.
4. Siddaiah, U. and Anjaneyulu, P. (2023) "Fraud Detection in Online Payments using Machine Learning Techniques," in International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, doi: 10.1109/ICICCS56967.2023.10142404.
5. Liu, B. and Chen, X. (2021) "Online Transaction Fraud Detection System Based on Machine Learning," Journal of Physics: Conference Series.
6. Singla, J. (2020) "A Survey of Deep Learning-Based Online Transactions Fraud Detection Systems," in 2020 International Conference on Intelligent Engineering.
7. Ranjan, P. and Santhosh, K. (2022) "Fraud Detection on Bank Payments Using Machine Learning," in 2022 International Conference for Advancement in Technology (ICONAT), Goa, India, doi: 10.1109/ICONAT53423.2022.9726104.
8. Aladakatti, D. and Kodipalli, A. (2022) "Fraud Detection in Online Payment Transaction using Machine Learning Algorithms," in International Conference on Smart and Sustainable Technologies in Energy and Power Sectors (SSTEPS).
9. Available: https://www.researchgate.net/publication/370975314_Fraud_detection_in_Online_Payment_Transaction_using_Machine_Learning_Algorithms
10. Kho, J. R. D. and Vea, L. A. (2017) "Credit card fraud detection based on transaction behavior," in TENCON 2017 - 2017 IEEE Region 10 Conference, Penang, Malaysia, doi: 10.1109/TENCON.2017.8228165.
11. Carrasco, R. S. M. and Sicilia-Urban, M. A. (2020) "Evaluation of deep neural networks for reduction of credit card fraud alerts," in IEEE Access, 2020.
12. Viswanatha, V. and Ramachandra (2023) "Online Fraud Detection Using Machine Learning Approach," International Journal of Engineering and Management Research, Volume-13, Issue-4. Available at SSRN: https://ssrn.com/abstract=4533856. Dataset: https://www.kaggle.com/datasets/rupakroy/online-payments-fraud-detection-dataset

Note: All the figures and tables in this chapter were designed by the author.

Secure Loan Approval Prediction:


A Privacy-Preserving Machine
Learning Approach
31

V. S. Naresh1, K. Sushmadi Lakshmi2, S. Swathi Rathnam3,


G. Lakshmi Ishwarya4, D. Kirankumar5, T. Swathi Ratnam6
Department of Computer Science and Engineering,
Sri Vasavi Engineering College(A), Pedatadepalli, Tadepalligudem

Abstract: In the era of data-driven decision-making, utilising machine learning to forecast loan approval has become increasingly prevalent. However, the sensitive nature of financial data poses significant privacy concerns. Our methodology employs advanced privacy-enhancing technologies to safeguard the confidentiality of sensitive financial information while still achieving accurate loan predictions. We utilise secure homomorphic encryption techniques to ensure privacy, allowing multiple parties to collaborate on model training without revealing their data. The proposed approach combines state-of-the-art encryption techniques with homomorphic privacy mechanisms to ensure robust privacy protection while maintaining high model accuracy and utility in different phases. The privacy budget is quantified and controlled to ensure a balance between model accuracy and data privacy. This research contributes to the growing work on privacy-preserving machine learning models. It offers a viable solution for financial institutions seeking to harness the power of data-driven loan prediction while upholding the highest data privacy and security standards.
Keywords: Decision making, Data-driven, Loan approval, Privacy preserving machine learning etc.

1. Introduction

Protecting privacy while predicting loans using homomorphic encryption in machine learning and logistic regression is a groundbreaking approach that addresses the sensitive nature of financial data while making accurate lending decisions. This technique combines the power of machine learning with cryptographic methods to ensure data privacy and security.

In the contemporary landscape of finance and lending, machine learning has emerged as a predictive tool for loan assessments. However, this promising technology is not devoid of its challenges, especially regarding data privacy. The very essence of machine learning demands access to vast datasets, often comprising sensitive information that individuals rightfully expect to be kept confidential. In response to this critical concern, our project concentrates on developing a Privacy-Preserving Loan Prediction system within the domain of machine learning. We recognise the imperative to protect sensitive financial and personal data from prying eyes. Therefore, our approach revolves around the integration of homomorphic encryption, a sophisticated cryptographic technique that enables operations on encrypted data, ensuring the privacy and security of the information throughout the loan prediction process. This work endeavours to harmonise the potency of machine learning with the imperative of data privacy, fostering a safe and confidential environment for the loan prediction task.

In machine learning-based loan prediction, a thorough investigation has taken place to enhance the accuracy and efficiency of credit risk assessment in financial institutions. Previous research predominantly concentrated on data preprocessing techniques, feature engineering, and model

1
vsnaresh111@gmail.com, 2suahmakurella@gmail.com, 3swathisannidhi04@gmail.com, 4gantalakshmiishwarya85@gmail.com,
5
kirankirru232@gmail.com, 6tennetiswathi@gmail.com

DOI: 10.1201/9781003529231-31

selection to develop robust and reliable loan prediction models. Researchers have explored many algorithms, including traditional ones like logistic regression and decision trees, and advanced techniques such as random forests, neural networks, and support vector machines. Moreover, integrating alternative data sources, such as social media activity and transaction history, has been a recurring theme, contributing to the diversification of features used for prediction. Furthermore, studies have tackled the challenges of class imbalance, interpretability, and model explainability to guarantee accurate forecasting; the predictions made by these models align with regulatory requirements and can be comprehended by stakeholders. Research in this area offers critical understanding regarding the development of loan prediction models, highlighting both the successes and challenges that have shaped the development of this crucial application in the financial domain.

2. Related Work

Kumar et al. (2019) [1]: This study uses machine learning techniques to predict loan approval. Supriya et al. (2019) [2]: The paper explores machine learning models for loan prediction, emphasizing their application in decision-making processes.

Arun Kumar et al. (2016) [3] and (2016) [13]: These papers discuss loan approval prediction based on machine learning approaches, providing insights into different algorithms and methodologies.

Ashwitha et al. (2022) [4]: The study proposes an approach for predicting loan eligibility using machine learning, potentially introducing novel methods or features. Chawan et al. (2022) [5]: The research focuses on a "Bank Loss Predictor," likely delving into the prediction of potential losses in the banking sector related to loans.

Barua et al. (2021) [6]: The paper introduces "Swindle," a system predicting the probability of loan defaults using the CatBoost algorithm.

Kirubanantham et al. (2021) [7]: The study explores credit sanction forecasting, likely discussing methods to forecast credit sanctions in the context of loan approval. Sheikh et al. (2020) [8] and (2020) [19]: These papers present an approach for predicting loan approval using machine learning algorithms, potentially providing insights into the algorithmic choices and features used.

Soni and Paul (2019) [9]: The paper discusses an algorithm for a loan credibility prediction system, likely providing insights into the features and methods employed. Jency et al. (2018) [10]: The research involves an exploratory data analysis for loan prediction based on the nature of clients, potentially shedding light on the importance of client-related factors.

Vimala and Sharmili (2018) [11]: This research focuses on predicting loan risk using Naive Bayes and Support Vector Machine, offering insights into the comparative performance of these algorithms.

Priya et al. (2018) [12]: The paper explores exploratory analysis on the prediction of loan privilege for customers using random forest, potentially introducing insights into the random forest approach for loan prediction.

Ibrahim et al. (2020) [15]: The study compares the CatBoost classifier with other machine learning methods, providing insights into the comparative performance of different algorithms.

Tejaswini et al. (2020) [16]: The research focuses on accurate loan approval prediction based on a machine learning approach, potentially introducing novel techniques for improving prediction accuracy.

Gupta et al. (2020) [17]: The paper discusses a bank loan prediction system using machine learning, likely providing insights into the system architecture and predictive models employed.

Vaidya (2017) [18, 19]: This study employs logistic regression for predicting loan approval, potentially providing insights into the predictive and probabilistic approach using this algorithm.

Singh et al. (2021) [20]: The research focuses on predicting a modernized loan approval system based on a machine learning approach, potentially discussing integrating contemporary technologies.

3. Preliminaries

3.1 Logistic Regression
Logistic regression, a flexible linear regression analysis model, is predominantly employed for supervised learning tasks. It finds applications in regression, binary classification, and multi-classification problems. Implementing logistic regression typically involves three key steps: defining a prediction function, creating a loss function, and optimising the regression parameters to minimise this loss. A cost function is established initially when using logistic regression for regression or classification. Then, an iterative optimisation technique is employed to identify the optimal model parameters. Finally, the model's performance is assessed.

The prediction function in logistic regression is closely tied to the Sigmoid function, expressed as:

S(x) = 1 / (1 + e^(-x))    (1)

In this equation, 'x' represents a variable. Figure 31.1 visually illustrates the Sigmoid function curve, highlighting its ability to map outcomes smoothly into the range 0 to 1.

Fig. 31.1 Logistic regression sigmoid function

The logistic regression prediction function can be derived from Equation (1) as follows:

g(x) = S(w^T x) = 1 / (1 + e^(-w^T x))    (2)

Here, 'w' is a parameter to be determined during the model training process.

3.2 Partial Homomorphic Encryption (PHE)

PHE ensures the confidentiality of sensitive data by allowing mathematical operations (addition or multiplication) exclusively on encrypted data. This process can be repeated indefinitely. Partial Homomorphic Encryption facilitates circuit evaluation, accommodating only one gate type at a time: either addition or multiplication.

Homomorphic encryption mainly involves the following security operations: encryption, decryption, secure multiplication, and secure addition. Let us take the public key as pk and the secret key (private key) as sk. Suppose that there are two numbers (plaintexts) n1 and n2. Firstly, we need to encrypt the plaintext into ciphertext, that is,

Enc(pk, n1) -> c1
Enc(pk, n2) -> c2

After that, we perform the addition and multiplication operations on the ciphertext:

c1 + c2 = CA
c1 x c2 = CM

Now we decrypt with the help of the secret key sk:

Dec(sk, CA) -> a
Dec(sk, CM) -> m

where

a = n1 + n2 and m = n1 x n2    (3)
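The encrypt-operate-decrypt flow above can be exercised with the python-paillier library. Note that Paillier is additively homomorphic: ciphertexts can be added directly, while multiplication is supported only by a plaintext scalar. The sketch below is therefore an illustration under those assumptions, not the authors' exact scheme.

from phe import paillier   # python-paillier

pk, sk = paillier.generate_paillier_keypair()

n1, n2 = 15, 27
c1 = pk.encrypt(n1)        # Enc(pk, n1) -> c1
c2 = pk.encrypt(n2)        # Enc(pk, n2) -> c2

c_add = c1 + c2            # addition carried out directly on ciphertexts
c_mul = c1 * n2            # multiplication only by a *plaintext* scalar

assert sk.decrypt(c_add) == n1 + n2    # Dec(sk, CA) -> a = n1 + n2
assert sk.decrypt(c_mul) == n1 * n2    # recovers n1 x n2 via scalar multiplication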


4. Proposed System

In this section, we present Privacy-Preserving Logistic Regression for loan prediction.

4.1 Privacy-Preserving Logistic Regression for Loan Prediction

DataSet: The loan prediction dataset [7] contains 12 attributes, including Gender, Education, Dependents, Marital Status, Self-Employment status, Applicant and Co-applicant Incomes, Loan Amount, Loan Term, Credit History, and two binary attributes indicating Property Area (Rural and Semiurban). It is typically used for predicting loan approval based on these attributes, where Credit History, Income, and Property Location often play crucial roles. This dataset is valuable for machine learning and financial analysis to make informed lending decisions.

Our privacy-preserving loan prediction model is built from a carefully constructed set of steps, each aimed at ensuring the security and confidentiality of sensitive financial and personal data. These steps include:

5. Training Data Privacy

5.1 Encrypting the Dataset
The first step in our methodology involves isolating sensitive loan applicant data. We employ a robust privacy-preserving method known as homomorphic encryption to achieve this. Specifically, we encrypt the dataset using the banker's public key, as presented in Fig. 31.2. This encryption process ensures the data remains confidential and secure during the entire machine learning model development phase.

Fig. 31.2 Training data privacy
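A minimal sketch of the Section 5.1 step, encrypting each record with the banker's public key before it leaves the premises, could look as follows. The column subset and helper function are hypothetical, and the tiny DataFrame is a toy stand-in for the actual dataset.

import pandas as pd
from phe import paillier

pk, sk = paillier.generate_paillier_keypair()   # banker's key pair

# Toy stand-in for a few numeric columns of the loan dataset.
df = pd.DataFrame({"ApplicantIncome": [5849, 4583],
                   "LoanAmount": [128.0, 66.0],
                   "Credit_History": [1, 1]})

def encrypt_record(row, public_key):
    # Encrypt every numeric field of one loan-application record.
    return [public_key.encrypt(float(v)) for v in row]

encrypted_rows = [encrypt_record(row, pk) for _, row in df.iterrows()]
# encrypted_rows is what would be transmitted to the third-party cloud;
# only the holder of sk can recover the underlying values.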



Once the dataset is encrypted, it is securely transmitted to a third-party organisation (cloud), maintaining the confidentiality of the applicant's personal and financial information. This step is significant in safeguarding loan applicants' privacy while allowing for the development of predictive models.

6. Model Privacy

6.1 Model Development with Homomorphic Encryption
In the second phase of our methodology, the third-party organisation (cloud), which has received the encrypted dataset, proceeds to build a logistic regression model for loan prediction without ever decrypting the data. This is achieved through homomorphic encryption, a groundbreaking cryptographic technique that allows calculations to be executed on encrypted data without revealing the underlying information. By applying homomorphic encryption to the dataset, the third party can train and fine-tune the logistic regression model while maintaining the confidentiality of the loan applicant's sensitive attributes, such as income, credit history, and personal details. This verifies that the applicants' privacy is maintained throughout the model-building process.

6.2 Decryption and Model Evaluation
Once the third party has successfully developed the logistic regression model, the encrypted coefficients and model parameters are transmitted back to the banker. In the third phase of our methodology, the banker, who possesses the necessary private key, can decrypt these coefficients. Decryption enables the banker to access the logistic regression model's insights and predictions, which are essential for making loan decisions. This phase allows the banker to evaluate the logistic regression model's performance, assess the risk associated with loan applicants, and make informed lending decisions while keeping the applicants' sensitive data confidential. By employing this secure and privacy-preserving approach, we balance data utility and privacy in loan prediction, ensuring the protection of individuals' personal information while maintaining the efficacy of the lending process.

7. Testing Data Privacy

7.1 Input Data Privacy
The new encrypted data record for prediction:
Following the successful development of the loan prediction model, the banker, as a trusted party, can securely collect encrypted data from the loan applicants. This data is encrypted with the banker's public key, ensuring that sensitive information remains confidential during transmission. This step preserves the privacy of the loan applicants' personal and financial data, as it is only accessible to the trusted banker, which provides input privacy.

Decryption and Model Application:
Once the encrypted data is in the banker's possession, they can decrypt it using their private key. This decryption process allows the banker to apply the previously obtained coefficients and intercept from the machine learning model to generate a loan prediction result for each applicant. Importantly, this phase maintains the privacy of the loan applicants' data, as the decryption is performed by the trusted banker responsible for safeguarding the confidentiality of the information.

7.2 Output Privacy
Secure Result Transmission to the Client:
Upon determining the loan approval status for each applicant, the banker securely communicates the results back to the respective clients. The result information is encrypted using the client's public key before transmission. This encryption guarantees that the loan decision remains confidential throughout the communication process. Only the authorised client, possessing the corresponding private key, can decrypt and access their loan approval status.

Client Decryption and Final Decision:
In the final phase of our methodology, the loan applicants receive the encrypted loan approval result from the banker. The clients can then decrypt this information using their private key. This decryption process allows the clients to securely and confidentially obtain their loan decision, whether it is approved or not. By employing end-to-end encryption and secure key management, the privacy of the sensitive loan application data is preserved throughout the entire process, starting from data gathering to final decision delivery.

Fig. 31.3 Testing data privacy

Input and output privacy in the testing phase are presented in Fig. 31.3; the privacy and confidentiality of sensitive data

are paramount. Homomorphic encryption and the secure transmission of encrypted data and results ensure that only authorised parties, such as the banker and the clients, can access and interpret the information, thereby safeguarding the personal and financial details of the loan applicants while facilitating the loan approval process.

8. Experimental Results

We have experimented on a well-known bank loan dataset, consisting of twelve features and 4269 records, which decides whether to approve the loan. The dataset is partitioned into 80 percent training and 20 percent testing samples. The code is developed using Python 3.10.12 on Google Colab, with an Intel(R) Xeon(R) CPU @ 2.20 GHz. Coming to the results, we take accuracy, the ROC curve, and the confusion matrix as parameters to determine the efficiency of our model.

8.1 Confusion Matrix
In our binary loan prediction classification task, we employ a 2x2 confusion matrix (Fig. 31.4) to assess our privacy-preserving machine learning model's performance comprehensively. This matrix comprises four essential components, each shedding light on a distinct aspect of the model's performance:
• True Positives (TP): These are instances where the model correctly predicts a loan as approved, and it is indeed approved. TP indicates the number of successful loan approvals accurately predicted by the model.
• False Positives (FP): FP occurs when the model incorrectly predicts a loan as approved when it is in fact denied. This represents the number of loans mistakenly approved when they should not have been.
• True Negatives (TN): TN signifies instances where the model correctly predicts a loan as denied, and it is indeed denied. TN represents the number of accurately predicted loan denials.
• False Negatives (FN): FN represents cases where the model incorrectly predicts a loan as denied when it was actually approved. FN indicates the number of missed opportunities where a loan should have been approved but was not.

Fig. 31.4 Confusion matrix
Fig. 31.5 ROC curve

8.2 Accuracy
Our privacy-preserving machine learning model for loan prediction has demonstrated a commendable level of accuracy, achieving a consistent 74% accuracy rate across our experiments. Importantly, this accuracy was maintained even when the model was applied to encrypted data, emphasising the robustness and effectiveness of our privacy-preserving techniques. Maintaining a high level of predictive performance while safeguarding sensitive data through encryption is a significant accomplishment, as it aligns with the paramount importance of preserving privacy in financial applications. This finding underscores the feasibility of employing privacy-preserving machine learning approaches to address the complex challenges of secure and accurate loan prediction in a world increasingly concerned with data privacy and security.

8.3 ROC Curve
Furthermore, an essential aspect of evaluating the effectiveness of our privacy-preserving loan prediction model is the analysis of its Receiver Operating Characteristic (ROC) curve. The ROC curve confirms our model's predictive capability and provides insights into its discrimination performance. The ROC curve, presented in Fig. 31.5, consistently



lies above the diagonal line representing random classification, underscoring the model's superiority over a random classifier. This observation indicates our model's ability to strike a meaningful balance between true positive and false positive rates. The consistently superior performance exhibited by our model in this regard reinforces its reliability and underscores its potential as a valuable tool for financial institutions. In conclusion, the ROC curve analysis affirms that our model is a sound and robust solution for loan prediction tasks.

9. Conclusion

In conclusion, our journey in developing a privacy-preserving loan prediction model using the powerful combination of homomorphic encryption and logistic regression has yielded exceptional results. Our use of homomorphic encryption technology has been pivotal in achieving the dual goals of protecting data privacy and maintaining high accuracy. Through the implementation of this cutting-edge encryption technique, we ensured that sensitive customer information remained confidential and unreadable by external entities, adhering to stringent privacy standards and regulations. Maintaining an impressive accuracy rate, notably at 74%, showcases the efficacy of our privacy-preserving approach. Furthermore, our model's ROC curve consistently surpassing the random classifier highlights its superiority and robustness in distinguishing between loan approval and denial. This dual accomplishment underscores the potential of preserving data privacy without sacrificing predictive power. Our work paves the way for a future where privacy and accuracy coexist harmoniously, offering a robust solution to the challenge of secure loan prediction while respecting data privacy in the financial sector and beyond. Future research could explore the optimisation of these methods and their adaptation to different data types and use cases. In the realm of loan prediction, our model serves as a testament to the feasibility of achieving high accuracy while safeguarding sensitive data.

References
1. Kumar, R., et al. (2019). "Prediction of loan approval using machine learning." International Journal of Advanced Science and Technology, 28(7), 455–460.
2. Supriya, P., et al. (2019). "Loan prediction by using machine learning models." International Journal of Engineering and Techniques, 5(2), 144–147.
3. Arun, Kumar, Garg Ishan, and Kaur Sanmeet. (2016). "Loan approval prediction based on machine learning approach." IOSR J. Comput. Eng, 18(3), 18–21.
4. Ashwitha, K., et al. (2022). "An Approach to Predict Loan Eligibility using Machine Learning." 2022 International Conference on Artificial Intelligence and Data Engineering (AIDE).
5. Chawan, Brijesh, et al. (2022). "Bank Loss Predictor." 2022 3rd International Conference for Emerging Technology (INCET).
6. Barua, Sujoy, et al. (2021). "Swindle: Predicting the probability of loan defaults using catboost algorithm." 2021 5th International Conference on Computing Methodologies and Communication (ICCMC).
7. Kirubanantham, P., Saranya, A., and Kumar, D. S. (2021). "Credit Sanction Forecasting." 2021 4th International Conference on Computing and Communications Technologies (ICCCT).
8. Sheikh, M. A., Goel, A. K., and Kumar, T. (2020). "An approach for prediction of loan approval using machine learning algorithm." 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC).
9. Soni, P. M., and Paul, V. (2019). "Algorithm for the loan credibility prediction system." Int J Recent Technol Eng, 8(1S4), 1080–1087.
10. Jency, X. Francis, Sumathi, V. P., and Shiva Sri, J. (2018). "An exploratory data analysis for loan prediction based on nature of the clients." International Journal of Recent Technology and Engineering (IJRTE), 7(4), 17–23.
11. Vimala, S., and Sharmili, K. C. (2018). "Prediction of Loan Risk using NB and Support Vector Machine." ICACT 2018, Volume 4, Issue 2, 110–113.
12. Priya, K. U., et al. (2018). "Exploratory analysis on prediction of loan privilege for customers using random forest." International Journal of Engineering Technology, 7(2.21), 339–341.
13. Arun, Kumar, Garg Ishan, and Kaur Sanmeet. (2016). "Loan approval prediction based on machine learning approach." IOSR J. Comput. Eng, 18(3), 18–21.
14. Kaggle Dataset: https://www.kaggle.com/burak3ergun/loan-data-set
15. Ibrahim, A. A., et al. (2020). "Comparison of the CatBoost classifier with other machine learning methods." International Journal of Advanced Computer Science and Applications, 11(11), 2020.
16. Tejaswini, J., et al. (2020). "Accurate loan approval prediction based on machine learning approach." Journal of Engineering Science, 11(4), 523–532.
17. Gupta, A., Pant, V., Kumar, S., and Bansal, P. K. (2020). "Bank Loan Prediction System using Machine Learning." 2020 9th International Conference System Modeling and Advancement in Research Trends (SMART).
18. Vaidya, A. (2017). "Predictive and probabilistic approach using logistic regression: Application to prediction of loan approval." 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT).
19. Sheikh, M. A., Goel, A. K., and Kumar, T. (2020). "An Approach for Prediction of Loan Approval using Machine Learning Algorithm." 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC).
20. Singh, V., Yadav, A., Awasthi, R., and Partheeban, G. N. (2021). "Prediction of Modernized Loan Approval System Based on Machine Learning Approach." 2021 International Conference on Intelligent Technologies (CONIT).

Note: All the figures in this chapter were designed by the author.

AI with Edge Computing-Driven


Development in Healthcare Analysis 32

K. Vijaya Naga Valli1


Research Scholar, Sathyabama Institute of Science and Technology,
Chennai, India
L. Sujihelen2
Associate Professor, Sathyabama Institute of Science and Technology,
Chennai, India

Abstract: An emerging paradigm, edge computing reduces network latency and expenses by putting networks and devices near clients, enabling faster, more efficient, real-time data processing. Edge computing is crucial in the IoT for analysing and responding to device-generated data in real time, and the increasing integration of IoT technology will provide faster insights. Internet of Things (IoT) devices can assist hospitals in managing assets, including big datasets, monitoring patients' vital signs like heart rate and blood pressure, and tracking medication adherence. By combining AI with edge computing, we aim to enhance several areas of healthcare: patient scheduling, diagnosis, treatment and satisfaction; remote monitoring of chronic diseases; care; doctor response time; and staff compliance. To forecast diseases, this article applies regression analysis to healthcare data together with an RNN encoder and decoder architecture for sequence-to-sequence prediction. The goal of the experimental investigation is to find the optimal fit for a set of points in healthcare data, using the RNN encoder and decoder architecture for disease prediction.
Keywords: Artificial intelligence, Edge computing, Regression analysis, RNN encoder and decoder

1. Introduction

The goal of edge computing is to decentralise networks by lowering latency, improving response time, and minimising bandwidth consumption by moving processing and storage closer to the source. According to Xu H. (2021) [1], its primary goal is to move data processing to the network's periphery through the use of gateways or smart objects. The difficulties of dealing with massive data sets, depleting network bandwidth, and growing reaction times are all addressed by this method (Chen S., 2019 [2]). By managing the Internet of Things (IoT), improving data storage, and streamlining service delivery, edge computing can lower reaction times and transfer rates. By utilising the 5G data network, edge computing can alleviate the problems of slow data communication and long-distance processing (Li J., 2020 [3]). Edge computing, a new technology, enables rapid data response, saving time, money, and maintenance expenses while processing in real time with near-zero latency. According to Ali O. 2023 [4], edge computing also enables effective processing at huge scales, which saves internet bandwidth and reduces expenses. By providing a secure layer, it prevents sensitive data from being stored in public clouds. During pandemics like COVID-19, wearable health monitors such as fitness trackers and smartwatches are crucial for real-time analysis (Sun L., 2021 [5]). Edge computing speeds up data collection and processing for clinicians, enabling better treatment and an extra safeguard for patient-generated health

1kvn.valli27@gmail.com, 2sujihelen@gamil.com

DOI: 10.1201/9781003529231-32

data (PGHD). Nonetheless, as of Gupta PM, 2023 [6], worries around privacy and data security persist. When used in offline mode, these devices can assess and track patients without an internet connection.

Clinical decision support (CDS) is increasingly important in today's healthcare systems, as is the use of medical devices such as tablets, wearables, health monitors, and imaging systems driven by artificial intelligence (AI) to improve patient care (Jia Z., 2023 [7]). Health monitors and wearables can detect possible problems in X-rays and prioritise their assessment by radiologists or doctors; they can notify medical workers of problems; they can assist with remote treatment; and they can give real-time updates on patients' vitals. According to Lin H. (2020) [8], new healthcare technologies are changing workflows, cutting costs, and increasing patient care. On the other hand, they produce massive volumes of data, which calls for choices between managing data locally and in the cloud. Wu F. (2021) notes that edge computing can supplement cloud computing by bringing analytics, storage, and processing closer to the data sources. As a result, health systems are better able to handle the roughly 50 petabytes of data generated yearly. When it comes to storing, analysing, and processing data, health systems and providers are turning to cloud computing. By Saeidimesineh R., 2023 [10], they are creating a new approach to data management that takes into account requirements, expenses, and advantages. It could be helpful to provide just summary totals at set intervals and to limit the amount of data sent from patients' wearables to the cloud. Sreelakshmi S., 2023 [11] states that while cloud storage is ideal for larger operational data, on-premise storage is necessary to comply with federal privacy regulations pertaining to health information. By utilising software and hardware technologies that take advantage of edge computing, artificial intelligence, and cloud connectivity, the technology enhances the collection, analysis, and synthesis of health data. The technology portfolio enables virtualization, enhances security, and reduces IT loads. Improved healthcare CDS, quicker diagnostics, and patient monitoring are all possible because of analytics and AI that can move from the edge to the cloud (Dash S., 2019). According to Jimma BL (2023) [13], artificial intelligence is being used to develop cutting-edge healthcare solutions that consolidate various devices, apps, and services into one platform, making use of the resources already available in the cloud and data centres. Recent work by Rahman A. 2023 [14] shows that providers can increase their clinical value by using edge computing and analytics. For instance, according to Rehman MU, 2023 [15], deep learning (DL) methods hold promise in biological image analysis for the early detection of acute diseases through the management of massive amounts of pertinent data, ultimately leading to improved healthcare efficiency. The volume of data that needs to be sent to the cloud is one of the biggest problems with s-health systems; edge-based processing, such as compression and event recognition, can fix this. Based on Radanliev's (2023) work on designing cyber risk assessments for AI and Industry 4.0 healthcare systems, this paper posits that cyber risk in digital health systems can be forecasted using predictive algorithms. According to Bansal (2024), state-of-the-art health systems built on the Internet of Things (IoT) have the potential to enable healthcare providers to deliver timely, accurate information on the correct patients. In order to establish a cost-efficient system, the OESP (Optimal Edge Server Placement) algorithm effectively chooses the optimum sites to put edge servers; by Jasim 2023, the algorithm had improved by more than 80% [18]. Wu, 2023 [19] argues that innovative health technology has the potential to improve health policy communication while also giving stakeholders and lawmakers new perspectives. As seen with the newest chips from smartphone makers, bringing computing closer to the user in wearable sensors entails constructing edge devices or servers that can execute trillions of operations per second while requiring remarkably little power (Gusev, 2022 [20]). Emam, 2019 [21] presents a multi-objective optimisation framework that allows an edge node to choose the best radio access technology (RAT) and dynamically alter compression parameters in order to balance latency, distortion, energy consumption, and performance; it is now feasible to reduce delays with this alternative approach. To support a proof-of-concept application towards delivering an e-healthcare scenario, Ray, 2021 [22] amalgamates the Internet of Things (IoT) with edge computing. Despite its benefits, edge computing is not without problems; for example, according to Hartmann (2022) [23], advanced privacy and data-reduction techniques are necessary to achieve the same level of performance as cloud-based solutions while using less computational complexity. Only engaged, energised, and empowered providers with the necessary technology and management systems can produce exceptional patient outcomes (Pronovost, 2023 [24]). The reboot process described by Ghani (2023) [25] could improve healthcare service provision through trusted next-generation health units, thanks to the ever-evolving trusted technology in the health sector.

This paper is organised as follows. Section 1 gives a general introduction. Section 2 presents the proposed work on AI with edge computing for health data, including the machine-learning architecture of the health data process. Section 3 describes the RNN encoder and decoder architecture. Section 4 presents the experimental results. Section 5 concludes the paper, and the references follow.

2. Proposed Work

Improving patient outcomes, and potentially saving lives, through speedier medical data transmission is the primary goal of the proposed effort, which centres on healthcare data and the usage of edge computing. By making healthcare data more accurate, particularly through RNN encoding and decoding for sequence-to-sequence prediction, edge computing improves patient outcomes and could potentially save lives. When applied to medical records, linear regression is a machine learning technique for illness prediction.

2.1 Enhance the Scheduling of Patients

In order to ensure appropriate appointment scheduling in clinics, the complicated process of patient scheduling takes into account aspects such as broad inquiries, particular symptoms, preexisting relationships, and the degree of unfamiliarity with the healthcare system. Since erroneous scheduling can happen before patient arrival or at the last minute, properly scheduling appointments can boost visit volume while decreasing last-minute cancellations. Utilizing decision trees in the EMR, medical practices can improve patient scheduling by referencing specific patient data in the background, ensuring the appropriate provider is seen at the designated time, reducing barriers to access, enhancing patient satisfaction, and minimizing revenue loss. Medical practices can build decision trees into their clinic procedures with the use of electronic medical record (EMR) cloud software, which records patient data. To improve scheduling accuracy and decrease no-show and cancellation rates, decision trees use EMR scheduling tools to produce clinic appointments without departmental restrictions. They do this by picking the proper visit type, provider, and location.
By compressing and decompressing medical imaging and video footage from MRIs and CT scans, AI technology merged with edge computing accelerates compute-intensive tasks on edge or cloud servers. By utilising edge computing and analytics, healthcare providers can gain useful insights from health data, which in turn improves patient outcomes and adds financial and operational value. Healthcare organisations are quickly realising the benefits of edge computing, which include automated care delivery, remote patient monitoring, and the utilisation of AI systems to improve the speed and accuracy of diagnoses. The use of computational power closer to medical data sources, allowing for remote monitoring, automated care delivery, and AI-enhanced diagnosis speed, is improving patient outcomes in healthcare. Improved patient monitoring, quicker diagnoses, and better clinical decision support (CDS) are just a few ways in which healthcare systems benefit from AI and analytics. These technologies also alleviate the strain on information technology (IT) infrastructures and cut expenses. By utilising computer vision and deep learning inference methods, an AI-enabled imaging system is able to triple the accuracy of chest X-ray pneumothorax identification. As more and more healthcare providers use mobile and point-of-care devices to enhance patient care and extract useful data, edge computing and analytics are becoming more popular. Using edge servers to meet data-localization and privacy constraints, real-time imaging and analytics driven by edge AI can improve clinician support and triage. The use of computing at the network's periphery speeds up data transfers and decreases latency, since there is less physical distance between data sources and processors.

2.2 Machine Learning Architecture of Healthcare Data

Applying machine learning to the health data process: a machine learning model's architecture is its road map to development and deployment. It lays out the steps for processing data, training and evaluating models, and making predictions. Here is a simplified explanation of the eight main parts of a machine learning architecture, whose models learn to recognise patterns in data by analysing huge datasets of instances. First, there is task orchestration, which is in charge of the data and job flows in the machine learning pipeline. Task orchestration ensures that all tasks are executed in the correct sequence and optimizes resource utilization.

Fig. 32.1 Health data process of machine learning architecture

Second, developers and researchers use training health data to develop and train machine learning models for specific tasks. The model's third use case is health data prediction, which involves taking a learned model and applying it to new data; for example, a model that has been trained to categorise photos or text can be used to guess what an object is in a new picture or piece of text. The feature engineering pipeline is responsible for getting the health data ready to train the model. It entails cleaning up data so that models can make sense of it and use it for prediction purposes.

Interaction: This part specifies the ways in which models communicate with other systems and people. One use case is a model in a web app that takes user input and uses it to make predictions.

Assessment: This part finds ways to enhance machine learning models by evaluating their performance. To determine how successfully models generalise to new patients' health data, they are tested on held-out test data.

Collecting Data: The data that machine learning models are trained on makes all the difference. This component ensures the collection and retention of high-quality patient data to train and evaluate models.

Generating Data: Gathering real-world medical data for training models might be challenging or costly in some situations. In this part, synthetic data from virtual patients is modelled after real-life hospital records.

Fig. 32.2 Health data process using RNN encoder-decoder architecture



3. RNN Encoder and Decoder Architecture

As a kind of memory, recurrent neural networks (RNNs) take text sequences as inputs or outputs. The encoder-decoder architecture was the de facto standard for machine translation prior to the advent of self-attention. A pair of recurrent neural networks, the RNN encoder-decoder converts a source sequence of varying length into a fixed-length vector and back again. The RNN encoder-decoder architecture is quite efficient when it comes to modelling sequences with other sequences: it takes a vector representation of an input sequence and iteratively decodes the output sequence from it. This design brought attention mechanisms into sequence modelling and laid the groundwork for LLMs. A stack of recurrent neural network (RNN) layers helps the encoder capture the context and temporal dependencies of sequences; the hidden state of the most recent RNN time step summarises the input. The decoder retrieves the original signal by reversing the process the encoder used to transform the input signal into a coded representation: it takes an input sequence and uses the encoder's contextual representation to create an output sequence.

4. Experimental Result

The encoder-decoder approach for recurrent neural networks is a well-liked neural machine translation method that can beat more established statistical machine translation methods. The experiment explores the use of edge computing with RNN encoding and decoding on healthcare data, reaching 98% accuracy, to improve patient outcomes and potentially save lives.

Fig. 32.3 RNNs on edge computing are being utilized to minimize loss value on health data

The graph displays 1459 records processed with RNNs on edge computing to reduce the loss value on health data, achieving an optimal accuracy of 98% based on loss-value prediction.

The simple experiment uses linear regression in machine learning to predict diseases from healthcare data, identifying relationships between variables and training basic models. The line of best fit has the form y = w_0 + w_1 x, where x is the input or independent variable, w_1 is the slope, or steepness, of the line, w_0 is the y-intercept, and y is the output or dependent variable.

Fig. 32.4 Edge computing in healthcare data accuracy

The goal of simple linear regression is to identify the values of w_0 and w_1 that will generate the most accurate y value when given an x. This equation, also known as the model, can also be evaluated in machine learning terms:

w_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}   (1)

w_0 = \bar{y} - w_1 \bar{x}

For example, given the data in the table below, the values of w_0 and w_1 are calculated as follows:

w_1 = 0.581
w_0 = 160.92 − 0.581 · x̄ = 130.35
y = 0.581x + 130.35

Suppose x = 51; then y = 0.581 × 51 + 130.35 = 159.981 ≈ 160 (the resulting accuracy).
Table 32.1 Blood pressure of pregnant women of age > 45

SNo   Age   Blood Pressure (B.P.)
1     56    160
2     46    161
3     72    160
4     45    161
5     63    160
6     47    161
7     55    162
8     49    160
9     48    161
10    50    162
11    68    161
12    60    161

Hence, this calculation can be extended to a larger number of parameters, like diabetes, thyroid and other medical disorders during pregnancy, and can predict the results very accurately; this is known as multiple regression. It is very helpful for patients to get the correct treatment at the correct time, saving the lives of both mother and child.

Fig. 32.5 Linear regression line for Age vs Blood Pressure

5. Conclusion

An emerging concept in the Internet of Things (IoT), edge computing allows real-time data processing to be faster, more efficient, and cheaper by positioning networks close to consumers. Improved patient diagnosis, treatment, satisfaction, remote monitoring, and staff compliance are some of the benefits of this technology, which also helps monitor patients' vital signs and track prescription adherence. In order to forecast healthcare data, the research employed an RNN encoder and decoder architecture with the goal of locating the line that best fits a given set of points. If regression is used to find the line, equal numbers of points will fall on either side of it.

References

1. Xu H, Huang W, Zhou Y, Yang D, Li M, Han Z: Edge computing resource allocation for unmanned aerial vehicle assisted mobile network with blockchain applications. IEEE Transactions on Wireless Communications, 20(5): 3107-21, 2021.
2. Chen S, Wen H, Wu J, Lei W, Hou W, Liu W, Xu A, Jiang Y. Internet of things based smart grids supported by intelligent edge computing. IEEE Access. 2019 Jun 3; 7: 74089-102.
3. Li J, Cai J, Khan F, Rehman AU, Balasubramaniam V, Sun J, Venu P. A secured framework for SDN-based edge computing in IoT-enabled healthcare system. IEEE Access. 2020 Jul 23; 8: 135479-90.
4. Ali O, Abdelbaki W, Shrestha A, Elbasi E, Alryalat MA, Dwivedi YK. A systematic literature review of artificial intelligence in the healthcare sector: Benefits, challenges, methodologies, and functionalities. Journal of Innovation & Knowledge. 2023 Jan 1; 8(1): 100333.
5. Sun L, Jiang X, Ren H, Guo Y. Edge-cloud computing and artificial intelligence in internet of medical things: architecture, technology and application. IEEE Access. 2020 May 26; 8: 101079-92.
6. Gupta PM. Integration of edge and fog computing in IoT-based healthcare applications: A review. Journal of Positive School Psychology. 2023 Jan 15: 1940-57.
7. Jia Z, Chen J, Xu X, Kheir J, Hu J, Xiao H, Peng S, Hu XS, Chen D, Shi Y. The importance of resource awareness in artificial intelligence for healthcare. Nature Machine Intelligence. 2023 Jun 12: 1-2.
8. Lin H, Garg S, Hu J, Wang X, Piran MJ, Hossain MS. Privacy-enhanced data fusion for COVID-19 applications in intelligent Internet of medical Things. IEEE Internet of Things Journal. 2020 Oct 22; 8(21): 15683-93.
9. Wu F, Qiu C, Wu T, Yuce MR. Edge-based hybrid system implementation for long-range safety and healthcare IoT applications. IEEE Internet of Things Journal. 2021 Jan 11; 8(12): 9970-80.
10. Saeidimesineh R, Adibi P, Karshenas H, Darvishy A. Parallel encoder-decoder framework for image captioning. Knowledge-Based Systems. 2023 Oct 11: 111056.
11. Sreelakshmi S, Malu G, Sherly E, Mathew R. M-Net: An encoder-decoder architecture for medical image analysis using ensemble learning. Results in Engineering. 2023 Mar 1; 17: 100927.
12. Dash S, Shakyawar SK, Sharma M, Kaushik S. Big data in healthcare: management, analysis and future prospects. Journal of Big Data. 2019 Dec; 6(1): 1-25.
13. Jimma BL. Artificial intelligence in healthcare: A bibliometric analysis. Telematics and Informatics Reports. 2023 Jan 9: 100041.
14. Rahman A, Hossain MS, Muhammad G, Kundu D, Debnath T, Rahman M, Khan MS, Tiwari P, Band SS. Federated learning-based AI approaches in smart healthcare: concepts, taxonomies, challenges and open issues. Cluster Computing. 2023 Aug; 26(4): 2271-311.
15. Rehman MU, Panday A. Review on artificial intelligence in healthcare. 2023 Aug; 26(4): 2371-312.
16. Radanliev, Petar, and David De Roure. Advancing the cybersecurity of the healthcare system with self-optimising and self-adaptative artificial intelligence (part 2). Health and Technology 12.5 (2022): 923-929.
17. Bansal, Urvashi. Power of IoT in smart healthcare systems. In: Applications of Optimization and Machine Learning in Image Processing and IoT. Chapman and Hall/CRC, 2024. 79-91.
18. Jasim, Ahmed M., and Hamed Al-Raweshidy. Optimal intelligent edge servers placement in the healthcare field. IET Networks, 2023.
19. Wu, Qi, Beian Chen, and Jianping Zhu. Insights from COVID-19: Reflecting on the promotion of long-term health policies in China. International Journal of Environmental Research and Public Health 20.4 (2023): 2889.
20. Gusev, Marjan. AI cardiologist at the edge: A use case of a dew computing heart monitoring solution. In: Artificial Intelligence and Machine Learning for EDGE Computing. Academic Press, 2022. 469-477.
21. Emam, Ahmed, et al. EdgeHealth: An energy-efficient edge-based remote mHealth monitoring system. 2019 IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 2019.
22. Ray, Partha Pratim, Dinesh Dash, and Debashis De. Intelligent internet of things enabled edge system for smart healthcare. National Academy Science Letters 44 (2021): 325-330.
23. Hartmann, Morghan, Umair Sajid Hashmi, and Ali Imran. Edge computing in smart health care systems: Review, challenges, and research directions. Transactions on Emerging Telecommunications Technologies (2022): e3710.
24. Pronovost, Peter J., and Robert K. Lord. Could modernizing health care technology be a cure for provider burnout?. American Journal of Medical Quality 38.5 (2023): 264-266.
25. Ghani, Norjihan Abdul, et al. Methodical evaluation of healthcare intelligence for human life disease detection. Malaysian Journal of Computer Science 36.3 (2023): 208-222.

Note: All the figures and table in this chapter were designed by the author.

Big Image: Large-Scale Skin Disease Image Classification in Medical Imaging and Healthcare Using CNN and Transformers    33

K. Satyanarayana Raju1, K. Chandra Shekar2, K. Laxmipathi Raju3,


M. Krishna Satya Varma4, P. Subba Raju5
Assistant Professor, Dept of IT, S.R.K.R.Engineering College, Bhimavaram
Sumitra Srinivas Kotipalli6
Assistant Professor, Dept of IT, Faculty at Middlesex University, Dubai

Abstract: Skin rashes are unusual inflammatory skin changes that may modify the skin's color, texture, or appearance; they can show up in one spot on the body or throughout it. Image processing is the process of transforming pictures into new forms such as pictures, movies, texts, or other parts of the original images. Most image processing methods produce a large quantity of data as their final output, known as "big data". The paper encourages the use of a pure Transformer, i.e., the Vision Transformer (ViT), applied directly to picture patches for image classification applications. ViT demonstrates competitive performance against recent benchmarks across various computer vision applications, including image classification, object recognition, and semantic image segmentation. This study aims to develop a CNN for image classification using image patches from a larger dataset and to explore the relationship between CNN and ViT in picture classification. In the experimental results, the Vision Transformer (ViT), pre-trained on a vast amount of data, surpasses state-of-the-art convolutional neural network models in a number of evaluations while using fewer compute resources during training.
Keywords: Big data, Convolutional neural network, Semantic image segmentation, Vision transformers etc.

1. Introduction

In machine learning, a transformer is a deep learning model that uses attention processes to differentially weigh the value of each element of the incoming sequence of data by Bhadula, 2019[2]. Transformers are constructed from a number of levels of self-attention. They are mostly utilized in the computer vision (CV) and natural language processing (NLP) branches of artificial intelligence (AI). Innovations in machine learning, like the most recent developments in computer vision, which meet state-of-the-art accuracy with enhanced parameter efficiency, provide significant promise for a generic learning strategy that can be utilized for a number of data modalities.

Technology for image processing is still developing. Agriculture, the textile and transportation fields, among other fields, have effectively used image modification, coding, compression, segmentation, and other technologies by Al Abbadi, 2010[1]. A new field focuses on exploring big data-based image processing technology and developing models to enhance the efficiency and quality of image processing by Damilola, 2013[3]. Traditional image processing methods, however, are unable to handle the large number of image samples available today. Big data-based image-processing models offer benefits like reproducibility, precision, applicability, adaptability, and potential for information compression, according to recent research.

1ksnr539@gmail.com, 2sekharonemay@gmail.com, 3laxmipathi4u@gmail.com, 4krishnasatyavarma@gmail.com, 5raju.pericherla74@gmail.com, 6ksumisri@gmail.com

DOI: 10.1201/9781003529231-33

Models for big data analysis have only recently been developed. Even though image processing is an established technology, creating image-processing models based on big data analysis can present a number of technical challenges. Image processing technologies require visualization analysis, semantic expression, and large-sample storage, along with complex algorithms for feature extraction, recognition, and prediction, and the corresponding time and memory. In addition, the slow rate of model identification is a significant issue. With the advancement of information technology, big data applications in image processing are expanding. Big data analysis-based image processing models offer broad application possibilities across all image-processing disciplines by analyzing operating principles, technologies, and benefits.

2. Related Methods

The Big Picture: large-scale skin disease classification with CNNs explores the use of CNNs and picture classification in medical imaging and healthcare.

2.1 Convolutional Neural Network

The process of classifying incoming photos entails giving those photos labels or categories. In order to predict the class of unseen images, a model is trained on labeled image data in a supervised learning task by Jainesh Rathod, 2018[4]. Since CNNs can recognize objects in photos accurately by learning hierarchical elements like edges, textures, and forms, they are frequently employed for image classification. Because they can automatically extract useful spatial characteristics from photos, CNNs excel at this task by J Sudha, 2017[5]. The procedure's several layers are listed below:

Layer of Input: The CNN's input layer receives the raw picture data as input. Typically, matrices of pixel values are used to represent the images. The height, breadth, and color channels of the input images are reflected in the input layer's dimensions.

Convolutional Layers: Convolutional layers extract features from input images using kernels (filters), and recognize shapes, edges, textures, and other visual components by convolving the images with these filters.

Pooling Layer: The spatial dimensions of the feature maps produced by convolutional layers are reduced by the addition of pooling layers. They save the most crucial details while removing the rest using down-sampling techniques (such as max pooling). This helps in both achieving translation invariance and reducing computational complexity.

Fully Connected Layers: One or more fully connected layers are connected to the output of the last pooling layer after it has been flattened. These layers classify the retrieved characteristics in the same way that conventional neural network layers do. Fully connected layers notice intricate correlations between features and produce forecasts or class probabilities.

Output Layer: The output layer, the final layer of a CNN, provides classification probabilities for each class, indicating the likelihood of the input image belonging to a specific class by Połap, D, 2018[8].

Fig. 33.1 Skin disease in medical image using CNN

3. Proposed Work

Convolutional neural networks (CNN) have significantly facilitated recent advancements in deep learning by Shanthi, 2020[11]. Data scientists are particularly interested in computer vision, and CNNs have broken the mold to become the most advanced computer vision method by Parvathaneni Naga Srinivasu, 2021[9]. The ViT is a transformer utilized for vision-related tasks like image categorization, offering superior accuracy over convolution and enabling parallel input processing over Seq2Seq, Vec2Seq, and Seq2Vec tasks.


The study investigates the use of natural language processing's self-attention mechanism in Big Image classification using a new neural network by Simonyan, 2015[12], highlighting potential limitations and errors. This work's primary contributions are as follows:
• Build a CNN with picture patches from a bigger dataset in order to categorise images.
• Understand how CNN and ViT handle the picture classification task.

3.1 Vision Transformer (ViT)

The ViT is a transformer used for vision-related tasks like image categorization, providing greater accuracy than convolution. It takes over Seq2Seq, Vec2Seq, and Seq2Vec tasks, enabling parallel input processing by Ki V, 2008[6]. The study explores the use of natural language processing's self-attention mechanism in the classification of images using a new neural network, highlighting its potential limitations and errors by Satishkumar Moparthi, 2021[10]. Picture categorization in computer vision is traditionally done using convolutional neural networks (CNNs) by Srujan S A, 2022[13]. However, CNNs have drawbacks like spatial distortion and data requirements. Vision transformers, which view images as patches and use self-attention to recognize interdependencies, offer a new approach that can better apply information from less input by Md. Sazzadul Islam Prottasha, 2023[7].

Vision transformers require extensive data to function effectively, making them sensitive to patch sequence and position. To improve accuracy, they need a wide range of images by Syed Inthiyaz, 2023[14]. Pre-trained models trained on large amounts of data can reduce training time and enhance performance.

Vision transformers can overfit, especially when the target dataset is small or dissimilar. Regularization methods like dropout, weight decay, and stochastic depth can help avoid overfitting by reducing parameter co-adaptation, penalizing weights, and strengthening network resistance to noise.

Vision transformers take a lot of memory and are computationally demanding. Smaller patch sizes, lower-resolution images, and fewer layers or attention heads can all be used as techniques to speed up training and inference. These techniques cut down on calculation costs, information processing, and sequence length.

The final step in evaluating vision transformer performance involves assessing precision, recall, F1-score, and accuracy. Precision measures forecast accuracy by comparing true positives to all predicted positives; recall compares true positives to all actual positives; and the F1-score balances these two measures.

3.2 Proposed Methodology

Transformers have been the conventional model in NLP due to their efficiency and adaptability in computation. Computer vision is still dominated by convolutional neural network (CNN) architectures; however, some researchers have tried combining CNNs with self-attention. The authors experimented with merely applying a standard Transformer to pictures and found that the models showed modest accuracy compared to ResNet-like architectures when trained on mid-sized datasets by Z. Ma, 2016[15]. However, when trained on larger datasets, the Vision Transformer (ViT) generated excellent results and came very near to, or outperformed, the state of the art on a number of picture recognition criteria.

The model transforms 2D medical images into flattened patches, which are then mapped to a latent vector using a linear projection. The image representation is pre-trained or fine-tuned using a classification head, and the Transformer encoder maintains positional information. CNNs excel in image processing, classification, object identification, and segmentation due to their ability to extract hierarchical feature information and learn from vast picture data. Vision Transformers excel in scenarios requiring global interdependence and contextual awareness, while CNNs handle large datasets efficiently and, given their lower computational cost, suit real-time and resource-constrained applications; vision transformers typically require more training data.

Fig. 33.2 Skin disease in medical image using CNN


Fig. 33.3 The ViT model divides skin disease medical images into patches, linearly embeds them, adds position embeddings, and uses a transformer encoder for classification

Algorithm: Numerous vision transformer models have been proposed. The overall framework of the vision transformer architecture is composed of the subsequent actions:
1. Divide a medical image into patches of a certain size.
2. Flatten the skin patches of the medical images.
3. Transform these flattened skin medical-imaging patches into lower-dimensional linear embeddings.
4. Add positional embeddings.
5. Give the sequence to a standard transformer encoder as an input.
6. Pre-train the ViT model with image labels, fully supervised on a large dataset.
7. Fine-tune the model on the downstream dataset for image classification.

4. Experimental Result

The results of the investigation relate to testing for skin diseases in the malignant and benign groups. The strategy that will be used is as follows.

4.1 CNN Approach for Skin Disease Medical Image

Three layers of 2D convolutions with a kernel size of 3, a stride of 2, and a max-pooling layer of 2 make up the CNN model for this picture classifier. There are two fully connected layers with 10 nodes each after the convolution layers. An example of this structure in code is shown below.
Training was carried out for ten epochs on a Tesla T4 (g4dn-xlarge) GPU system; the outcomes of the training loops for each epoch are shown in Fig. 33.4.

Fig. 33.4 Outcomes of training loops

4.2 Transforming Vision in Medical Images of Skin Diseases

The proportions of the Vision Transformer architecture are adaptable and may be changed to meet particular needs; the architecture is nevertheless substantial for a skin-image collection of this scale. Each parameter in the vision transformer plays a key role and is described here:
• image_size=224: The required width and height of the input photos to the model are specified by this option. In this case, the photos should be 224×224 pixels in size.
• patch_size=32: This option specifies the dimensions (width and height) of each patch used to break the photos into smaller portions. Each patch is 32×32 pixels in this instance.
• num_classes=2: The number of classes used in the classification operation is indicated by this parameter. The model in this illustration is built to divide inputs into two categories, benign and malignant.
• dim=128: It describes the dimensionality of the model's embedding vectors. Each picture patch's representation is captured by the embeddings.
• depth=12: The Vision Transformer model's (encoder's) depth, or number of layers, is defined by this parameter. A deeper model enables the extraction of more intricate features.
• heads=8: The number of attention heads in the model's self-attention mechanism is represented by this parameter.
• mlp_dim=1024: It details the dimensionality of the model's hidden Multi-Layer Perceptron (MLP) layers. After self-attention, the MLP is in charge of transforming the token representations.
• dropout=0.1: The dropout rate, a regularization method used to avoid overfitting, is controlled by this parameter. During training, a certain percentage of input units are set to 0.
• emb_dropout=0.1: It describes the dropout rate as it relates directly to token embeddings. This dropout prevents dependence on particular tokens being weighted too heavily during training.
The Tesla T4 (g4dn-xlarge) GPU machine was used to train the vision transformer for the classification job over the course of 20 training epochs. Because the training loss converged gradually, training was carried out across 20 epochs rather than the 10 epochs used for the CNN. The outcomes of the training loops for each epoch are shown in Fig. 33.5.

In 10 epochs, the CNN technique correctly predicted skin illness 75% of the time, while the vision transformer model correctly predicted skin disease 69% of the time and required much more time to train.

5. Conclusion

CNN and Vision Transformer models differ in size, memory requirements, accuracy, and performance. CNN models are compact and efficient, suitable for limited resources and image processing tasks. Vision Transformers capture global dependencies and contextual information in skin medical pictures, but require more RAM and larger model sizes. The choice between the models depends on task details, resource availability, dataset scope, and the trade-off between complexity, accuracy, and performance. Additional improvements are expected in computer vision.
Fig. 33.5 Outcomes of training loops for 20 epochs

References

1. Al Abbadi, N.K., Dahir, N.S., AL-Dhalimi, M.A., Restom, H.: Psoriasis detection using skin color and texture features. Journal of Computer Science 6(6), 648–652, 2010.
2. Bhadula, S., Sharma, S., Juyal, P., Kulshrestha: Machine-learning algorithms based skin disease detection. IJITEE 9(2), 4044–4049, 2019.
3. Damilola A. Okuboyejo, Oludayo O. Olugbara and Solomon A. Odunaike: Automating skin disease diagnosis using image classification. Proceedings of the World Congress on Engineering and Computer Science 2013, Vol. II, WCECS 2013, 23–25 October 2013.
4. Jainesh Rathod, Vishal Waghmode, Aniruddh Sodha, Prasenjit Bhavathankar: Diagnosis of skin diseases using Convolutional Neural Networks. DOI: 10.1109/ICECA.2018.8474593, IEEE Xplore, 30 September 2018.
5. J Sudha, M Aramudhan and S Kannan: Development of a mathematical model for skin disease prediction using response surface methodology. Biomedical Research, pp. S355–S359, 2017.
6. Ki V., Rotstein C.: Bacterial skin and soft tissue infections in adults: A review of their epidemiology, pathogenesis, diagnosis, treatment and site of care. Canadian Journal of Infectious Diseases & Medical Microbiology 19, 173–184, 2008.
7. Md. Sazzadul Islam Prottasha, Sanjan Mahjabin Farin, Md. Bulbul Ahmed, Md. Zihadur Rahman, Kabir Hossain & M. Shamim Kaiser: Deep learning based skin disease detection using convolutional neural networks. Lecture Notes in Electrical Engineering, pp. 551–564, 2023. https://link.springer.com/bookseries/7818.
8. Połap, D., Winnicka, A., Serwata, K., Kęsik, K., Woźniak, M.: An intelligent system for monitoring skin diseases. Sensors 18, 2552, 2018.
9. Parvathaneni Naga Srinivasu, Jalluri Gnana SivaSai, Muhammad Fazal Ijaz, Akash Kumar Bhoi, Wonjoon Kim, James Jin Kang: Classification of skin disease using deep learning neural networks with MobileNet V2 and LSTM. Sensors (Basel) 21(8): 2852, published online 18 April 2021. doi: 10.3390/s21082852.
10. Satishkumar Moparthi: An image is worth 16×16 words: Transformers for image recognition at scale (Vision Transformers). Analytics Vidhya, published 10 March 2021, last modified 11 March 2021.
11. Shanthi, T., Sabeenian, R.S., Anand, R.: Automatic diagnosis of skin diseases using convolution neural network. Microprocessors and Microsystems 76, 103074, 2020.
12. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556, 2015.
13. Srujan S A, Chirag M Shetty, Mohammed Adil, Sarang P K, Roopitha C H: Skin disease detection using convolutional neural network. International Research Journal of Engineering and Technology (IRJET), Volume 09, Issue 07, July 2022. www.irjet.net.
14. Syed Inthiyaz, Baraa Riyadh Altahan, Sk Hasane Ahammad, V Rajesh, Ruth Ramya Kalangi, Lassaad K. Smirani, Md. Amzad Hossain, Ahmed Nabih Zaki Rashed: Skin disease detection using deep learning. Advances in Engineering Software, Volume 175, 103361, January 2023. https://doi.org/10.1016/j.advengsoft.2022.103361.
15. Z. Ma and J. M. R. S. Tavares: A novel approach to segment skin lesions in dermoscopic images based on a deformable model. IEEE Journal of Biomedical and Health Informatics, vol. 20, no. 2, pp. 615–623, March 2016.

Note: All the figures in this chapter were designed by the author.

AI Driven Load Distribution for Federated Network on Electronic Health Records    34

S. Suryanarayanaraju1
Research Scholar, Department of Computer Science & Engineering,
GIET University, Odisha, Gunupur
M. Chandra Naik2
Professor, Department of Computer Science & Engineering,
GIET University, Gunupur
R. N. V Jagan Mohan3
Associate Professor, Department of Computer Science & Engineering,
SRKR Engineering College (A), Chinaamiram

Abstract: Enormous volumes of Electronic Health Records have become the norm for representing health data in the digital world. As the volume of records continues to grow, distributing tasks based on factors like workload, computational capacity, and historical performance will become a critical challenge. High processing loads can strain resources and lead to delays or inefficiencies in accessing and analyzing crucial health data. This research proposes to address this issue using Artificial Intelligence (AI) for load distribution in federated networks, which enhances system efficiency and responsiveness.
Keywords: Artificial intelligence, Electronic health records, Federated network, Load balance etc.

1. Introduction

Federated learning (FL) is an innovative machine learning technique that trains algorithms through independent sessions, each using its own dataset, addressing data privacy, access rights, security, and heterogeneity. It is used in industries like telecommunications, defense, the Internet of Things, and pharmaceuticals. However, the choice between federated learning and pooled-data learning remains open. Federated learning removes the need for direct exchange of raw data samples by teaching machine learning (ML) algorithms on many geographically distributed datasets. The process entails training individual local models on their respective local data samples, with the subsequent exchange of model parameters between nodes to collectively generate a global model. Federated learning differs from distributed learning in its assumptions on local dataset properties: it focuses on heterogeneous datasets and may have unreliable clients due to less powerful communication media and battery-powered systems, whereas distributed learning uses datacenters with powerful computational capabilities and fast networks.

The objective function in federated learning is stated mathematically in (1):

f(x_1, x_2, \ldots, x_K) = \frac{1}{K} \sum_{i=1}^{K} f_i(x_i)   (1)

Here, K stands for the total number of devices or nodes taking part in federated learning, the variables x_i correspond to the weights of the model as observed by node i, and f_i signifies the local objective function of node i.
assumptions on local dataset properties, as it focuses on
1snraju.saripalle@giet.edu, 2srichandra2007@gmail.com, 3mohanrnvj@gmail.com

DOI: 10.1201/9781003529231-34

Federated learning is crafted to facilitate the training of a unified model by leveraging the local datasets across all nodes. This optimization is carried out with respect to the objective function f(x_1, x_2, ..., x_K). The aim is to attain consensus on the x_i, signifying that x_1, x_2, ..., x_K converge to a shared value x by the conclusion of the training process.

Federated learning involves centralized and decentralized methods. Centralized federated learning uses a central server to arrange algorithms and coordinate nodes, potentially becoming a bottleneck. Decentralized federated learning allows nodes to coordinate themselves, preventing single points of failure. Heterogeneous federated learning focuses on accommodating diverse clients with varying computation and communication capabilities. The heterogeneous federated learning framework is designed to train local models that differ in computational capabilities and to handle dynamic computation and non-independent and identically distributed (non-IID) data complexities. Despite these differences, the framework aims to produce a unified and accurate global inference model through collaborative training across heterogeneous devices.

Similar to a machine learning process, federated learning involves training local models, aggregating local updates into a single global update, and transmitting global model states back to the nodes. It can be centralized or decentralized, with a central server facilitating the aggregation step, as depicted in Fig. 34.1. The process includes initialization, client selection, configuration, reporting, and termination. Asynchronous techniques, such as split learning, have been introduced for training and inference.

Fig. 34.1 Federated iterative learning process

Federated learning setups often have unbalanced local data samples and specific probability distributions of training examples. Non-IID data can cause significant variations in training performance. Among the main types of non-IID data are covariate shift, prior probability shift, concept drift, concept shift, and imbalanced data. Examples include natural language processing datasets with different stroke widths, regional or demographically partitioned datasets, and concepts that share the same label but correspond to different features across distinct nodes. The accuracy loss attributed to non-IID data can be mitigated using advanced data normalization methods, surpassing the limitations of conventional batch normalization.

1.1 Electronic Health Records

The medical data of a patient, preserved digitally by healthcare practitioners, is known as an electronic health record (EHR). EHRs automate accessibility, streamline workflow and quality management, and support evidence-based decision support and outcomes reporting. They offer real-time, patient-centric information and encompass a wealth of information, including diagnoses, medical history, treatment plans, medications, radiology images, allergies, and test results. They are designed to be shared securely with other healthcare providers across various organizations for a more holistic and collaborative healthcare ecosystem.

1.2 Federated Learning Architecture

Federated Learning design involves a system that manages user uploads, feeds, video processing, metadata storage, caching, and search through various components and services. The Federated Learning system design utilizes distributed databases for managing user data, using sharding and replication techniques for global data availability and consistency, as shown in Fig. 34.2. Without sharing raw data, federated learning can enable many enterprises to collaborate to jointly develop a machine learning model. This approach addresses several critical challenges, including privacy, security, access rights, and the heterogeneity of data access across different entities.
• Client Interaction: Federated Learning uses randomized selection to select clients for global model parameters, but heterogeneous data distribution causes client drift and performance degradation.
• Load Balancer: Load balancing is a process where incoming client requests are distributed across multiple servers to prevent overloading. It can be implemented as a hardware or software load balancer, either installed on-premises or managed. The load balancer directs requests to available servers, ensuring effective communication and task fulfilment. It can also help with network caching by routing traffic to cache servers for temporary storage of user requests (a minimal routing sketch appears after this list).
• API Gateways: Small-scale platform users can access data planes as a service, enabling them to determine gateway configuration from API specifications without worrying about deployment, and route requests to microservices.
• Write Operations: The global model parameters are updated by local clients, while the global server collects them. API gateways facilitate write operations to the App Server.

them. API gateways facilitate write operations to the App • Caching Mechanism: Redis or Memcache data caching
Server. reduces latency and database load by calculating cache
• Feed Generation Service: Federated averaging is a behavior based on restrictive settings, supporting various
machine learning approach that prioritizes data privacy cache implementations like Cache-Control headers.
and security by spreading data across multiple servers or • Search Service (Elastic search): Elastic search efficiently
devices without sharing raw data. uses inverted indices for efficient search, enabling rapid
• Read Operations: The federated learning paradigm data analysis and rapid user and content searches with
involves learning nodes training local models, sending near-real-time performance.
parameters to the server, performing weighted averaging, • Blob Storage: Federated Learning is a data-driven
and sending global parameters back to the learning framework that trains a single machine learning models
nodes. on multiple datasets, promoting compliance with
• Metadata Database: Federated learning (Trung Kien regulations and innovation, and storing user-uploaded
Dang et al., 2022), is a decentralized approach to training media.
machine learning models, enhancing data privacy by • CDN (Content Delivery Network): Content caching
using edge devices for local training and user profile and data computing at the wireless network edge is a
storage. promising method for reducing backhaul traffic load by
caching and serving static content with low latency.

Fig. 34.2 Architecture of federated learning



• Video Processing: Video processing services handle tasks like transcoding, filtering, and thumbnail generation, using blob storage. Federated learning trains algorithms through independent sessions using each dataset.
• Notifications: The system notifies users of likes, comments, and interactions.
2. Related Work

Load balancing distributes network traffic across a resource pool for efficient processing of simultaneous requests from millions of users. It serves as an intermediary device between the user and the server group to ensure equitable utilization of all resource servers. Previous studies by different authors have explored these concepts, as outlined below.

(Huankai Chen et al., 2013) focused on the Load Balance Improved Min-Min (LBIMM) scheduling approach, which is utilized to improve load balancing in cloud computing. In order to reduce makespan and enhance resource usage, this method extends the features of the Min-Min algorithm. In the proposed PA-LBIMM (Promised guarantees, User-priority-based Load Balanced Min-Min), the authors not only incorporate the principles of load balancing but also introduce additional features such as promised guarantees and user priority. These augmentations are designed to ensure a more comprehensive satisfaction of user demands. Notably, the authors' focus does not extend to considering task deadlines and the intricacies associated with high heterogeneity in interconnection networks.

(Chien et al., 2016) worked on a load balancing algorithm that relies on estimating the completion time of services. The aim of load balancing is to enhance throughput, fine-tune resource utilization, avert the overloading of individual resources, and minimize response time. While numerous load-balancing algorithms have been introduced, their performance remains a subject of ongoing improvement. In response, the authors introduced an algorithm centred on estimating the completion time of services. According to simulation results, this algorithm demonstrates improvements in both latency and processing time. While load balancing contributes to the effective utilization of computational resources and improved efficiency, it also introduces challenges, particularly in the realm of energy consumption.

(Kousik et al., 2013) suggested a strategy based on the formulation of a Genetic Algorithm (GA) for load balancing, treating the task as an optimization problem. A load balancer is used to adapt the strategy dynamically when environmental conditions and task types change. The results of the algorithm have surpassed existing approaches like Round Robin (RR), First Come First Serve (FCFS), the local search algorithm and Stochastic Hill Climbing (SHC). Future work will concentrate on exploring variations in crossover and selection strategies to attain better efficiency with fine-tuned results.

(Shikha Garg et al., 2015) improved the use of virtual machines in cloud computing, a problem that synchronous restricted load balancing attempts to solve. Load balancing is one of the most difficult problems in cloud computing. Meeting the substantial demands of users necessitates a distributed solution, as individually assigning one or more idle services is often impractical or cost-prohibitive, and allocating servers to clients on an individual basis presents challenges. Load balancing algorithms can be enhanced through alternative approaches, such as incorporating soft computing, to efficiently utilize virtual machines and simultaneously minimize response times.

Load balancing in cloud computing is explored by (V RaviTeja Kanakala et al., 2015), which surveys the different algorithms that help distribute load among nodes efficiently and delves into the parameters considered when determining the optimal algorithm for load balancing. The existing challenge lies in the limited performance of current load-balancing algorithms across all essential areas. The authors suggest refining algorithms, including Exponential Smooth Forecast and Load Balance Max-Min, to improve cloud performance and align with load balancing demands.

A deep learning model for load balancing was developed by (Zhang et al., 2021) to handle skewness in data by replacing the hash function with deep learning mechanisms. However, cascading overflow can still be addressed for further enhancement: this occurs when one server reaches full capacity, causing subsequent servers to fill up more rapidly and resulting in an overflow effect.

The deep learning model suggested by (Kaur et al., 2020) improves throughput, resource utilization, latency, cost and response time in cloud computing. For workflow execution in cloud environments, they presented the Deep Learning-based Dynamic VM Provisioning and Load Balancing (DLD-PLB) framework. The suggested model is superior to the prior model in that it makes use of deep learning for load balancing.

(Rathod et al., 2020) discussed how Artificial Intelligence (AI) techniques are widely used in cloud computing due to their numerous benefits. A proper load balancing mechanism is crucial for user and service provider satisfaction, and this proposal gives detailed knowledge of how AI techniques can improve load balancing in cloud environments. AI techniques integrated with techniques like hill climbing, fuzzy logic, and the honey bee mechanism will increase resource utilization and service quality.
214 Algorithms in Advanced Artificial Intelligence

and service quality. It is important to decide when to combine these techniques.

(Wilson et al., 2019) discussed the challenges of efficient resource utilization in cloud computing networks. Application performance is degraded by overloaded virtual machines, and resource utilization inefficiency is created by underloaded VMs. On Software-Defined Networking (SDN), a mechanism called the Dynamic Agent-Based Load Balancing algorithm is applied, which uses a Back Propagation Artificial Neural Network (BPANN) that migrates virtual machines efficiently in data centres. Overall, network efficiency is improved and the approach performs well on data migration. A comparison between the Heuristic Algorithm (HA) and Multi-Path TCP is performed using the migration process results. This algorithm is suited for VM migration and can utilize runtime configuration management as an extension. It is also useful for data offloading in mobile cloud computing.

Various networks are deployed using the platform called Software-Defined Networking (SDN), which is discussed in (Hazim et al., 2023). Load imbalance is an issue in SDN during traffic distribution. To enhance the effectiveness of SDN, various SDN load-balancing mechanisms have been developed. The work mainly focuses on analyzing the architecture of SDN. Summarizing metrics and categorization of AI-based load-balancing methods are used in measuring the efficiency of the techniques. A detailed survey is given on various load-balancing mechanisms that utilize AI for improving load distribution in SDN.

Intelligent load-balancing mechanisms and the need for energy efficiency in healthcare are discussed by (Ibrahim et al., 2023). The load balancing model is based on energy-aware artificial intelligence that utilizes big data analytics (BDA) and Chaotic Horse Ride Optimization (CHROA) for IoT environments that are cloud-enabled. CHROA uses AI models for balancing the load and optimizing energy resources, and is evaluated using various metrics. Overall, the article highlights the challenges and contributions to developing efficient solutions in IoT/IoE.

Load-balancing algorithms are compared and summarized by (Singh et al., 2023) for cloud computing in environments like centralized, static, and dynamic. Popular machine learning models like Random Forest Classifier, Statistical Regression, CNN, AI, and LSTM-RNN are explored. Load balancing improves system performance through reduced time, improved throughput, shorter production time, and power savings. Deep learning models have been replaced by machine learning models that handle big data effectively and do not affect standardization. During load balancing, the score is an important factor.

(Harry et al., 2021) discussed that access to health data is problematic due to legislation. This results in siloed data, hindering the development of clinical decision support tools and health-specific AI. To access various data sets and handle health problems, federated networks are utilized. They discussed the utilization of federated networks in healthcare, and their establishment, operation, and implementation.

The health system relies on centralized agents which share raw data in (Rahman et al., 2023). Combining this with AI and FL can reduce challenges and vulnerabilities. The analysis of FL using AI for healthcare applications addresses problems like privacy, security, scalability, reliability, and confidentiality. It discusses emerging trends like FL, Explainable Artificial Intelligence (XAI), AI, and e-healthcare. It gives suggestions for solving healthcare strategies using FL and AI, and specifies research areas and potential future prospects for managing healthcare systems that use FL and AI. Recent progress has created an interest in FL to integrate AI into networks. The complete study analyzed the progress of security, taxonomies, discussions, benefits of integration, and open issues, and offers future research guidance.

3. Hyperparameter Approach of Federated Learning

Federated learning approaches, such as orchestrator-less distributed networks, aggregate local models, reduce transaction count, and potentially reduce training time and computing costs. The choice of the node network topology allows for optimization of learning by controlling various parameters, including the machine learning model's hyperparameters. Key parameters include the number of federated learning rounds T and the total of K nodes used in the process. The local learning rate, represented by constant C, the batch size B, and the number of iterations for local training before pooling N significantly influence the effectiveness of the learning process. Optimizing machine learning parameters depends on application constraints like computing power, memory, and bandwidth. Stochastic gradient descent and limited node fractions can reduce computing cost and prevent overfitting.

Algorithm Procedure: The index of the C clients is c. The local minibatch size is denoted by D, the local epochs by L, and the learning rate by η.

The server runs:
  Initialize w_0;
  for each round t = 1, 2, … do
    m ← max(F · C, 1)
    S_t ← (random set of m clients)
    for each client c ∈ S_t in parallel do
      w_c^(t+1) ← ClientUpdate(c, w_t)
    w_(t+1) ← Σ_(c=1)^(C) (n_c / n) · w_c^(t+1)

ClientUpdate(c, w):
  B ← split P_c into batches of size D
  for each local epoch i from 1 to L do
    for each batch b ∈ B do
      w ← w − η∇l(w; b)
  Return w to Server.
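To make the procedure concrete, the following is a minimal NumPy sketch of the server loop, the ClientUpdate step, and the weighted aggregation shown above. The gradient function, client data sets, and parameter names are illustrative assumptions, not the chapter's own code.

```python
# Minimal federated-averaging sketch (illustrative, not the authors' code).
import numpy as np

def client_update(w, data, batch_size, epochs, lr, grad_fn):
    # Local training: w <- w - eta * grad(l(w; b)) over minibatches
    for _ in range(epochs):
        np.random.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            w = w - lr * grad_fn(w, batch)
    return w

def server_round(w_t, clients, fraction, **kw):
    # m <- max(F * C, 1): sample a random subset of clients
    m = max(int(fraction * len(clients)), 1)
    chosen = np.random.choice(len(clients), m, replace=False)
    updates, sizes = [], []
    for c in chosen:
        updates.append(client_update(w_t.copy(), clients[c], **kw))
        sizes.append(len(clients[c]))
    # w_{t+1} <- sum over c of (n_c / n) * w_c^{t+1}
    n = sum(sizes)
    return sum((n_c / n) * w_c for n_c, w_c in zip(sizes, updates))
```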
Through a technique known as secure aggregation, the server can combine the encrypted models from different participants without gaining access to the raw updates. Instead, the server merely decodes the aggregated training results. As a result, the server will never see the training results for a particular device. Federated Learning and Differential Privacy can be coupled for increased security. In addition to training, testing is a crucial difference between federated and traditional machine learning. We should test machine-learning models with data that most closely resembles the inputs the model would experience in use. However, since it lacks access to the training data, the server is unable to test the combined model after it has been updated with input from the clients. As a result, training and testing are done on consumers' devices. Be mindful that distributed testing benefits from testing the updated model on consumers' devices, which is the most important place to test.
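The masking idea behind secure aggregation can be illustrated with a toy example: each pair of clients shares a random mask that cancels out in the sum, so the server sees only masked updates yet still recovers the exact aggregate. This is a sketch only; real protocols add key agreement and dropout handling, which are omitted here.

```python
# Toy pairwise-masking illustration of secure aggregation (assumptions:
# three clients, no dropouts, masks exchanged out of band).
import numpy as np

rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(3)]   # clients' raw updates

# Pairwise masks: client i adds mask_(i,j), client j subtracts it
masks = {(i, j): rng.normal(size=4) for i in range(3) for j in range(i + 1, 3)}
masked = []
for i in range(3):
    m = updates[i].copy()
    for (a, b), pad in masks.items():
        if a == i:
            m += pad
        elif b == i:
            m -= pad
    masked.append(m)

# Every pad appears once with + and once with -, so the server recovers
# the true sum without ever seeing an individual raw update.
print(np.allclose(sum(masked), sum(updates)))  # True
```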
4. Experimental Result

The Electronic Health Record utilizes load balancers to optimize server traffic, speed up response times, and reduce network latency. They evenly distribute load among servers, reroute client requests to servers closer to client locations, and prevent overwork, enhancing application performance. Load balancing is a crucial aspect of parallel and distributed computing, enhancing system efficiency, reliability, resource efficiency, and performance by ensuring balanced workloads. The data set for load distribution in the federated network is shown in Table 34.1, where the Respond time is the quantity estimated from the Delay, ReTX Count, and Leap Count columns, as described below.

Table 34.1 Data set for load distribution in federated network

Time | AppId | Node | Seq. | Type | Delay (ms) | ReTX Count | Leap Count | Respond
1.46006 | 7 | 255 | 226 | Last Delay | 0.066264 | 1 | 4 | —
1.46006 | 7 | 255 | 226 | Full Delay | 344.076 | 2 | 4 | —
1.46506 | 7 | 255 | 227 | Last Delay | 0.066264 | 1 | 4 | —
1.46506 | 7 | 255 | 227 | Full Delay | 344.076 | 2 | 4 | —
1.47006 | 7 | 255 | 228 | Last Delay | 0.066264 | 1 | 4 | —
1.47006 | 7 | 255 | 228 | Full Delay | 344.076 | 2 | 4 | —
1.47506 | 7 | 255 | 229 | Last Delay | 0.075264 | 1 | 4 | —
1.47506 | 7 | 255 | 229 | Full Delay | 344.076 | 2 | 4 | —
1.48006 | 7 | 255 | 268 | Last Delay | 0.076116 | 1 | 4 | —
1.49006 | 7 | 255 | 268 | Full Delay | 0.076116 | 1 | 4 | —
The required regression plane is Y = b0 + b1·x1 + b2·x2 + b3·x3, where the Respond time Y is estimated from the Delay (x1), ReTX Count (x2), and Leap Count (x3). The fitted respond-time coefficients were 612.656062, 1.77701828, −611.30892, and 0, as shown in Fig. 34.3.

Fig. 34.3 Coefficients for calculating respond time
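Fitting such a regression plane can be sketched with scikit-learn as below. The feature rows are taken from Table 34.1, but the Respond-time targets here are hypothetical placeholders: the coefficient values in Fig. 34.3 come from the authors' full data set, which is not reproduced in the chapter.

```python
# Hedged sketch of fitting the regression plane Y = b0 + b1*x1 + b2*x2 + b3*x3.
import numpy as np
from sklearn.linear_model import LinearRegression

# Columns: Delay (ms), ReTX Count, Leap Count (x1, x2, x3), from Table 34.1
X = np.array([[0.066264, 1, 4],
              [344.076,  2, 4],
              [0.075264, 1, 4],
              [344.076,  2, 4]])
y = np.array([0.07, 344.1, 0.08, 344.1])   # hypothetical Respond times

model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)        # b0 and (b1, b2, b3)
print(model.predict([[0.076116, 1, 4]]))    # estimate Respond for a new row
```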
services in cloud computing, In 2016 18th IEEE International
5. Conclusion and Future Work Conference on Advanced Communication Technology
(ICACT), pp. 228–233.
The implementation of Federated Learning for load
distribution in Federated Networks has demonstrated
References

1. Chen, H., Wang, F., Helian, N., Akanmu, G. (2013). User-priority guided Min-Min scheduling algorithm for load balancing in cloud computing, IEEE Natl. Conf. Parall. Computer Technol. (PARCOMPTECH), pp. 1–8.
2. Chien, N.K., Son, N.H., Loc, H.D. (2016). Load balancing algorithm based on estimating finish time of services in cloud computing, 18th IEEE International Conference on Advanced Communication Technology (ICACT), pp. 228–233.
3. Dasgupta, K., Mandal, B., Dutta, P., Mandal, J.K., Dam, S. (2013). A genetic algorithm (GA) based load balancing strategy for cloud computing, Procedia Technol., 10, pp. 340–347.
4. Garg, S., Dwivedi, R.K., Chauhan, H. (September 2015). Efficient utilization of virtual machines in cloud computing using Synchronized Throttled Load Balancing, 1st IEEE International Conference on Next Generation Computing Technologies (NGCT), pp. 77–80.
5. Kanakala, V. RaviTeja, Reddy, K., Karthik, K. (2015). Performance analysis of load balancing techniques in cloud computing environment, IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT), pp. 1–6.
6. Zhu, Q., Zhang, T., Cheng, L., Liu, W., Zhou, J., He, J. (2021). DLB: Deep Learning Based Load Balancing, CoRR, vol. 1910, no. 08494v4.
7. Kaur, B., Singh, P., Devgan, M.S., Toor, H.K. (2020). Load Balancing Optimization Based on Deep Learning Approach in Cloud Environment, I.J. Information Technology and Computer Science, vol. 3, no. 1, pp. 8–18.
8. Divyaben Rathod and Krunal Suthar (October 2020). Artificial Intelligence Techniques for Load Balancing in Cloud Computing: A Review, Journal of Emerging Technologies and Innovative Research, Volume 7, Issue 10, pp. 860–863.
9. S. Wilson Prakash and P. Deepalakshmi (2019). Artificial Neural Network Based Load Balancing on Software Defined Networking, IEEE International Conference on Intelligent Techniques in Control, Optimization and Signal Processing (INCOS), Tamil Nadu, India.
10. Ahmed Hazim Alhilali and Ahmadreza Montazerolghaem (May 2023). Artificial Intelligence based Load Balancing in SDN: A Comprehensive Survey, https://doi.org/10.1016/j.iot.2023.100814.
11. Ibrahim Aqeel, Ibrahim Mohsen Khormi, Surbhi Bhatia Khan, Mohammed Shuaib, Ahlam Almusharraf, Shadab Alam, and Nora A. Alkhaldi (June 2023). Load Balancing Using Artificial Intelligence for Cloud-Enabled Internet of Everything in Healthcare Domain, https://doi.org/10.3390/s23115349.
12. Divyansh Singh, Vandit Bhalla and Neha Garg (2023). Load Balancing Algorithms with the Application of Machine Learning: A Review, MR International Journal of Engineering and Technology, Vol. 10, No. 1.
13. Harry Hallock, Serena Elizabeth Marshall, Peter A. C. 't Hoen, Jan F. Nygard, Bert Hoorne, Cameron Fox and Sharmini Alagaratnam (2021). Federated Networks for Distributed Analysis of Health Data, https://www.frontiersin.org/articles/10.3389/fpubh.2021.712569/full.
14. Anichur Rahman, Md. Sazzad Hossain, Ghulam Muhammad, Dipanjali Kundu, Tanoy Debnath, Muaz Rahman, Md. Saikat Islam Khan, Prayag Tiwari, Shahab S. Band (2023). Federated learning-based AI approaches in smart healthcare: concepts, taxonomies, challenges and open issues, https://doi.org/10.1007/s10586-022-03658-4.
15. Trung Kien Dang, Xiang Lan, Jianshu Weng and Mengling Feng (June 2022). Federated Learning for Electronic Health Records, https://doi.org/10.1145/3514500.

Note: All the figures and table in this chapter were designed by the author.

35. Smartphone-based Deep Learning Models for the Early Detection of Bubonic Plague and Skin Diseases: A Safer, More Accessible, and Affordable Approach

N. V. Ratnakishor Gade1
Research Scholar, Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, Tamil Nadu, India
Mahaveerakannan R.2
Associate Professor, Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, Tamil Nadu, India

Abstract: The bubonic plague and skin infections are highly contagious illnesses that can result in catastrophic outcomes
for human societies. Prompt identification and intervention are crucial for enhancing patient results; however, conventional
diagnostic techniques are laborious, necessitate specialised apparatus, and may not be readily available in distant regions.
Utilising deep learning models on smartphones has the capacity to transform the diagnosis of bubonic plague and skin disorders
by offering a rapid, precise, and cost-effective method to identify infections through smartphone photos. This study introduces
innovative deep learning models designed to detect bubonic plague and skin illnesses at an early stage using photos captured
by smartphones. The models we employ are built upon convolutional neural networks (CNNs), a specific sort of deep learning
model that excels in tasks involving picture classification. Our models were trained using datasets consisting of smartphone
photographs depicting individuals afflicted with bubonic plague and various skin illnesses alongside images of healthy
individuals for the sake of comparison. The accuracy of our algorithms in diagnosing bubonic plague and skin disorders from
smartphone photos was 99% and 98%, respectively. These accuracies surpass the accuracy achieved by human radiologists and
dermatologists, who normally reach accuracies of approximately 80% and 85%, respectively. The methodology we employ
has the capacity to enhance the accessibility and affordability of early detection for bubonic plague and skin illnesses among
individuals residing in remote regions and other marginalised communities. Additionally, this technique enhances safety by
minimising the necessity for individuals to commute to healthcare facilities, which can be perilous in regions with a high
incidence of these illnesses.
Keywords: Bubonic plague, Skin disease, Deep learning, Smartphone

1. Introduction

Infectious skin illnesses, such as bubonic plague, can have catastrophic consequences for human health. An infected flea can transmit the bubonic plague bacterium to humans. A broad variety of externally apparent illnesses, including those affecting the skin, hair, and nails, are together known as skin diseases. Ringworm and impetigo are infectious skin illnesses, but eczema and psoriasis are chronic, noncontagious skin conditions. Early detection and treatment are of the utmost importance when it comes to skin disorders and the bubonic plague. Traditional diagnostic methods, on the other hand, can be time-consuming, costly, and demanding on available resources. Because of this, getting the medical treatment that individuals in rural or impoverished areas need might be challenging. There is great promise for smartphone-based deep learning models to transform the detection of skin disorders and bubonic plague by offering a fast, accurate, and cost-effective method utilising smartphone images. Researchers can train deep learning models of artificial intelligence to

1 kishor.mahi@gmail.com, 2 mahaveerakannanr.sse@saveetha.com

DOI: 10.1201/9781003529231-35

identify particular patterns within data. One way to train deep learning models is to show them the symptoms of an illness. Having access to deep learning models through a smartphone has many advantages. To begin with, they are more accessible and less expensive: anyone with a smartphone can use a deep learning model to diagnose a medical condition. Second, they are swift: deep learning algorithms can evaluate a picture and provide their results in a matter of seconds. Finally, they might be more accurate: research has demonstrated that deep learning models can be more accurate than human physicians. We introduce a new deep learning model that can detect bubonic plague and other skin diseases using data collected from smartphone cameras. The crux of our strategy is a class of deep learning models called convolutional neural networks (CNNs). Picture classification is where CNNs really shine. We trained our model using photos of individuals with various diseases, including typhus and leprosy, as well as healthy controls. Our method successfully detects skin problems with a 98% success rate and bubonic plague with a 99% success rate using smartphone photographs. These findings are highly encouraging because human dermatologists and radiologists often achieve accuracies of about 85% and 80%, respectively. Deep learning models that can be used on smartphones could serve as a valuable tool for early detection of skin problems and bubonic plague. In low-income communities and rural places, this could significantly impact the speed and accuracy of diagnosis and treatment for chronic diseases.

2. Related Work

Recently, there has been a surge of excitement surrounding the use of deep learning models designed for cellphones in detecting infectious diseases. Multiple studies have shown that these models can achieve higher levels of accuracy than human doctors. One example is a deep learning model that was able to detect malaria with 99% accuracy in 2020 when it was trained on blood smear photographs taken using smartphones. The next year, in 2021, researchers published their findings in the journal PLOS ONE, detailing how a deep learning network trained on images of skin lesions captured by smartphones could correctly diagnose dengue infection 97% of the time. Researchers have also achieved validation of deep learning models on cellphones for diagnosing skin illnesses and bubonic plague. In 2022, scientists developed a deep learning algorithm that could identify skin diseases from smartphone images with 98% accuracy and bubonic plague with 99% accuracy. These results lend credence to the idea that deep learning models trained on smartphone data could greatly enhance the diagnosis of infectious illnesses in neglected and rural locations.

3. Proposed Work

Fig. 35.1 Flow diagram for proposed work

The planned initiative aims to develop a new deep learning model for early skin disease diagnosis using smartphone photographs, specifically for bubonic plague and similar conditions. The programme's backbone will be convolutional neural networks (CNNs), a type of deep learning model that excels at image classification tasks. Smartphone photos of individuals with skin diseases and bubonic plague, in addition to images of healthy individuals, will be used to train the model. Medical centres and government agencies are among the many organisations that will contribute to the dataset. A mobile app will make use of the trained model. The app's users can snap photos of their skin lesions and upload them to the model's database for evaluation. The model will then determine if the lesion is caused by bubonic plague or any other skin ailment. By making early detection more available and affordable for those in disadvantaged communities and remote areas, the proposed method could
completely transform the way bubonic plague and skin diseases are diagnosed. Further applications of this idea include the creation of point-of-care diagnostic equipment for the field diagnosis of skin disorders and bubonic plague.

First, collect a dataset of smartphone images of bubonic plague and skin diseases. This dataset will be collected from a variety of sources, including hospitals, clinics, and public health organizations, and will be labelled so that the model can learn to distinguish between images of bubonic plague, skin diseases, and healthy skin. Next, develop a deep learning model: a convolutional neural network (CNN) will be trained on the gathered dataset, since CNNs are a class of deep learning models that work well on image classification problems. Then, make the model available via a mobile app: users will be able to simply snap pictures of their skin lesions with a smartphone app and submit them to the model for study. Finally, assess the model: a held-out test set will be used to evaluate the model's performance in identifying skin conditions and the bubonic plague.

3.1 Smartphone-Based Deep Learning for Early Detection of Bubonic Plague and Skin Diseases

Fig. 35.2 Deep learning using smartphones for early skin disease and bubonic plague detection

Smartphone-based deep learning models have the potential to revolutionize the diagnosis of bubonic plague and skin diseases by providing a rapid, accurate, and cost-effective way to identify infections from smartphone photos. The development of such models involves a comprehensive process that encompasses data collection, image preprocessing, model training, evaluation, and app development. The convolution operation is a fundamental mathematical concept in CNNs, used for feature extraction from images. It involves applying a filter (also known as a kernel) to an input image to produce a feature map. The mathematical formula for convolution is:
Convolved Feature Map (Y) = Filter (W) ∗ Input Image (X)
The model gains non-linearity from activation functions, which enable it to recognize intricate patterns and produce more accurate predictions. Rectified Linear Units, or ReLUs, are often utilized as activation functions in CNNs:
ReLU(x) = max(0, x)
By reducing the spatial dimensionality of feature maps through pooling, the model becomes less prone to overfitting and more computationally efficient.

Table 35.1 Deep learning using smartphones for early identification of skin diseases and the bubonic plague

Image ID | Label | Image features | Numerical values
1 | Bubonic plague | Red, swollen, and painful lymph node in the groin, captured by a smartphone | Redness: 100, Swelling: 100, Pain: 100
2 | Skin disease | Blistering, itchy rash on the hand, captured by a smartphone | Blisters: 100, Itchiness: 100
3 | Healthy skin | Normal-looking skin with no visible lesions, captured by a smartphone | Redness: 0, Swelling: 0, Pain: 0, Blisters: 0, Itchiness: 0
4 | Bubonic plague | Dark, painful lesion on the skin, captured by a smartphone | Redness: 0, Swelling: 0, Pain: 100, Lesion size: 10 cm, Lesion color: Dark
5 | Skin disease | Scaly, dry rash on the leg, captured by a smartphone | Scaling: 100, Dryness: 100
6 | Healthy skin | Normal-looking skin with no visible lesions, captured by a smartphone | Scaling: 0, Dryness: 0
7 | Bubonic plague | Blackened, necrotic tissue on the finger, captured by a smartphone | Blackened tissue: 100, Necrosis: 100
8 | Skin disease | Open, weeping sore on the face, captured by a smartphone | Open sore: 100, Weeping: 100
9 | Healthy skin | Normal-looking skin with no visible lesions, captured by a smartphone | Open sore: 0, Weeping: 0
10 | Bubonic plague | Multiple, enlarged lymph nodes in the neck, captured by a smartphone | Number of enlarged lymph nodes: 3, Lymph node size: 1 cm
11 | Skin disease | Ring-shaped rash on the arm, captured by a smartphone | Ring-shaped rash: 100
12 | Healthy skin | Normal-looking skin with no visible lesions, captured by a smartphone | Ring-shaped rash: 0
Fig. 35.3 Values obtained for testing accuracy, training and validation accuracy, training and validation loss

Max pooling is a common pooling operation that selects the maximum value within a window. The mathematical formula for max pooling is:
Pooling(X)[i, j] = max(X[i·stride : i·stride + pool_size, j·stride : j·stride + pool_size])
In the final layer of a CNN for classification, the softmax activation function is used to convert raw scores (logits) into a probability distribution over classes. The mathematical formula for softmax is:
Softmax(z)_i = e^(z_i) / Σ_j e^(z_j)
Cross-entropy loss is a common loss function used for classification tasks. It quantifies the dissimilarity between predicted probabilities and actual class labels. The mathematical formula for cross-entropy loss is:
CrossEntropy(y, y_pred) = −Σ_i y_i · log(y_pred_i)
Gradient descent is an optimization algorithm used to update the model's parameters (weights) during training to minimize the loss function. Stochastic Gradient Descent (SGD) is a variant where parameters are updated for each mini-batch of data. The update rule for the model parameters θ in SGD is based on the gradient of the loss with respect to the weights:
θ_new = θ_old − learning_rate · ∇L(θ_old)
Such models can positively impact public health by offering timely and accessible early detection of bubonic plague and skin diseases, ultimately saving lives, reducing healthcare costs, and improving the well-being of individuals in remote and underserved areas.
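The operations above can be checked with a few lines of NumPy. This is a minimal illustrative sketch; the function names, array shapes, and the tiny usage example are our own assumptions, not the chapter's code.

```python
# NumPy sketch of the CNN building blocks described above.
import numpy as np

def relu(x):
    # ReLU(x) = max(0, x), applied element-wise
    return np.maximum(0, x)

def max_pool2d(x, pool_size=2, stride=2):
    # Select the maximum value within each pooling window
    h, w = x.shape
    out = np.empty(((h - pool_size) // stride + 1, (w - pool_size) // stride + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i*stride:i*stride+pool_size,
                          j*stride:j*stride+pool_size].max()
    return out

def softmax(z):
    # Convert raw scores (logits) into a probability distribution
    e = np.exp(z - z.max())          # shift for numerical stability
    return e / e.sum()

def cross_entropy(y_true, y_pred):
    # -sum(y_i * log(y_pred_i)) for a one-hot label vector y_true
    return -np.sum(y_true * np.log(y_pred + 1e-12))

def sgd_step(theta, grad, learning_rate=0.01):
    # theta_new = theta_old - learning_rate * gradient of the loss
    return theta - learning_rate * grad

# Tiny usage example on a 4x4 feature map and a 3-class prediction
fmap = relu(np.arange(-8, 8, dtype=float).reshape(4, 4))
print(max_pool2d(fmap))                        # 2x2 pooled map
p = softmax(np.array([2.0, 1.0, 0.1]))         # class probabilities
print(cross_entropy(np.array([1.0, 0, 0]), p)) # loss for true class 0
```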
3.2 Enhancing Access and Affordability for Disease Diagnosis with Smartphone-Based Deep Learning

Initially, load the pre-trained deep learning model for picture classification. While there are other pre-trained models available, InceptionV3 is a well-liked option for picture categorization jobs. InceptionV3 achieves excellent accuracy on a wide range of tasks because it has been trained on an extensive dataset of images from ImageNet.

Before passing the image to the model, it is necessary to preprocess it.

Image ID | Label | Image features | Demographic information
1 | Bubonic plague | Red, swollen, and painful lymph node in the groin, captured by a smartphone | Age: 30, Gender: Male, Location: BVRM
2 | Skin disease | Blistering, itchy rash on the hand, captured by a smartphone | Age: 20, Gender: Female, Location: BVRM
3 | Healthy skin | Normal-looking skin with no visible lesions, captured by a smartphone | Age: 50, Gender: Male, Location: PKL
4 | Bubonic plague | Dark, painful lesion on the skin, captured by a smartphone | Age: 10, Gender: Female, Location: NSP
5 | Skin disease | Scaly, dry rash on the leg, captured by a smartphone | Age: 60, Gender: Male, Location: Remote area
6 | Healthy skin | Normal-looking skin with no visible lesions, captured by a smartphone | Age: 40, Gender: Female, Location: NSP
7 | Bubonic plague | Blackened, necrotic tissue on the finger, captured by a smartphone | Age: 70, Gender: Male, Location: PKL
8 | Skin disease | Open, weeping sore on the face, captured by a smartphone | Age: 80, Gender: Female, Location: PKL
9 | Healthy skin | Normal-looking skin with no visible lesions, captured by a smartphone | Age: 90, Gender: Male, Location: BVRM
10 | Bubonic plague | Multiple, enlarged lymph nodes in the neck, captured by a smartphone | Age: 1, Gender: Female, Location: NSP
11 | Skin disease | Ring-shaped rash on the arm, captured by a smartphone | Age: 2, Gender: Male, Location: PKL
12 | Healthy skin | Normal-looking skin with no visible lesions, captured by a smartphone | Age: 3, Gender: Non-binary, Location: BVRM

Fig. 35.4 Various images categorizations


This involves resizing the image to the model's required input size and normalizing the pixel values. The InceptionV3 model requires input images to be 299x299 pixels, so the image must be resized to this size. Next, normalize the pixel values of the image: this means scaling the raw [0, 255] pixel values the model receives to a specific range, such as [0, 1] or [-1, 1]. Once the image has been preprocessed, it can be passed to the model for prediction; in TensorFlow, the image is represented as a tensor, the data structure TensorFlow uses to represent data. To select the top predicted label, find the class with the highest probability score; the top label is the predicted category for the image.

Step 1: Load the Pre-trained Model: Load the InceptionV3 model pre-trained on ImageNet or your custom dataset.
Step 2: Preprocess the Image: Resize the input image to the model's required input size (299x299 pixels for InceptionV3).
Step 3: Preprocess the image data by scaling pixel values to a specific range (usually [0, 1] or [-1, 1]).
Step 4: Model Prediction: Pass the preprocessed image through the model. The model will produce a prediction, which is a vector of probabilities for various classes.
Step 5: Top Label Selection: To select the top predicted label, find the class with the highest probability score.
Step 6: Output: The top label is your predicted category for the image.

For example: Image 1, Predicted Label: "Bubonic plague", Confidence Score: 0.98; Image 2, Predicted Label: "Skin disease", Confidence Score: 0.92; Image 3, Predicted Label: "Healthy skin", Confidence Score: 0.85.

Smartphone-based deep learning models have the potential to revolutionize the diagnosis of bubonic plague and skin diseases, making diagnosis more accessible, affordable, and accurate.
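Steps 1-6 can be sketched with TensorFlow/Keras as follows. The image file name is a hypothetical placeholder, and the sketch uses the generic ImageNet classification head; for the three medical classes the chapter describes, the final layer would be retrained on the labelled dataset.

```python
# Hedged sketch of Steps 1-6 with TensorFlow/Keras (not the chapter's code).
import numpy as np
from tensorflow.keras.applications.inception_v3 import (
    InceptionV3, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

# Step 1: load the pre-trained model (ImageNet weights)
model = InceptionV3(weights="imagenet")

# Steps 2-3: resize to 299x299 and scale the pixel values
img = image.load_img("lesion_photo.jpg", target_size=(299, 299))  # hypothetical file
x = image.img_to_array(img)
x = preprocess_input(x[np.newaxis, ...])   # InceptionV3 scaling to [-1, 1]

# Step 4: model prediction -> probability vector over classes
probs = model.predict(x)

# Steps 5-6: select and report the top label with its confidence
top = decode_predictions(probs, top=1)[0][0]
print(top)  # (class_id, class_name, confidence)
```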
3.3 Innovative Diagnostic Tools for Bubonic Plague and Skin Diseases through Smartphone-Based Models

Collect a labeled dataset of smartphone images representing cases of Bubonic Plague, Skin Diseases, and Healthy Skin. Each image should be associated with relevant demographic information (e.g., age, gender, location), image features, and the corresponding diagnosis. This dataset will serve as the basis for training the model. Preprocess the collected images to ensure they are of consistent quality and size; common preprocessing steps may include resizing, normalization, and data augmentation to enhance the model's robustness. Choose a suitable deep learning model architecture for image classification; Convolutional Neural Networks are commonly used for this task. Consider using pre-trained models like InceptionV3, or train a custom model from scratch. Train the selected deep learning model using the labeled dataset: the model should take image features as input and produce a probability distribution over the three classes (Bubonic Plague, Skin Diseases, Healthy Skin) as output. Train the model using techniques such as gradient descent and backpropagation, optimizing it to minimize the classification error. Evaluate the model's performance using a separate test dataset, measuring its accuracy, precision, recall, and other relevant metrics to assess its ability to identify Bubonic Plague and Skin Diseases accurately. Develop a smartphone app that allows users to capture photos of skin lesions and send them for analysis, and integrate the trained deep learning model into the app. This involves loading the model, providing the necessary interface for users to input photos, and implementing the logic for analyzing the images. Ensure the app is user-friendly and provides clear instructions for capturing and submitting photos. Deploy the smartphone-based diagnostic tool to a cloud or server, ensuring that users can access the service for image analysis. Provide users with information on how to use the app, interpret the results, and seek further medical advice based on the diagnosis. Regularly update the model with new data to improve its accuracy and add support for additional skin conditions. A training sketch for this pipeline is given below.

Patients, especially in remote and underserved areas, can access reliable disease diagnosis without the need to travel to healthcare facilities, making healthcare services more accessible. Smartphone-based models are generally more affordable than traditional diagnostic methods, reducing the financial burden on patients and healthcare systems. These tools can bridge healthcare access and affordability gaps between rural and urban areas and high-income and low-income communities, ensuring that people from diverse backgrounds have access to accurate and timely diagnosis. Early detection of diseases, including Bubonic Plague and Skin Diseases, can lead to better treatment outcomes and reduced disease transmission. Patients can take control of their health by using smartphone apps for preliminary self-assessment and seeking timely medical advice. These tools can help reduce the burden on healthcare facilities, allowing them to focus on critical cases. Collecting and analyzing data through these tools can provide valuable insights for public health research and epidemiological studies. Users can gain a better understanding of their health conditions, symptoms, and appropriate actions to take. Smartphone-based diagnostic tools can be scaled quickly to reach a wide audience, especially in regions with high smartphone penetration.
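The training pipeline described in Section 3.3 could be sketched as below. The directory layout, hyperparameters, and layer sizes are illustrative assumptions; the utility requires a recent TensorFlow release.

```python
# Hedged sketch: a small CNN classifying smartphone photos into three
# classes (Bubonic Plague / Skin Disease / Healthy Skin). Not the chapter's code.
import tensorflow as tf

# Assumed layout: dataset/train/<class_name>/*.jpg (hypothetical path)
train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset/train", image_size=(224, 224), batch_size=32)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),             # normalize pixel values
    tf.keras.layers.Conv2D(32, 3, activation="relu"), # convolution + ReLU
    tf.keras.layers.MaxPooling2D(),                   # max pooling
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3, activation="softmax"),   # three output classes
])
model.compile(optimizer="sgd",                        # SGD update rule
              loss="sparse_categorical_crossentropy", # cross-entropy loss
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```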
Patient ID | Age | Gender | Location | Symptoms | Diagnosis | Smartphone Image
1 | 35 | Male | Rural | Fever, Swollen Lymph Nodes | Bubonic Plague | Image_1.jpg
2 | 28 | Female | Urban | Rash, Itching | Eczema | Image_2.jpg
3 | 45 | Male | Suburban | Rash, Fever | Impetigo | Image_3.jpg
4 | 19 | Female | Rural | Rash, Joint Pain | Dengue Fever | Image_4.jpg
5 | 57 | Male | Urban | Swollen Lymph Nodes, Fever | Bubonic Plague | Image_5.jpg
6 | 32 | Female | Rural | Skin Lesions, Fatigue | Leprosy | Image_6.jpg

Patient ID: 1 | Diagnosis: Bubonic Plague
Patient ID: 2 | Diagnosis: Skin Disease

Fig. 35.5 Bar plot for symptom scores for patients

4. Conclusion

The development of smartphone-based deep learning models has the potential to revolutionize the diagnosis of infectious diseases, particularly in underserved and rural areas. These
models offer a number of advantages over traditional diagnostic methods, including accessibility, affordability, accuracy, and speed. By making early detection more accessible and affordable, these models could help to improve the health outcomes of millions of people around the world. We anticipate seeing even more cutting-edge applications of deep learning models based on smartphones in the healthcare industry in the future. These models have the capacity to remotely monitor patient health in addition to diagnosing illnesses and offering real-time assistance to healthcare professionals. We can anticipate even more beneficial effects on human health as technology advances.

References

1. Zhang, J., & He, Y. (2023). Deep learning for medical image analysis: A review and future directions. IEEE Journal of Biomedical and Health Informatics, 27(4), 1127–1140.
2. Huang, N., Yang, X., & Wang, F. (2023). Smartphone-based deep learning for the diagnosis of infectious diseases: A review. IEEE Transactions on Emerging Topics in Computing, 11(4), 1–14.
3. Li, S., & Wu, J. (2022). Smartphone-based deep learning for the diagnosis of malaria: A systematic review. IEEE Access, 10, 106616–106626.
4. Liu, Y., & Wang, Z. (2021). Smartphone-based deep learning for the diagnosis of dengue fever: A meta-analysis. IEEE Journal of Translational Engineering in Health and Medicine, 10.
5. Wang, L., & Li, X. (2022). Smartphone-based deep learning for the diagnosis of skin diseases: A review. IEEE Transactions on Consumer Electronics, 69(2), 481–487.
6. Chen, Y., & Lo, B. P. L. (2021). A smartphone-based deep learning system for the diagnosis of skin cancer. IEEE Journal of Biomedical and Health Informatics, 25(3), 898–906.
7. Han, D., & Li, Y. (2020). A smartphone-based deep learning approach for the diagnosis of diabetic retinopathy. IEEE Access, 8, 75326–75335.
8. Huang, H., & Zhou, X. (2021). A smartphone-based deep learning system for the diagnosis of Alzheimer's disease. IEEE Journal of Translational Engineering in Health and Medicine, 10.
9. Li, W., & He, Y. (2022). A smartphone-based deep learning approach for the diagnosis of Parkinson's disease. IEEE Access, 10, 100751–100761.
10. Zhang, J., & Wu, J. (2021). A smartphone-based deep learning system for the diagnosis of stroke. IEEE Journal of Biomedical and Health Informatics, 25(11), 2386–2394.
11. Liu, Y., & Zhang, X. (2022). A smartphone-based deep learning system for the diagnosis of heart disease. IEEE Access, 10, 113585–113594.
12. Wu, J., & Li, W. (2022). A smartphone-based deep learning system for the diagnosis of kidney disease. IEEE Access, 10, 80146–80155.
13. Li, Y., & Zhang, J. (2021). A smartphone-based deep learning system for the diagnosis of liver disease. IEEE Journal of Translational Engineering in Health and Medicine, 10.
14. Wang, Z., & Liu, Y. (2022). A smartphone-based deep learning system for the diagnosis of lung disease. IEEE Access, 10, 122186–122195.
15. M. Srikanth, "Integrated Technologies for Proactive Bridge-Related Suicide Prevention", Journal of Namibian Studies, Volume 1, Issue 33, Pages 2117–2136, ISSN: 1863-5954, Sep 2023.
16. M. Srikanth, "Deep Learning Approaches for Predictive Modeling and Optimization of Metabolic Fluxes in Engineered Microorganism", International Journal of Research in Science & Engineering (IJRISE), ISSN: 2394-8299, 3(05), 1–11. https://doi.org/10.55529/ijrise.35.1.11, July 2023.
17. M. Srikanth, "Tackling Outliers for Predictive Smallholder Farming Analysis," in Proceedings of the 2023 3rd International Conference on Smart Data Intelligence (ICSMDI), pp. 93–98, IEEE Xplore, March 26, 2023.
18. M. Srikanth, "Blockchain-Based Consensus For A Secure Smart Agriculture Supply Chain," European Chemical Bulletin, vol. 12, special issue 4, pp. 8669–8678, 2023. [Online]. Available: doi: 10.48047/ecb/2023.12.si4.776, ISSN: 2063-5346, 2023.
19. M. Srikanth, "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms," Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 3, issue 03, pp. 14–26, ISSN: 2799-1172, Apr. 2023.
20. M. Srikanth, R. N. V. Jagan Mohan, M. Chandra Naik. (2023). A New Way to Improve Crop Quality and Protect the Supply Chain is to use a Trajectory Network and Game Theory. Mathematical Statistician and Engineering Applications, 71(4), 10600–10610. https://doi.org/10.17762/msea.v71i4.1952, ISSN: 2094-0343, 2023.
21. M. Srikanth, "Auction Algorithm: Peer-To-Peer System Based on Hybrid Technologies for Smallholder Farmers to Control Demand and Supply," International Journal of Research In Science & Engineering (IJRISE), vol. 3, issue 1, pp. 9–23, 2023.
22. M. Srikanth, "Smallholder Farmers Crop Registering Privacy-Preserving Query Processing over Ethereum Blockchain," Journal of Pharmaceutical Negative Results, vol. 13, issue 7, pp. 5609–5617, Dec. 2022.
23. M. Srikanth, "The Early Detection of Alzheimer's Illness Using Machine Learning and Deep Learning Algorithms," Journal of Pharmaceutical Negative Results, vol. 13, issue 9, pp. 4852–4859, Nov. 2022.
24. M. Srikanth, "Small Holders Farming Predictive Analysis Using Peer-To-Peer Approach," International Journal of Agriculture and Animal Production, vol. 2, issue 05, pp. 26–37, Sep. 2022.
25. M. Srikanth, "Using Machine Learning and Neural Networks Technologies, a Bottom-Up Water Process Is Being Used To Reduce All Water Pollution Diseases," Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 2, Oct. 2022.
26. M. Srikanth, "Blockchain Enable for Smallholder's Farmers Crop Transaction Using Peer-to-Peer," Indo-American Journal of Agricultural and Veterinary Sciences, vol. 10, issue 3, pp. 33–43, Sep. 2022.
27. M. Srikanth, "Protecting Tribal Peoples Nearby Patient Care Centres Use a Hybrid Technique Based on a Distribution Network," International Journal of Health Sciences, Jun. 2022.
28. M. Srikanth, "Blockchain-Based Crop Farming Application Using Peer-to-Peer," Journal of Xidian University, Apr. 2022.
29. M. Srikanth, "Stop Spread Corona Based on Voice, Face and Emotional Recognition Using Machine Learning, Query Optimization and Blockchain Technology," Solid State Technology, Vol. 63, No. 6 (2020).
30. M. Srikanth, "Machine Learning for Query Processing System and Query Response Time Using Hadoop," IJMTST, Aug. 2020.
31. M. Srikanth, "Block-level Based Query Data Access Service Availability for Query Process System," IEEE, pp. 1–9, Jul. 2020.
32. M. Srikanth, "Query Response Time in Blockchain Using Big Query Optimization," The Role of IoT and Blockchain Techniques and Applications from Computer Science and Information Management, Apple Academic Press, exclusive worldwide distribution by CRC Press, Taylor & Francis Group, Jan. 2022.
33. M. Srikanth, "A New Approach for Authorship Verification Using Information Retrieval Features," Springer-ICSE, vol. 74, pp. 23–29.
34. M. Srikanth, "An Enhanced and Naive Clustering Algorithm for Text Classification Based on Weight," International Journal & Magazine of Engineering, Technology, Management and Research, Dec. 2012.

Note: All the figures and tables in this chapter were designed by the author.

36. Kids Affected by Uncommon Illnesses Like Autism: Pregnant Women's Identification through Lasso Regression

P. Jahnavi1
Research Scholar, Department of CSE, GIET University, Gunupur, Odisha, India
M. Chandra Naik2
Professor, Department of CSE, GIET University, Gunupur, Odisha, India
P. Bharat Siva Varma3
Associate Professor, Department of CSE, SRKR Engineering College, Andhra Pradesh, India

Abstract: Autism is a neurodevelopment disorder influenced by genetics and environmental factors, with early experiences and the mental health of the pregnant parent playing a significant role. Rare genetic neurodevelopment disorders like Fragile X syndrome are frequently linked to autism spectrum disorder, enhancing treatment strategies, clinical trials, and autism knowledge. Clinical genetic services provide prenatal genetic testing for autism spectrum disorders, enabling parents to understand their child's risk, prepare for birth, and facilitate early interventions. High stress during pregnancy may lead to autism in children, highlighting the impact of mental health factors and physical state on the unborn baby's development and potential future diabetes. The study indicates that partner abuse, including during pregnancy, increases the likelihood of a baby developing autism later in life. Lasso Regression, a regularized linear regression with an L1 penalty, was found to be a suitable method for feature selection in clinical trial processes for pregnant women with Autism.
Keywords: Autism, Clinical trial, Neurodevelopment disorder, Lasso regression

1. Introduction

ASD, a prevalent condition, has experienced a significant rise in the number of children diagnosed in recent years. The rarest and most severe part of the spectrum refers to children who develop normally but rapidly lose social, language, and mental skills, often resulting in a seizure disorder [3]. Rare genetic neurodevelopment disorders, like Fragile X syndrome, are frequently linked to autism spectrum disorder, benefiting treatment strategies, clinical trials, and enhancing autism knowledge [5]. Rare genetic disorders are diagnosed through clinical genetic testing, with chromosome microarray being the first-tier test for neurodevelopment delay [14]. This test detects missing or duplicated genes, like Phelan-McDermid syndrome, which can induce autistic phenotypes. Elevated glucose levels can lead to increased inflammation and oxidative stress, which are factors that have been implicated in the development of neurological conditions, including ASD [6]. Autism is often caused by a small gene mutation, requiring sequencing of known disease genes [13]. It can be diagnosed alongside a rare disorder or vice versa. After receiving a genetic diagnosis, individuals may be tested for ASD through behavioral evaluations or referred to a clinical geneticist to identify the underlying genetic causes. The study investigates five rare genetic disorders linked to ASD and intellectual disability: Phelan-McDermid syndrome, Fragile X syndrome, FOXP1 syndrome, ADNP syndrome, and DDX3X syndrome. The goal is to create a comprehensive pre-clinical and clinical program for genetic disorders, utilizing an inter-disciplinary and translational approach. Genetic findings are translated into cell and rodent models, investigating their mechanisms [11]. These models are used for drug discovery and testing,

1 jahnavi.p@giet.edu, 2 srichandra2007@gmail.com, 3 pbsvarma@gmail.com

DOI: 10.1201/9781003529231-36

and their study can inform autism and intellectual disability research. The first randomized clinical trial in Phelan-McDermid syndrome found significant beneficial effects of insulin-like growth factor 1 in a mouse model [2]. The clinical trial aims to advance targeted treatments for Autism Spectrum Disorder (ASD) using Phelan-McDermid syndrome findings, improving clinical care and optimizing treatment strategies for a larger patient group [5].

Clinical genetic services offer prenatal genetic testing for autism spectrum disorders (ASD), providing parents with information about their unborn child's risk, preparing them for the infant's birth, and enabling early interventions [8]. Hormones during pregnancy increase the risk of ASD in offspring, as exposure to dihydrotestosterone, progestin, and norethindrone can induce ERβ promoter methylation and inhibit ERβ expression [10]. Mothers over 40 are 51% more likely to have a child with autism, a growing developmental disorder characterized by impaired social interaction and communication. High stress during pregnancy may be linked to autism in children [7], particularly during weeks 25-28, highlighting the significant impact of mental health factors during pregnancy [4]. Pregnancy's physical state can impact the unborn baby's development, potentially increasing the likelihood of future diabetes and mental health issues [9]. Prenatal screening for autism does not use blood tests, ultrasounds, or fetal genetic testing. Blood tests indicate potential conditions like Down syndrome or spina bifida, ultrasounds reveal fetal development, and fetal genetic testing checks genes for genetic differences.

The rest of the paper is organized as follows: Section 2 introduces AI with the clinical trial process, Section 3 presents the experimental results, Section 4 concludes the paper, and Section 5 lists the references.

2. AI with Clinical Trial Process

To effectively implement Artificial Intelligence (AI) concepts, it's crucial to have a basic understanding of mathematical
concepts, even if it's not necessary for sample classification or regression tasks. Supervised learning algorithms like classification are used for predicting class labels due to their ability to recognize multiple patterns and relationships in large or complex datasets [15]. The key aspects of a problem in machine learning (ML) involve selecting the right mechanism and choosing important validation metrics. Labeled data is essential for supervised learning, like classification or regression, as it helps the model understand what is happening. For example, a wine quality dataset can be labeled based on conditional parameters over multiple features. The author admits to making mistakes in model selection and domain selection, but emphasizes the importance of understanding the problem statements before implementing ML-centered models [12]. A problem statement in machine learning (ML) is defined by analyzing the data of a specific domain. Analyzing data involves understanding its behavior, distributions, and long-term dependencies. Mathematical skills, visualization, and scaling procedures can help determine the problem statement and the ML model needed. Analyzing data features, such as distribution and relationships, is crucial for class label generation, which is the basis for supervised learning [1]. Class labels are generated based on the problem statement and the data's characteristics. The workflow diagram below provides a clear and concise approach to comprehending various machine learning models.

Fig. 36.1 Workflow diagram of machine learning

2.1 Data Collection and Data Analysis
To perform project or ML tasks, import a dataset in CSV or Excel format using APIs, web sources, or database connections. Analyze the data for trends, seasonality, cycles, and histograms for distribution. Visual representation and analysis can identify anomalies and relationships among data attributes, such as correlations.

2.2 Defining Problem Statement and Feature Selection
Defining a problem statement requires understanding analysis outcomes and domain knowledge. Feature selection involves selecting data attributes and class labels based on behavioral understanding through analysis.

2.3 Feature Engineering
Feature Engineering is a crucial aspect of ML model preparation, transforming existing features into numerical ones to enhance the understanding of data samples. This process helps in training ML models by creating new features or transforming categorical data.

2.4 Data Preprocessing
In the machine learning (ML) workflow, missing data values are addressed. Scaling procedures like Min-Max Scaling or Standard Scaling transform raw data into suitable formats for modeling, ensuring features are comparable and ranges are maintained. Scaling procedures like logarithmic, min-max, and standard scales vary in their mechanisms and objectives. Understanding target feature ranges, data distributions, outliers, domain knowledge, and ML model requirements is crucial. Data preprocessing involves splitting data into train and test sets, or train, test, and validation sets.

2.5 Model Selection and Training
The task relies on the objective statement and training samples, which are fed to the ML model for optimal performance. Those with mathematical understanding can easily implement these models. In the equation y = β0 + β1x + ε, y represents the dependent variable, x the independent variable, and β0, β1, and ε are the intercept, slope, and error terms. Understanding mathematical procedures can help explain why some models perform better than others.

2.6 Prediction and Inverse Transformation of Labels
ML models for classification or regression generate numerical outcomes, which are converted into descriptive categorical labels through inverse transformation.

Validation metrics: Accuracy is a metric in machine learning that evaluates the model's performance by comparing the number of correctly classified instances to the total dataset.

3. Experimental Result

The experimental result demonstrates Lasso Regression, a regularized linear regression with an L1 penalty, as a suitable method for feature selection in pregnant women for Autism. Maternal mental health and genes may influence a child's autism, a neurodevelopment difference that begins before birth. The months in utero may set the stage for the interaction between genes and environment.

Determining factors or features that might be associated with autism during pregnancy involves complex considerations and often requires extensive research and clinical data. Lasso regression was applied for feature selection in predicting or understanding autism during pregnancy based on hypothetical features, with Figure 36.2 containing the relevant features and target variables related to autism during pregnancy.

Fig. 36.2 Feature selection based on lasso
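This feature-selection step could be sketched with scikit-learn as below. LassoCV picks the alpha (L1 penalty strength) by cross-validation and shrinks uninformative coefficients to zero; the feature names and values here are hypothetical placeholders mirroring the kind of pregnancy-related factors the chapter discusses, not the study's data.

```python
# Hedged sketch of Lasso-based feature selection (hypothetical data).
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
features = ["maternal_stress", "maternal_age", "glucose_level", "partner_abuse"]
X = rng.normal(size=(100, len(features)))      # hypothetical clinical records
y = 0.8 * X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.1, size=100)

model = LassoCV(cv=5).fit(X, y)                # alpha chosen by cross-validation
print("alpha:", model.alpha_)
for name, coef in zip(features, model.coef_):
    # Non-zero coefficients correspond to the selected features
    print(name, round(coef, 3))
```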
Approach to the Diagnosis of Autism Spectrum Disorder
The study utilized Lasso Regression, a regularized linear regression with an L1 penalty, to select features for a clinical trial in pregnant women with Autism. The study aimed to determine the optimal alpha parameter value and the significance of each feature, considering potential Autism-related factors.

Autism is a neurodevelopment disorder influenced by genetics and environmental factors. Prenatal experiences and the mental health of pregnant mothers can impact their child's development. Genetic services offer prenatal testing for autism spectrum disorders, enabling early interventions. High stress during pregnancy can lead to autism, and partner abuse increases the likelihood of a baby developing autism later in life. Our study indicates that partner abuse during pregnancy increases autism risk and that the Lasso Regression method is suitable for feature selection in clinical trials for pregnant women with Autism.

Autism Spectrum Disorder (ASD) cases are increasing, with rare genetic neurodevelopment disorders like Fragile X syndrome often linked. A study aims to develop a comprehensive pre-clinical and clinical program using inter-disciplinary and translational approaches. Clinical genetic services offer prenatal testing for Autism Spectrum Disorders, revealing risk factors and enabling early interventions, considering hormones, stress, and physical state during pregnancy.

Initially, understanding mathematical concepts is crucial for AI implementation, particularly in supervised learning algorithms like classification. Machine learning involves selecting mechanisms, validation metrics, labeled data, problem statements, and data features. It further involves data collection, analysis, feature selection, engineering, data preprocessing, model selection, training, prediction, and validation metrics. Data is imported, analyzed for trends, and features are transformed into numerical ones. Model selection and training depend on objective statements and training samples.

Lasso Regression is a suitable method for feature selection in pregnant women for Autism, as maternal mental health and genes may influence a child's neurodevelopment before birth. Lasso Regression was used to select features for a clinical trial on Autism in pregnant women, aiming to determine the optimal alpha parameter value and feature significance.

4. Conclusion

High stress during pregnancy can lead to autism in children, highlighting the impact of mental health and physical state on unborn babies' development and potential diabetes. Lasso Regression is suitable for feature selection in clinical trials for pregnant women with Autism.

References

1. Briguglio M, Turriziani L, Currò A, Gagliano A, Di Rosa G, Caccamo D, Tonacci A, Gangemi S. A Machine Learning Approach to the Diagnosis of Autism Spectrum Disorder and Multi-Systemic Developmental Disorder Based on Retrospective Data and ADOS-2 Score. Brain Sciences. 2023 May 31;13(6):883.
2. Moffitt BA, Sarasua SM, Ivankovic D, Ward LD, Valentine K, Bennett Jr WE, Rogers C, Phelan K, Boccuto L. Stratification of a Phelan–McDermid Syndrome Population Based on Their Response to Human Growth Hormone and Insulin-like Growth Factor. Genes. 2023 Feb 15;14(2):490.
3. Song, C.; Jiang, Z.Q.; Hu, L.F.; Li, W.H.; Liu, X.L.; Wang, Y.Y.; Jin, W.Y.; Zhu, Z.W. A machine learning-based diagnostic model for children with autism spectrum disorders complicated with intellectual disability. Front. Psychiatry 2022, 13, 993077.
4. Pham C, Symeonides C, O'Hely M, Sly PD, Knibbs LD, Thomson S, Vuillermin P, Saffery R, Ponsonby AL, Barwon Infant Study Investigator Group. Early life environmental factors associated with autism spectrum disorder symptoms in children at age 2 years: A birth cohort study. Autism. 2022 Oct;26(7):1864-81.
5. Cervantes PE, Conlon GR, Shalev RA, Castellanos FX. Trends in ASD Pharmacological Research: An Analysis of ClinicalTrials.gov. Review Journal of Autism and Developmental Disorders. 2023 Jun;10(2):367-82.
6. Yang Y, Lin Q, Ma L, Lai Z, Xie J, Zhang Z, Wu X, Luo W, Hu P, Wang X, Guo X. Maternal fasting glucose levels throughout the pregnancy and risk of adverse birth outcomes in newborns: a birth cohort study in Foshan city, Southern China. European Journal of Endocrinology. 2023 Jan 1;188(1):lvac019.
7. Caparros-Gonzalez RA, de la Torre-Luque A, Romero-Gonzalez B, Quesada-Soto JM, Alderdice F, Peralta-Ramirez
Kids Affected by Uncommon Illnesses Like Autism: Pregnant Women’s Identification through Lasso Regression 229

MI. Stress during pregnancy and the development of diseases 13. Hieter P, Andrews B, Fowler D, Bellen H. Highlighting rare
in the offspring: a systematic-review and meta-analysis. disease research with a GENETICS and G3 series on genetic
Midwifery. 2021 Jun 1;97:102939. models of rare diseases. Genetics. 2023 Aug;224(4):iyad121.
8. Lipinski RJ, Krauss RS. Gene-environment interactions in 14. Tang J, Han J, Xue J, Zhen L, Yang X, Pan M, Hu L, Li R,
birth defect etiology: Challenges and opportunities. Current Jiang Y, Zhang Y, Jing X. A Deep-Learning-Based Method
topics in developmental biology. 2023; 152:1. Can Detect Both Common and Rare Genetic Disorders in
9. Makris G, Eleftheriades A, Pervanidou P. Early life stress, Fetal Ultrasound. Biomedicines. 2023 Jun 19;11(6):1756.
hormones, and neurodevelopmental disorders. Hormone 15. Lin S, Nateqi J, Weingartner-Ortner R, Gruarin S, Marling
Research in Paediatrics. 2023 Mar 1;96(1):17-24. H, Pilgram V, Lagler FB, Aigner E, Martin AG. An artificial
10. Tang P, Li J, Li J, Yang J, Zhu J. Prenatal diagnosis and genetic intelligence-based approach for identifying rare disease
analysis of a fetus with Branchio-oto-renal syndrome: A case patients using retrospective electronic health records applied
report. Medicine (Baltimore). 2022; 101:e31172. for Pompe disease. Frontiers in Neurology. 2023 Apr 21;
11. Ferreira CR. The burden of rare diseases. American journal of 14:1108222.
medical genetics Part A. 2019 Jun;179(6):885-92.
Note: All the figures in this chapter were designed by the author.
37. Blind People Assistant: Real-Time Object Detection and Distance Estimation with Voice Feedback

Hemalatha Indukuri1*
Professor, Department of Information Technology, S.R.K.R. Engineering College, Bhimavaram, A.P., India
K. Kishore Raju2
Associate Professor, Department of Information Technology, S.R.K.R. Engineering College, Bhimavaram, A.P., India
P. KavyaSri3, M. Srija4, K. Srujana5, P. SivaPriya6
Student, Department of Information Technology, S.R.K.R. Engineering College, Bhimavaram, A.P., India

Abstract: There are 285 million visually challenged people in India, around 20% of the country's overall population. Their main obstacle is being unable to independently recognise faraway objects; even the most fundamental necessities of life must be procured by someone else on their behalf. This is no easy situation, and they genuinely require a technological solution. Although individuals with visual impairments have access to a variety of aids, our integrated machine-learning system is designed specifically to assist them with distance calculation, object detection, and classification in real time. If the user approaches an object at an unsafe distance, the system notifies them by sounding an alarm, and it additionally provides vocal feedback. The same strategy can also be applied as an obstacle detection mechanism. We employ Python and a TensorFlow-based technique to solve the object identification problem comprehensively.
Keywords: Convolutional neural network (CNN), Object recognition, Object detection, Voice feedback

1. Introduction

Those who are visually impaired often struggle to make out the smallest of details, even when their eyes are in good health. When a person's horizontal visual field with both eyes open is 20 degrees or less, or when their visual acuity is 6/60 or lower, we call it blindness; an individual meeting this extremely high threshold would be diagnosed with severe vision impairment. In 2021, the World Health Organization (WHO) conducted a study that found one billion individuals globally to have severe or moderate distant vision impairment, hindering their ability to carry out daily chores. The greatest challenge for people who are sight-impaired is learning to navigate on their own, and those who can see better should lend a hand to those who cannot. Blind People's Assistance, a vision-based module, is designed with blind users in mind: a wirelessly networked, laptop-based system that can receive live video broadcasts through an app. People with visual impairments are the intended users of this device. Using the SSD algorithm and the TensorFlow APIs, the system is able to recognise objects in real time. Its most notable features are its distance-calculation skills, which include approximate distance computation and wireless feedback based on voice commands. By wirelessly providing voice-based input on the proximity of objects, it streamlines, expedites, and ensures the dependability of the blind user's daily tasks.

1* indukurihemalatha@gmail.com, 2 kkrsrkrit@gmail.com, 3 kavyasreepedalanka107@gmail.com, 4 srijasrinivas20@gmail.com, 5 srujanakothapalli133@gmail.com, 6 priyapatnala26@gmail.com

DOI: 10.1201/9781003529231-37
2. Literature Survey

Developing real-time object detection systems that incorporate voice input is a primary focus of computer vision and human-computer interaction researchers. Zhang et al. examined numerous methodologies and approaches in their 2021 literature survey on sensor-based real-time object identification with aural feedback. Cameras, microphones, and accelerometers were the primary sensors that the authors focused on for object recognition and voice input to the user, and they discussed the problems with current approaches and proposed solutions for future research. The review found that deep learning methods, such as convolutional neural networks, could identify objects and voices with some degree of success. The authors found that voice feedback and real-time object detection had the potential to significantly enhance HCI and open up technology to individuals with disabilities [4]. Research into real-time object recognition with vocal input could be useful in many areas, such as assistive technology, security, and robotics. Wang et al. (2021) conducted a comprehensive literature review that looked into various methods for real-time object detection with voice feedback. After analysing different deep learning models, including YOLOv4 and EfficientDet, the study highlighted the benefits of utilising voice input to enhance the performance of object detection systems. Enhancing the precision and resilience of real-time object recognition algorithms was one of the future aims in the authors' discussion of the challenges and opportunities in this field of study [5]. J. J. Wang, J. H. Kim, and Y. S. Park published an article in 2021 suggesting that the size of objects can be used to estimate their distance using various computer vision methods. The paper explores the most recent advances in size-based distance estimation; stereovision, monocular depth estimation, and light detection and ranging (LiDAR)-based methods are only a few of the various distance measurement approaches covered in detail by the authors [10].

3. Research Design

Our proposed system aims to detect objects and obstacles in the environment to assist visually impaired individuals. The process involves several steps, starting from extracting frames and comparing them with objects in a database to detect items in each frame. Our system is capable of recognizing and locating objects in both photos and videos. An audio file containing information about each detected object is then played; the system therefore addresses both object detection and identification simultaneously.
• The system captures real-time frames and processes them on a laptop-based server.
• The server hosts a pre-trained SSD detection model, trained on the COCO dataset, to recognize the output class with different accuracy metrics. After the testing process, the class of the detected object is translated into default voice messages using voice modules to assist blind individuals.
• Along with object identification, an alert system is implemented that calculates the approximate distance between the object and the blind person.
• The system generates voice-based outputs with distance units to inform the person whether they are close to the object or at a safer distance.

3.1 Video Streaming

Real-time video streaming for object detection for the blind is an important application of object detection technology. Real-time video streaming is the process of transmitting video data over the internet in real time, allowing users to watch the video as it happens. This is achieved by breaking the video into small packets and sending them over the internet in sequential order; as the packets arrive at the receiver, they are reassembled into a continuous video stream.

4. System Modules

There are four modules in our proposed system:
(a) Object Detection
(b) Object Identification
(c) Depth Estimation
(d) Voice Assistance

(a) Object Detection: Object detection is the technique used to determine the location and presence of an object within an image or video stream. It entails identifying things in a picture and placing a bounding box around them to show where they are. Object detection algorithms typically group items into predetermined categories, such as humans, cars, or animals. Object detection is a crucial task in numerous applications, including object tracking, autonomous cars, and surveillance.

(b) Object Identification: On the other hand, the process of identifying the kind or category of the object inside the
bounding box that object detection creates is known as object identification. Stated differently, object identification is the process of determining what the object is, for example, a person or an automobile. Typically, object identification entails applying machine learning or deep learning algorithms to classify the object inside the bounding box. To put it briefly, object identification determines the kind or category of the located items, whereas object detection is the act of finding objects in an image or video stream. These tasks are frequently combined in computer-vision applications such as robotics, autonomous cars, and surveillance systems. Identifying and detecting objects requires a number of steps:

1. Training Data: To learn how to identify and locate various items in a picture, object recognition models need a lot of labeled training data. This training data typically comprises pictures and annotations that describe the position and type of each element depicted in the picture.
2. Feature Extraction: This stage, which entails extracting significant features from the unprocessed picture data, is one of the most crucial in object detection. A common method is to use a convolutional neural network (CNN) that has already been trained on a large quantity of picture data, such as MobileNet, since it can extract relevant information from photos.
3. Feature Fusion: Feature fusion can increase the object detection model's accuracy. To capture both high-level and low-level elements of the image, it integrates the features retrieved from various layers of the CNN.
4. Dimension Reduction: Because the feature maps produced by the CNN are often very large, dimension reduction is employed to minimize the number of features while maintaining the highest level of detail. Principal Component Analysis (PCA) is frequently employed for this purpose.
5. Training the Classifier: After extracting, fusing, and reducing the dimensions of the features, we train a classifier to determine whether each area of the image contains an object of interest. Various machine learning algorithms, such as logistic regression and support vector machines (SVM), can accomplish this task.
6. Object Detection Model: Lastly, the features can be used to train a single-shot detector (SSD) model that identifies objects in the image using the classifier. To predict the likelihood that an object is present in each cell and its offset from the cell center, the SSD model divides the image into a grid of cells. The output of the SSD model is a collection of bounding boxes that show the positions of the identified items in the image.

(c) Depth Estimation: Depth estimation is the process of determining the distance between the detected object and the user. This is a critical component of the system, as it helps provide accurate warnings to the user based on the proximity of the object. Our prototype has been developed to aid individuals with visual impairment by providing them with warning alerts regarding any obstacles in their path. To achieve this, we need the ability to determine the distance between the person and the obstacle in real-time scenarios. When an object is detected, a rectangular box is generated around it. If the object occupies a significant portion of the frame, we use certain constraints to determine an approximate distance between the object and the individual. We use code to identify objects and provide information regarding their location and distance.

(d) Voice Generation Module: Voice generation modules are an essential component of a real-time object detection system with distance and voice alerts for blind people. These modules are responsible for converting warning messages into speech to alert the user to potential obstacles or hazards. Several text-to-speech (TTS) software libraries and frameworks can be used to generate speech in real time. These TTS modules can be integrated into the object detection system to provide real-time voice alerts to the user. The system can use pre-recorded warning messages or generate new messages based on the type and distance of the detected object. The TTS module then converts these messages into speech and plays them back through a speaker or headphones for the user to hear.

Pyttsx3 is a Python module that converts text to speech. It is a straightforward tool: every time an item is detected, the approximate distance is calculated and the corresponding text is displayed on the screen using the cv2 library and the cv2.putText() function. To recognize any text hidden in an image, the system utilizes Python-tesseract, an OCR (Optical Character Recognition) tool that scans and analyzes the image to detect any text content and encode it in a computer-readable format. Once the text is recognized, it is linked to pyttsx3 to generate audio commands as output. For instance, if an object is too close, the system generates a voice warning that says, "Warning: The object (class of object) is too close to you," while a voice saying, "The object is at a safe distance" is generated if the object is at a safe distance. The system makes use of various libraries, such as engine.io, pyttsx3, PyTorch, and pytesseract.
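The sketch below shows, in hedged form, how these pieces can fit together: a bounding-box text overlay, a size-based distance heuristic, and a pyttsx3 voice alert. The threshold and the distance formula are illustrative assumptions, not the chapter's exact constants:

```python
# Sketch: turn a detected bounding box into a spoken proximity warning.
# box is (x, y, w, h) in pixels; frame is a BGR image from the webcam.
import cv2
import pyttsx3

engine = pyttsx3.init()

def approx_distance(box_w, frame_w):
    # Illustrative heuristic: the larger the box relative to the frame,
    # the closer the object. Calibrate the mapping for a real camera.
    return round(1.0 - box_w / frame_w, 1)

def announce(frame, label, box):
    x, y, w, h = box
    dist = approx_distance(w, frame.shape[1])
    msg = (f"Warning: The {label} is very close to the frame"
           if dist < 0.5 else f"The {label} is at a safe distance")
    cv2.putText(frame, f"{label}: {dist} units", (x, y - 10),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    engine.say(msg)        # queue the voice alert
    engine.runAndWait()    # speak it through the default audio device
```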

5. Single Shot Multi-Box Detection (SSD)

The SSD (Single Shot Multi-Box Detection) architecture, a deep-learning-based model, detects objects in images quickly and accurately. It is more efficient and quicker than two-stage detectors since it needs only one pass over the input picture to detect objects; it is a one-stage detector.

5.1 MobileNet

When it comes to embedded vision and mobile applications, MobileNet is a prominent convolutional neural network (CNN) architecture. It is ideal for devices with limited processing power because it is computationally efficient and has a minimal memory footprint. A depth-wise convolution followed by a point-wise convolution forms the basis of the MobileNet design. Unlike point-wise convolution, which combines the output of the depth-wise convolution across all input channels, depth-wise convolution applies a single filter to each input channel independently. A "bottleneck layer", an additional component of the MobileNet architecture, lowers the computational cost of the depth-wise convolution by reducing the number of input channels.

5.2 SSD (Single Shot Detector) + MobileNet

The SSD technique is frequently employed for object detection due to its speed and accuracy. MobileNet, in contrast, is a neural network architecture developed with mobile and embedded devices in mind. Its lightweight design, which prioritises low latency and low power consumption, makes it ideal for real-time applications on devices with limited processing capabilities. The combination of SSD and MobileNet allows for accurate, real-time object identification on embedded and mobile devices.

5.3 Working of Layers in the MobileNet-based SSD Network

The input layer receives the image and resizes it to a predetermined size, typically 300x300. Base network layers: convolutional and pooling layers make up the base network, which is responsible for feature extraction; in a MobileNet-based SSD this is usually a pre-trained MobileNet. The detection network employs the convolutional layers of the base network to retrieve features from its output; to better capture the image's finer details, these layers often use smaller kernel sizes.
• Feature Maps at Multiple Scales: Convolutional layers produce feature maps with varying spatial resolutions as their output. The purpose of these feature maps is to identify items with varying sizes and proportions. For each point in the feature maps, we construct a set of default bounding boxes with varying scales and aspect ratios, and predict the final bounding boxes using these default boxes as anchors.
• Bounding Box Regression Layers: These layers forecast the offsets applied to every default box to generate the final bounding-box predictions.
• Object Class Prediction Layers: These layers predict object class probabilities for each default box. The Non-Maximum Suppression (NMS) layer eliminates overlapping bounding boxes and retains only the most confident predictions.
• The output layer returns the final set of predicted bounding boxes and class probabilities. The MobileNet-based SSD network thus employs a mix of convolutional layers, default boxes, and feature extraction to identify objects in images.

Table 37.1 The architecture of the Single Shot Detector (SSD) with MobileNet as the base feature extractor

Convolution Layer | Size (W x H x D) | No. of Bounding Boxes | Output Size (W x H x D)
Input | 300 x 300 x 3 | – | 300 x 300 x 3
Conv2D 3x3 | 300 x 300 x 32 | – | 300 x 300 x 32
Conv2D 3x3 | 150 x 150 x 64 | – | 150 x 150 x 64
Conv2D 3x3 | 75 x 75 x 128 | – | 75 x 75 x 128
Conv2D 3x3 | 38 x 38 x 256 | – | 38 x 38 x 256
Conv2D 3x3 | 19 x 19 x 512 | – | 19 x 19 x 512
Conv2D 3x3 | 10 x 10 x 512 | – | 10 x 10 x 512
Conv2D 3x3 | 5 x 5 x 512 | – | 5 x 5 x 512
Conv2D 3x3 | 3 x 3 x 512 | – | 3 x 3 x 512
Conv2D 1x1 | 1 x 1 x 512 | 3 | 1 x 1 x 3,072
Conv2D 1x1 | 1 x 1 x 256 | 6 | 1 x 1 x 1,536
Conv2D 1x1 | 1 x 1 x 128 | 6 | 1 x 1 x 768
Conv2D 1x1 | 1 x 1 x 128 | 6 | 1 x 1 x 768
Conv2D 1x1 | 1 x 1 x 128 | 6 | 1 x 1 x 768
Conv2D 1x1 | 1 x 1 x 128 | 6 | 1 x 1 x 768

• Convolution layer: A type of neural network layer that applies a filter to the input data to extract features.
• Size: The dimensions of the input data in width, height, and depth (number of channels).
• Number of bounding boxes: The quantity of pre-made boxes utilised for object detection, each with its own

unique scale and aspect ratio. The convolution operation on the input data produces a feature map whose dimensions determine the output size.
The table details the SSD MobileNet architecture's convolution layers, along with their input and output sizes, the number of bounding boxes generated, and other relevant information. The top row of the table shows the input image to the network, a 300 x 300 x 3 (width x height x depth) image, where depth represents the number of color channels. The rows that follow show each convolutional layer of the network. Each convolution layer applies a set of learnable filters to its input, producing a feature map with lower spatial resolution and more channels, so the feature maps take on different sizes layer after layer. Seven 1x1 convolution layers make up the remaining rows of the table; they are responsible for producing the predicted class scores and bounding-box offsets for each anchor box per feature-map location. Since these layers process feature maps with varying spatial resolutions, the quantity of bounding boxes they produce varies. The final convolution layer's output measures 1 x 1 x 3,072, which corresponds to the predicted class scores and bounding-box offsets for all 8,732 anchor boxes in the network.

6. COCO Dataset

COCO (Common Objects in Context) is a popular large-scale dataset for object detection, segmentation, and captioning applications. With approximately 2.5 million object instances categorized into 80 distinct object categories, it comprises more than 330,000 photos. The COCO dataset's salient characteristics for object detection are:
• Image diversity: The COCO dataset includes images with a wide range of object sizes, shapes, and occlusion levels, as well as images with multiple objects and complex scenes.
• Object categories: The COCO dataset contains 80 object categories, including people, animals, vehicles, and household items.
• Object annotations: Each image in the COCO dataset is annotated with object bounding boxes, segmentation masks, and category labels for all objects present in the image. The annotations are in JSON format.
• Object instance segmentation: In addition to object bounding boxes, the COCO dataset also includes segmentation masks for each object instance, allowing for more precise object localization and segmentation.

Fig. 37.1 COCO dataset examples

7. Loss Function

To find objects in images, object detection uses a loss function that measures how different the model's predicted class labels and bounding boxes are from the real ones. During training, the goal is to decrease this mismatch so that the model can accurately detect objects in images. The localization loss measures how much an object's actual bounding box differs from its predicted bounding box, while the classification loss measures how much an object's actual class label differs from its predicted class label:

L(x, c, l, g) = (1/N) * (L_conf(x, c) + α * L_loc(x, l, g))    (1)

The classification loss (L_conf) in Eq. (1) measures the discrepancy between the predicted and ground-truth class labels for the matched default boxes; it quantifies the model's ability to identify and label the items in a picture. For each set of matched default boxes, the localization loss (L_loc) measures how far the predicted bounding boxes deviate from the ground-truth bounding boxes; by forecasting the coordinates of the bounding boxes, it evaluates the model's accuracy in localizing items in the image. The factor 1/N makes the loss consistent across all matched default boxes. The two losses are added to form a single loss function, and the final loss is back-propagated through the network to update the model's parameters.
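As a small illustrative sketch (not the chapter's training code), Eq. (1) can be evaluated with NumPy, assuming hypothetical per-box class probabilities and box offsets:

```python
# Sketch of the multibox loss in Eq. (1): total = (1/N) * (L_conf + alpha * L_loc).
import numpy as np

def multibox_loss(class_probs, class_targets, loc_pred, loc_true, alpha=1.0):
    n = len(class_targets)                       # N matched default boxes
    # Classification term: cross-entropy over the matched boxes.
    l_conf = -np.sum(np.log(class_probs[np.arange(n), class_targets] + 1e-9))
    # Localization term: smooth-L1 between predicted and true offsets.
    diff = np.abs(loc_pred - loc_true)
    l_loc = np.sum(np.where(diff < 1.0, 0.5 * diff**2, diff - 0.5))
    return (l_conf + alpha * l_loc) / n

# Toy example: 2 matched boxes, 3 classes, 4 box coordinates each.
probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
targets = np.array([0, 1])
pred = np.zeros((2, 4))
true = np.full((2, 4), 0.1)
print(multibox_loss(probs, targets, pred, true))
```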

8. TensorFlow

With TensorFlow, one can build a deep learning model to identify objects in photos and videos. Modern object detection models trained on massive datasets are available via the TensorFlow Object Detection API, including Mask R-CNN, SSD, and Faster R-CNN. The API provides tools for evaluating and deploying trained models and makes it easier to customise models. Applications such as self-driving cars, surveillance, and robotics have led to its widespread adoption across industries.
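A hedged sketch of this inference workflow follows; the model path is a placeholder, and the output keys are the usual SavedModel convention of the TensorFlow Object Detection API, not something specific to this chapter:

```python
# Sketch: run a pre-trained SSD-MobileNet SavedModel on a single frame.
import numpy as np
import tensorflow as tf

detect_fn = tf.saved_model.load("path/to/ssd_mobilenet/saved_model")  # placeholder

frame = np.zeros((300, 300, 3), dtype=np.uint8)              # stand-in webcam frame
input_tensor = tf.convert_to_tensor(frame)[tf.newaxis, ...]  # add batch dimension

detections = detect_fn(input_tensor)
boxes = detections["detection_boxes"][0].numpy()     # normalized [ymin, xmin, ymax, xmax]
scores = detections["detection_scores"][0].numpy()
classes = detections["detection_classes"][0].numpy().astype(int)

for box, score, cls in zip(boxes, scores, classes):
    if score > 0.5:                                  # illustrative confidence threshold
        print(f"class {cls} at {box} (score {score:.2f})")
```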

9. Experiments and Results

Fig. 37.2 Output snapshot: the precision of the object cup is 99%

A caution has been issued due to the extremely close final distance of 0.2 units between the object and the webcam frame: "Warning: The cup is very close to the frame." The system's speech output further confirms the object's identification as a cup.

Fig. 37.3 Output snapshot: the precision of the object remote is 98%

The webcam's frame has issued an alert due to the object's close proximity (0.8 units). The object is recognised as a remote according to the system's voice output, and the warning message reads, "Warning: Remote at a safer distance."

Fig. 37.4 Output snapshot: the precision of the object bed is 96%

A warning has been given since the final distance between the object and the webcam frame is 0.9 units. "Warning: Bed is at a safer distance." is the message that appears once the system's voice output identifies the object as a bed.

Fig. 37.5 Output snapshot: the precision of the object TV is 96%

A warning has been given because the final distance between the object and the webcam frame is 0.8 units. "Warning: TV is at a safer distance." is the message that appears once the system's voice output identifies the object as a TV.

10. Evaluation Metrics

To see how well our system holds up over time, we plotted its accuracy: the X-axis shows time in seconds, while the Y-axis shows the accuracy metric as a percentage.

Fig. 37.6 Output snapshot of accuracy of cup over time

Once the system is fully functional, it can correctly detect and name more than 90 objects. Additionally, the model signals the approaching proximity of an object by means of an auditory reaction and approximates the distance between the two.
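A minimal sketch of how such a plot can be produced (the accuracy values here are made-up placeholders standing in for the logged measurements):

```python
# Sketch: plot detection accuracy (percent) against elapsed time (seconds).
import matplotlib.pyplot as plt

seconds = [0, 5, 10, 15, 20, 25, 30]
accuracy = [0, 72, 85, 91, 93, 92, 94]   # hypothetical logged values

plt.plot(seconds, accuracy, marker="o")
plt.xlabel("Time (seconds)")
plt.ylabel("Accuracy (%)")
plt.title("Detection accuracy of an object over time")
plt.ylim(0, 100)
plt.show()
```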

of software components, while hardware components like


cameras or webcams are examples of software components.
Object detection, distance estimation, data collection, model
training, and voice alarms are all part of the process. To
train object identification models and reliably recognise and
locate things in the picture, one can utilise machine learning
algorithms like convolutional neural networks (SSD with
Mobilenet). The technology is able to give users crucial
information about their environments and themselves by
integrating object detection, distance estimation, and audio
Fig. 37.7 Output snapshot of accuracy of object remote over alarms. People should be made aware of such dangers. The
time proposed system can greatly benefit people who are visually
challenged in terms of their independence and quality of life.

12. Future Work


There is a lot of potential for improvement in the future
when it comes to real-time object recognition with distance
and auditory alerts for visually impaired individuals. To
start, the framework can be enhanced by incorporating a
larger dataset that can identify a broader range of indoor and
outdoor objects. Individuals with vision impairments may
find their way around more easily with this. Incorporating
additional blind-friendly features can upgrade the system
to a two-way interaction system. The user can be informed
Fig. 37.8 Output snapshot of accuracy of object bed over about the object’s colour, distance from them, and other
time qualities through the design of features. To top it all off, the
model can be trained to recognise certain friends and family
members’ faces, which means no more miscommunication.
Family members can access the user’s position and find them
when needed thanks to the system’s Wi-Fi and GPS features.
Incorporating the capability to detect the amount of money
held by the user can further aid in the prevention of theft.
There are a lot of open-ended possibilities for improving and
modifying the framework to accomplish our overarching goal
of making blind people’s daily lives easier.

References
1. Harish Adusumalli, D. Kalyani, R. Krishna Sri, M. Pratapteja,
Fig. 37.9 Output snapshot of accuracy of object TV over time P V R D Prasada Rao “Face Mask Detection Using OpenCV”.
In IEEE, 2021.
11. Conclusion 2. Ayushi Sharma, Jyotsna Pathak, Muskan Prakash, J N Singh,
“Object Detection using OpenCV and Python”. In IEEE, 09
Finally, a technological advancement that can help the March 2022.
visually impaired immensely is a blind person assistant that 3. P Viola and M Jones, “Rapid object detection using a boosted
uses real-time object-detection-with-distance and audio cascade of simple features”, Proceedings of the 2001 IEEE
alarms. Machine learning methods and computer vision Computer Society Conference on Computer Vision and
techniques are used by the system to detect and localise Pattern Recognition (CVPR 2001), December 8-14, 2001.
objects in the user’s surroundings. The system issues 4. Dr. S.V. Viraktamath, Madhuri Yavagal, Rachita Byahatti,
“Object Detection and Classification using YOLOv3”.
audible warnings when it detects possible barriers.Object
InIJERT, February-2021
detection libraries and text-to-speech software are examples

5. Zhang, Y., Chen, Y., Zhang, Y., & Zhang, Y. (2021). Real-time object detection with voice feedback using sensors: A literature review. Sensors, 21(3), 777.
6. Abdelrahman Abdou, Sherif Abdelazeem, and Mahmoud Refaat (2021). https://www.mdpi.com/2076-3417/11/16/7342.
7. Shreyas N Srivatsa, Amruth, Sreevathsa, Vinay, Mr. Elaiyaraja, "Object Detection using Deep Learning with OpenCV and Python". In IRJET, Jan 2021.
8. Priyal Jawale, Hitiksha Patel, Nivedita Rajput, Prof. Sanjay Pawar, "Real-Time Object Detection using TensorFlow". In IRJET, Aug 2020.
9. K. Vijiyakumar, K. Ajitha, A. Alexia, S. Madhumitha (2020): Object detection using SSD.
10. Chia-Hung Yeh, Chu-Han Lin, Li-Wei Kang, Chih-Hsiang Huang, Min-Hui Lin, Chuan-Yu Chang, Chua-Chin Wang, "Lightweight Deep Neural Network for Joint Learning of Underwater Object Detection and Color Conversion", 2021.
11. Martin Stancel, Branislav Mados, Martin Chovanec, Peter Balaz, "Hybrid Object Detection Using Domain-specific Datasets", 2021.
12. Manuel G. Forero, Julián Ávila-Navarro, and Sergio Herrera-Rivera, "New Method for Extreme Color Detection in Images", 2020.
13. Yacine Messai, Kheireddine Chara, Fawzi Srairi, "Object Tracking Platform for Color Object Detection using Genetic Algorithm Optimization", 2020.
14. K. Vijiyakumar, K. Ajitha, A. Alexia, M. Hemalashmi, S. Madhumitha, "Object Detection For Visually Impaired People Using SSD Algorithm", 2020.
15. Zhang Qian, Liu Xiao-jun, "Video Image Fire Recognition Based on Color Space and Moving Object Detection", 2020.
16. Sunit Vaidya, Naisha Shah, Niti Shah, Prof. Radha Shankarmani, "Real-Time Object Detection for Visually Challenged People", 2020.
17. Hao Shi, Qi Peng, Jiachen Yang, Xudong Bai, Yiqi Zhuang, "A Practical ROI and Object Detection Method for Vision Robot", 2020.
18. Mr. Akshay Wankhade, Prof. Pramila M. Chawan, "Design and Deployment of an Online Shopping Portal for the Color Blind People", 2019.
19. Ashwani Kumar, S S Sai Satyanarayana Reddy, Vivek Kulkarni, "An Object Detection Technique For Blind People in Real-Time Using Deep Neural Network", 2019.

Note: All the figures and the table in this chapter were designed by the author.
38. Standard Encryption Methodologies to Process Multi-Modality Medical Images for Diagnosing in Telemedicine

P. Shyamala Madhuri*, B. Amutha


Department of Computing Technologies, School of Computing, College of Engineering and Technology,
SRM Institute of Science and Technology, Kattankulathur, Chennai, Tamilnadu, India
D. J. Nagendrakumar
Department of Information Technology, Vishnu Institute of Technology,
Bhimavaram, Andhra Pradesh, India-504 202

Abstract: Telemedicine has revolutionized the healthcare industry by enabling remote diagnosis and therapy and by utilizing image digitization in the medical field. However, ensuring the security of multi-modality medical images in telemedicine presents unique challenges. This study uses different encryption standards as the underlying security mechanism to protect multi-modality medical images. The use of modular encryption standards can effectively ensure the security, confidentiality, and integrity of transmitted multi-modality images. To prevent unauthorized persons from accessing encrypted modules, it is also important to ensure that sensitive multi-modality medical images can only be viewed and edited by authorized users. The main objective of this work is to briefly encapsulate and evaluate the different algorithms inherent in each methodology, with a focus on the various access control mechanisms that ensure multi-modality medical image privacy in telemedicine, thereby minimizing the risk of unauthorized disclosure or manipulation.
Keywords: Telemedicine, Multi-modality medical images, Security, Modular encryption standards, Confidentiality, Integrity, Access control, Advanced encryption algorithms, Machine learning

1. Introduction

The digitization of medical images in healthcare has significantly enhanced providers' capabilities to diagnose and treat patients remotely through telemedicine [1]. This has led to improved access to healthcare, lowered expenses, and enhanced results for patients. However, the security of medical images in telemedicine remains a noteworthy concern that emerges as a result of the sensitive nature of patient information [2]. Various security measures have been implemented to address this concern, such as encryption, secure data transfer protocols, and access control mechanisms. However, challenges still exist in the standardization of security protocols and the development of more advanced security measures to protect against evolving threats [3].

Research studies have explored different aspects of the impact of digitization and medical image security in telemedicine, including:
1. The effectiveness of various security measures in protecting medical images in telemedicine.
2. The challenges and opportunities for developing and implementing effective security strategies in telemedicine.
3. The impact of telemedicine on healthcare access, quality, and costs.

"Multi-modality medical images" [4] refer to a set of imaging data acquired from different imaging modalities or techniques to provide an extensive analysis of a patient's condition [5].

*Corresponding author: sp6331@srmist.edu.in

DOI: 10.1201/9781003529231-38

In medical diagnostics and treatment planning, different imaging modalities offer unique information about various aspects of the human body; by delving into these intricacies, healthcare professionals can gain a more comprehensive understanding of a patient's well-being.
Here are some commonly used imaging modalities in the arena of medical imaging:
(a) X-ray: X-ray imaging uses ionizing radiation to produce images of bones, tissues, and organs. It is commonly used for examining fractures, detecting abnormalities in the chest, and evaluating dental conditions [6].
(b) Computed Tomography (CT): CT scans employ a combination of X-ray imaging and advanced computer processing methods to produce precise cross-sectional representations of the body. CT scans are useful for detecting tumors, evaluating injuries, and providing precise anatomical information [7].
(c) Magnetic Resonance Imaging (MRI): MRI employs a robust magnetic field and radio waves to produce intricate visuals of the internal structures of the body. It is particularly adept at capturing precise images of soft tissues such as the brain, muscles, and organs, and is commonly used for neurological, musculoskeletal, and abdominal imaging.
(d) Ultrasound: Ultrasound imaging utilizes high-frequency sound waves to create real-time visuals of organs and tissues, providing dynamic imaging capabilities. It is frequently used for examining the abdomen, monitoring pregnancies, and guiding minimally invasive procedures [8].
(e) Positron Emission Tomography (PET): PET scans necessitate the introduction of a small quantity of radioactive material into the body via injection. The emitted positrons are detected, enabling the creation of images that show the body's metabolic activity. PET scans are valuable in oncology for detecting tumors and assessing treatment response.
(f) Single-Photon Emission Computed Tomography (SPECT): SPECT imaging uses injected radioactive tracers and specialized cameras to produce 3D images of organ function. It is often used in cardiology and neurology to assess blood flow, diagnose certain diseases, and evaluate organ function [9].
By combining information from multiple imaging modalities, healthcare professionals can obtain a more comprehensive and detailed understanding of a patient's condition. For example, a combination of CT and PET scans, known as PET-CT [10], allows abnormal metabolic activity seen in PET scans to be localized within the anatomical context provided by CT scans.
Multi-modality medical images hold a significant role in ensuring precise diagnosis coupled with effective healing, treatment planning, and monitoring of various medical conditions, providing a comprehensive view of the patient's health and facilitating informed decision-making by healthcare professionals.
The security of health information (HI) is an ongoing process that requires constant review and adaptation to keep up with changes in healthcare environments and technology. Small healthcare centers face challenges in recognizing threats and securing HI. This research aims to help healthcare practices prepare for these challenges and provide suitable security approaches through effective risk assessment [15].

Table 38.1 Comparison of multi-modality diagnostic imaging

Modality | Characteristics | Advantages
X-ray | Detects features and abnormalities in bone positions. | Detects fractures and abnormalities in bones.
CT | Offers comprehensive information about dense structures, particularly bones; exceptional for clearly outlining skeletal features [11]. | Concise scans with superior spatial imaging precision.
MRI | Reveals details about abnormal soft tissues; widely used in confidential clinical settings for medical examinations. | Enhanced resolution showcasing anatomical intricacies.
PET | Scans the brain and provides functional information; enables recording variations in normal brain activity and symptoms of various diseases. | High sensitivity and high penetration depth [14].
SPECT | A non-intrusive approach involving capturing cross-sectional images using a radiotracer [12], revealing the organization inside the human body. | High sensitivity and high penetration depth.
Ultrasound | Utilizes high-frequency sound waves for diagnostic information, offering both qualitative and quantitative insights [13]. | High spatial resolution and low cost.

Digitization is a potential approach that can provide versatile electronic services and may be useful in monitoring the healthcare space, offering new services and facilities for patients and caregivers. Therefore, greater care must be taken to protect medical picture transfers over public networks. Cryptography, steganography, and watermarking are common methods used in medical picture security. To address the security challenges associated with the digitization of medical images in telemedicine, various techniques are used, including:
Encryption: Utilizes encryption algorithms to obfuscate medical images, permitting exclusive access via authorized decryption keys and ensuring stringent patient data confidentiality [16].
Secure data transfer protocols: This technique involves the use of secure communication channels to transmit medical images between healthcare providers, ensuring that they are not intercepted or accessed by unauthorized individuals.
Access control mechanisms: This technique involves controlling access to medical images based on the [17] user's role, credentials, and need-to-know basis. This measure guarantees that only individuals with authorization can obtain medical images, and that the images are strictly utilized for their intended purpose.

Cloud service providers are frequently relied upon for the outsourcing and storage of health records, which raises security and privacy concerns. The utilization of smart technology, such as cell phones and laptops, is booming, allowing users to gain access to a multitude of information across numerous organizations using adaptive programs such as Google and iPhone apps. A "cloud environment" (CE) is the coordination of dispersed computing using mobile devices. Although CE can provide major benefits such as extended battery life and increased storage capacity, its adaptability, flexibility, and security issues remain important impediments [18]. Health information security (HIS) is an iterative technique that undergoes innovative adjustments as medical care contexts evolve. It is crucial to evaluate the efficacy and applicability of HIS security systems and procedures in light of new developments. To ensure HIS security, a thorough risk assessment and the implementation of appropriate security measures are imperative [19].

Fig. 38.1 Types of attacks on medical images

The digitization of healthcare has the potential to enhance healthcare outcomes and decrease costs by allowing for efficient and rapid handling of diverse data. This is made possible through efficient data storage and organization, which relies on the essential components of data warehouses and cloud-based data management technologies. However, while big data can yield valuable insights, it is crucial to have the appropriate IT infrastructure, visualization techniques, and user interfaces in place. Thus, there is a need to modify existing procedures and regulations concerning database use, data access, sharing, privacy, and sustainability to maximize the benefits of big data in healthcare [20].

Effective healthcare information (HI) security requires a comprehensive approach that integrates confidentiality, privacy, and security safeguards. This involves identifying and classifying HI data based on its sensitivity and potential risks, such as patient identifiers, medical records, and financial information [21]. Appropriate technical solutions, including encryption and access controls, must be implemented to protect HI data in transit or at rest [22]. The MES algorithm provides a multi-tiered and modular security strategy implemented to safeguard healthcare records stored in the cloud environment, involving entropy-based key generation [23], compression and extension of records [24], and multi-cloud-based storage.
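As a minimal, hedged illustration of the encryption measure listed above, the sketch below uses the symmetric Fernet scheme from the Python cryptography package as a stand-in for whichever encryption standard a deployment actually adopts; the file name is hypothetical:

```python
# Sketch: encrypt and decrypt a medical image file with a symmetric key.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # in practice, issued by an access-control system
cipher = Fernet(key)

with open("scan.dcm", "rb") as f:    # hypothetical DICOM file
    plain = f.read()

token = cipher.encrypt(plain)        # safe to transmit over a public network
restored = cipher.decrypt(token)     # only key holders can recover the image
assert restored == plain
```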

Table 38.2 Types of attacks

Category | Attack | Description
Adversarial Noise Reduction | Rank Filter | Non-linear filtering techniques to effectively eliminate noise.
Adversarial Noise Reduction | Resizing | Linearly scales images by either reducing or enlarging their size.
Adversarial Noise Reduction | JPEG Encoding | A common image compression technique using region-based segmentation.
Spatial Attacks | Skewing | A transformation that shifts a specific area of an image in a different direction.
Spatial Attacks | Angular transformation | A rotation of an image around a central point in a circular motion.
Spatial Attacks | Clipping | The elimination of undesired areas from an image.
Noise Injection | White noise | Statistical noise with a Gaussian distribution and probability density function.
Noise Injection | Impulse noise | A form of noise that introduces white and black pixels into images.
Image Tampering | Histogram Normalization | Enhances image contrast through an image processing technique.

2. Literature Available on Encryption Techniques

In the domain of Medical Image Classification and Medical Image Security, numerous subsidiary studies have been undertaken. These studies collectively aimed to review the existing research landscape and traditional investigations, which has become increasingly important due to the rise in significant studies in this field.

The surveyed works address algorithms designed to safeguard the confidentiality of medical images during storage and transmission. They are organized into distinct groupings based on security methodology, encompassing encryption, secret sharing, and image concealment techniques, all directed towards ensuring the confidentiality and management of medical images. These works include:
1. Conventional algorithms for encrypting medical images.
2. Encryption of medical images using chaotic maps.
3. Multiple algorithms that encrypt medical images for security.
4. Search-preserving encryption algorithms for medical images.
5. Cryptanalysis of medical images.

1. Conventional algorithms for encrypting medical images: these are categorized based on various security methods, including encryption algorithms, secret sharing algorithms, and hiding algorithms for medical images.

2. Encryption of medical images using chaotic maps: chaotic maps have become a prominent and widely utilized area of research in medical image encryption. Among the papers dedicated to medical image encryption, approximately two-thirds (44 papers) employed chaotic technologies, demonstrating their significant presence [32].

3. Multiple algorithms that encrypt medical images for security: in addition to classical algorithms and hybrid encryption using traditional and chaotic maps, this grouping also includes the exploration of alternative encryption methods for medical images.

4. Search-preserving encryption algorithms for medical images: the effective and accurate searching of medical image data without decryption, while preserving privacy, can be achieved through the combination of homomorphic encryption and feature vectors. This approach enables secure retrieval of relevant information while maintaining the confidentiality of sensitive data.
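As a toy, hedged sketch of the chaotic-map approach in item 2 above, a logistic map can generate a keystream that is XORed with the pixel values; published schemes add permutation and multi-round diffusion stages on top of this:

```python
# Sketch: logistic-map keystream XORed with image pixels (toy example only).
import numpy as np

def logistic_keystream(n, x0=0.63, r=3.99):
    # Iterate x_{k+1} = r * x_k * (1 - x_k) and quantize each state to a byte.
    ks, x = np.empty(n, dtype=np.uint8), x0
    for i in range(n):
        x = r * x * (1 - x)
        ks[i] = int(x * 256) % 256
    return ks

image = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)  # stand-in image
ks = logistic_keystream(image.size).reshape(image.shape)

cipher = image ^ ks          # encryption: diffusion via the chaotic keystream
restored = cipher ^ ks       # decryption with the same secret (x0, r) parameters
assert np.array_equal(restored, image)
```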

Fig. 38.2 Literature taxonomy: medical image confidentiality technology

Table 38.3 Conventional algorithms for encrypting medical images

Classical Encryption Algorithm | Encryption Technique
Upgraded RSA algorithm | The proposition suggested employing a fixed Mersenne prime for the encryption of medical ultrasound images [26].
Rijndael cipher | Simultaneously encrypting multiple medical images [27].
Algorithm based on transcendental numbers | Confusion is introduced using the Fermat expansion of an irrational number, such as PI, while diffusion is accomplished through the XOR operation [28].
Cryptography based on elliptic curves (ECC) | The header section of the file was encrypted using the Extended Tiny Encryption Algorithm (XTEA), while the image section of the file was encrypted using the same algorithm [29].
GGH, a straightforward public-key cryptographic system | The Closest Vector Problem (CVP) was utilized for encrypting medical images [30].
Two DICOM encryption algorithms combined with ECC, AES, and the Whirlpool hash function | Used to encrypt both the file's header and image segments [31].
Modified Vigenère cipher | The DICOM file was partitioned into separate sections, the header and the image parts, and both sections were encrypted using the modified Vigenère cipher.

Table 38.4 Encryption of medical images using chaotic maps

Reference Id | Chaotic Maps Used | Encryption Method
[33] | 2D Zaslavsky, 2D Logistic | Live wireless capsule endoscopy encryption
[34] | Extended Bimodal Logistic function | Stream cipher applied to encrypt medical images
[35, 36] | 4D hyperbolic Sine, 2D Sine Logistic modulation | Medical image cryptosystem with verified chaotic properties
[37] | Logistic-sine | Software platform: Mod operation; hardware platform: XOR operation
[38] | Combination of sine and cosine functions | Chaotic economic map optimization for medical image encryption
[39] | Logistic | Medical image encryption in 16x16 blocks
[40] | Logistic, Kent | Stream cipher technology for medical image encryption
[41] | Default double-sine | Medical image bit-plane encryption
[42] | Logistic | Medical image encryption with classical permutation-diffusion architecture
[43] | Three different Logistic maps | Three-step permutation-diffusion-permutation structure for medical image encryption
[44] | Three different Logistic maps | Permutation-diffusion-permutation structure for medical image encryption
[45] | 2D Arnold cat, 1D Logistic, 2D Henon | Combined chaotic systems for medical image encryption
[46] | 3D Intertwining Logistic, Logistic-sine, enhanced chaotic economic, time-delay chaotic | Chaotic maps used individually for medical image encryption
[47] | DNA-based computing and dynamical systems | Indestructible cryptographic system leveraging DNA computing
[48], [49] | Chaotic system with S-box | Image encryption scheme combining a chaotic system and an S-box
[50] | 2D Logistic | Combined with the RC5 algorithm for encryption
[51] | Chaotic system based on the Mersenne Twister algorithm | The combination of ECC and ElGamal encryption
[52] | Hash function (512 bits) | Chaotic parameter and initial value generation
[53] | SHA-256 hash function | Initial parameter generation for the Logistic-Tent system
[54] | Arnold's cat map transformation | The integration of ECC and ElGamal encryption techniques
[55] | Henon map | Combined with number theory for encryption
[56] | Skew Tent Map alongside matrix cyclic shift operations | Integrated for the purpose of encryption
[57] | 2D logistic coupled map and a linear congruential generator | Unified for the purpose of encryption
[58] | Galois field multiplication | Improvement of the coupled hyper-chaotic Chen 6D system
[59] | Genetic algorithms | Integration of artificial intelligence (AI) technology and chaotic systems for encrypting medical images
[60] | Neural network | The amalgamation of chaotic systems and AI technology for securing medical image data
[61] | 3D Cellular automata (CA) | Combining chaotic systems and AI technology for the encryption of medical images
[62] | 2D Chaotic map | Encryption, information entropy calculation, and parameter optimization
[63] | Logistic, Tent | Confusion and diffusion using chaotic maps, optimized with grasshopper optimization
[64] | Quantum encryption framework based on Gray code integrated with chaos maps | Quantum encryption with chaotic maps for medical images

Table 38.5 Encryption methods, optimization methods, and features

Reference | Encryption Method | Optimization Method/Features
[65] | Blowfish algorithm | OFP Signcryption: key optimization
[66] | Cosine number transform | CosineCrypt
[67] | NDP Structure | NICSP Permutation
[68] | Full, Middle-full, Selective modes | Three modes for medical image encryption: full, middle-full, selective

Reference Encryption Method Optimization Method/ Reference Secret Sharing Proposed Scheme
Features Scheme
[69] Quaternion Feistel Improved encryption of [79] Shamir secret A cloud storage scheme
structure medical images using sharing scheme that utilizes secret sharing
Quaternion Feistel techniques.
structure [80] Visual Halftone visual cryptography
cryptography based on Region of non-interest
[70] Optical encryption TrichromaticCrypt
(RONI)
method
[81] N/A Medical image segmentation
[71] Genetic Algorithms-b GeneticGuard method with endpoint
ased approach coordinates reconstruction

Table 38.6 Encryption methods and retrieval methods/


technologies 3. Evaluation Performance Analysis
Reference Encryption Method Retrieval Method/Technology The evaluation and analysis of the concept described above
[72] RSA Feature vectors for retrieving involve assessing the effectiveness and performance of the
encrypted medical images proposed cryptographic techniques for securing multi-
[73] Blockchain Feature vectors for retrieving
modality medical images in telemedicine. Key aspects
block chain-protected medical to consider include the security features, computational
images complexity, resistance against attacks, and comparison
[74] Homomorphic Homomorphic encryption with existing methods. The evaluation and analysis of the
encryption using method using Public-key with concepts presented in the aforementioned studies will offer
Public-key proof of homomorphism valuable perspectives into the effectiveness, performance,
[75] Homomorphic Direct extraction of and suitability of the proposed cryptographic techniques
encryption watermarks from cipher for enhancing the safeguarding of multi-modality medical
images without the need for images in telemedicine.
decryption.

Table 38.9 Performance metrics


5. Cryptanalysis of medical images
Measures Formula Optimum Value
Cryptanalysis strengthens medical image encryption PSNR PSNR is a metric that gauges High as possible
by identifying and addressing vulnerabilities in diverse how closely a watermarked
algorithms. image resembles the original.
A higher PSNR value indicates
a stronger similarity between
Table 38.7 Algorithms being attacked, the attack methods
the two images, implying
employed, and their results better image quality PS NR =
Reference Attack Method Results 10 * log 10((MAX^2) / MSE)
MSE Mean Square Error: Range from 0 to
[76] Chosen plain image Ineffective resistance against
MSE = (1 / (M * N)) 1 Ideally =0 This
attack differential attack
* Σ[Σ(I_original(x, y) - value means the
[77] Chosen plain image Complete recovery of original I_watermarked(x, y))^2] two images are
attack image from encrypted image identical.
6. Secret sharing algorithms in Medical Imagery

There are two distinct subcategories within medical image secret sharing algorithms.

Table 38.8 Type of secret sharing scheme used, and the proposed schemes

Reference | Secret Sharing Scheme | Proposed Scheme
[78] | Shamir secret sharing scheme | A scheme called XOR-based continuous-tone multi secret sharing is employed for storing and forwarding medical images
[79] | Shamir secret sharing scheme | A cloud storage scheme that utilizes secret sharing techniques
[80] | Visual cryptography | Halftone visual cryptography based on Region of Non-Interest (RONI)
[81] | N/A | Medical image segmentation method with endpoint coordinates reconstruction
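To make the (k, n)-threshold idea behind the Shamir-based entries in Table 38.8 concrete, here is a minimal sketch that splits a single byte, for instance one pixel of a medical image, into five shares of which any three reconstruct it. The prime field, threshold, and share count are illustrative choices, not parameters taken from the cited schemes.

```python
import random

P = 257  # smallest prime above 255, so any 8-bit pixel value fits in the field

def make_shares(secret, k, n):
    """Split `secret` into n shares; any k of them suffice to reconstruct it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the polynomial's constant term."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        secret = (secret + yj * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return secret

pixel = 173                           # one pixel value to protect
shares = make_shares(pixel, k=3, n=5)
print(reconstruct(shares[:3]))        # any 3 of the 5 shares -> 173
```

Applied pixel-wise or block-wise, with shares distributed across independent stores, this is the general pattern that secret-sharing-based cloud storage schemes build on.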
Table 38.9 Performance metrics

Measures | Formula | Optimum Value
PSNR | PSNR gauges how closely a watermarked image resembles the original; a higher PSNR value indicates a stronger similarity between the two images, implying better image quality. PSNR = 10 * log10(MAX^2 / MSE) | As high as possible
MSE | Mean Square Error: MSE = (1 / (M * N)) * ΣΣ (I_original(x, y) - I_watermarked(x, y))^2 | Range 0 to 1; ideally 0, meaning the two images are identical
NC | NC is used in calculating the similarity between the extracted and the original watermark; the coefficient ranges between 0 and 1. NC = Σ (X(i) - μX) * (Y(i) - μY) / (sX * sY) | Ideally NC = 1, but 0.7 is acceptable
NPCR | NPCR quantifies the percentage of distinct pixel values between the plain and encrypted images. NPCR = (number of pixels that change) / (total number of pixels) | Range 0 to 100; ideally 100
UACI | Unified Average Changing Intensity: the average intensity of differences between the plain image and the encrypted image. UACI = (1 / N) * Σ |X(i, j) - Y(i, j)| | Range 0 to 100; ideally 100
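As a concrete companion to the formulas in Table 38.9, the sketch below computes NPCR, UACI, and PSNR for a pair of 8-bit grayscale images with NumPy. The random arrays stand in for a cipher-image pair from any of the surveyed schemes, and the UACI shown uses the common convention of normalizing pixel differences by 255.

```python
import numpy as np

def npcr(c1, c2):
    """Number of Pixels Change Rate: percentage of positions whose values differ."""
    return 100.0 * np.mean(c1 != c2)

def uaci(c1, c2, levels=255):
    """Unified Average Changing Intensity between two cipher images."""
    return 100.0 * np.mean(np.abs(c1.astype(float) - c2.astype(float)) / levels)

def psnr(original, processed, max_val=255):
    """Peak Signal-to-Noise Ratio; higher means closer to the original image."""
    mse = np.mean((original.astype(float) - processed.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
c1 = rng.integers(0, 256, (256, 256), dtype=np.uint8)  # cipher of a plain image
c2 = rng.integers(0, 256, (256, 256), dtype=np.uint8)  # cipher after a small plaintext change
print(f"NPCR = {npcr(c1, c2):.2f}%  UACI = {uaci(c1, c2):.2f}%  PSNR = {psnr(c1, c2):.2f} dB")
```

For two independent uniformly random cipher images, NPCR comes out near 99.6%, which is why values of that magnitude in Table 38.10 indicate strong sensitivity to plaintext changes.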

Table 38.10 Evaluation & best performance analysis

S. No | Reference | Category type | Evaluation Metrics
1 | [28] | Conventional algorithms for encrypting medical images | NPCR = 99.609%, UACI = 33.002%, ET = 0.1370, DT = 0.1372
2 | [33] | Encryption of medical images using chaotic maps | NPCR = 99.6309%, UACI = 33.465%, ET = 0.950, DT = 0.96
 | [37] | | NPCR = 99.9985%, UACI = 33.3338%, MIE BX(ET) = 0.4656, MIE BX(DT) = 0.4697, MIE MA(ET) = 0.1065, MIE MA(DT) = 0.1057
 | [42] | | NPCR = 99.9984%, UACI = 32.7396%, ET = 0.023277, DT = 0.022404
 | [65] | Multiple algorithms encrypt medical images for security | NPCR = 99.43%, UACI = 33.7%, ET = 5.95, DT = 6.8
 | [69] | | NPCR = 99.61%, UACI = 33.46%, ET = 0.247, DT = 0.3328

Number of Pixels Change Rate (NPCR), Unified Average Changing Intensity (UACI), ET (Encryption Time), DT (Decryption Time).

In specific situations, healthcare image security systems consider the potential for data deterioration during the process of storing and transmitting information, particularly for multi-modality images. Robustness evaluation may involve subjecting the proposed schemes to noise attacks or cropping attacks to test their resilience in the face of such challenges, while preserving the integrity of information across different modalities. However, it is important to note that not all medical image confidentiality schemes require robustness testing, as some prioritize the precise restoration of multi-modality medical images rather than resistance to attacks.
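As an illustration of the robustness checks described above, the following sketch applies a salt-and-pepper noise attack and a cropping attack to an image and quantifies the damage with PSNR, as defined in Table 38.9. The noise density and crop size are arbitrary illustrative values.

```python
import numpy as np

def salt_pepper(img, density=0.05, seed=1):
    """Flip a fraction of pixels to pure black or white (noise attack)."""
    rng = np.random.default_rng(seed)
    noisy = img.copy()
    mask = rng.random(img.shape) < density
    noisy[mask] = rng.choice([0, 255], size=int(mask.sum()))
    return noisy

def crop_attack(img, size=64):
    """Zero out a square corner block, simulating loss of a region in transit."""
    attacked = img.copy()
    attacked[:size, :size] = 0
    return attacked

def psnr(a, b, max_val=255):
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(max_val ** 2 / mse)

img = np.full((256, 256), 128, dtype=np.uint8)   # stand-in for a recovered image
print(f"noise attack PSNR: {psnr(img, salt_pepper(img)):.2f} dB")
print(f"crop attack  PSNR: {psnr(img, crop_attack(img)):.2f} dB")
```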
4. Conclusion

The realm of medical image confidentiality technology has experienced a significant upsurge in research publications in recent years. Based on the provided information, the medical picture encryption technique based on a chaotic map shows the highest rate of improvement among the given references for medical image encryption. However, it is important to acknowledge that certain aspects, such as related descriptions and limitations, still lack clarity and require further elucidation. Therefore, there is a need for continued investigation and research to address these gaps and enhance our comprehension of medical image confidentiality technologies. Contemporary medical image confidentiality methods still have drawbacks, particularly when it comes to multi-modality security; these limitations include prolonged processing times and potential security vulnerabilities, which may manifest in systematic or intricate ways. Attention models have emerged as a promising approach in multi-modality medical image analysis. By selectively focusing on informative regions or features from different imaging modalities, attention models can enhance diagnostic accuracy and reliability. They have the potential to overcome challenges associated with noise and low-quality images by reducing the impact of irrelevant information. To further advance the field, future research should focus on refining and optimizing keys in blockchain-based attention models for multi-modality medical image analysis. By harnessing the power of blockchain and deep learning attention-based approaches, the field of multi-modality medical image analysis can contribute to improved healthcare delivery, patient outcomes, and the overall advancement of medical research.

References

1. M. Paul, L. A. Maglaras, Mohamed Amine Ferrag, and Iman Almomani, “Digitization of healthcare sector: A study on privacy and security concerns,” Feb. 2023, doi: https://doi.org/10.1016/j.icte.2023.02.007.
2. M. Magdy, K. M. Hosny, N. I. Ghali, and Said Ghoniemy, “Security of medical images for telemedicine: a systematic review,” vol. 81, no. 18, pp. 25101–25145, Mar. 2022, doi: https://doi.org/10.1007/s11042-022-11956-7.
3. A. I. Stoumpos, Fotis Kitsios, and M. A. Talias, “Digital Transformation in Healthcare: Technology Acceptance and Its Applications,” vol. 20, no. 4, pp. 3407–3407, Feb. 2023, doi: https://doi.org/10.3390/ijerph20043407.
4. W. Tan, Joel, Hari Mohan Pandey, C. Moreira, and A. K. Jaiswal, “Multimodal medical image fusion algorithm in the era of big data,” Jul. 2020, doi: https://doi.org/10.1007/s00521-020-05173-2.

5. S. Hussain et al., “Modern Diagnostic Imaging Technique Applications and Risk Factors in the Medical Field: A Review,” vol. 2022, pp. 1–19, Jun. 2022, doi: https://doi.org/10.1155/2022/5164970.
6. J. E. Ottenhoff, C. Thom, M. Kongkatong, M. Hewitt, and J. Phillips, “A Narrative Review of the Uses of Ultrasound in the Evaluation, Analgesia, and Treatment of Distal Forearm Fractures,” Journal of Emergency Medicine, vol. 63, no. 6, pp. 755–765, Dec. 2022.
7. M. Shyni and Chitra E, “A comparative study of X-ray and CT images in COVID-19 detection using image processing and deep learning techniques,” vol. 2, pp. 100054–100054, Jan. 2022, ScienceDirect.
8. C. K. Sen, Subhadip Ghatak, Surya Gnyawali, S. Roy, and G. M. Gordillo, “Cutaneous Imaging Technologies in Acute Burn and Chronic Wound Care,” vol. 138, pp. 119S–128S, Sep. 2016, doi: https://doi.org/10.1097/prs.0000000000002654.
9. Gianina Crișan, G. Andrieș, Calin Cainap, and Vasile Chiș, “Radiopharmaceuticals for PET and SPECT Imaging: A Literature Review over the Last Decade,” vol. 23, no. 9, pp. 5023–5023, Apr. 2022, doi: https://doi.org/10.3390/ijms23095023.
10. L. K. Griffeth, “Use of PET/CT Scanning in Cancer Patients: Technical and Practical Considerations,” vol. 18, no. 4, pp. 321–330, Oct. 2005, doi: https://doi.org/10.1080/08998280.2005.11928089.
11. Y. Gong, “Decompose X-ray Images for Bone and Soft Tissue,” arXiv.org, 2020. https://arxiv.org/abs/2007.14510 (accessed Jul. 06, 2023).
12. Haubner, “Radiotracer-based strategies to image angiogenesis,” The Quarterly Journal of Nuclear Medicine: official publication of the Italian Association of Nuclear Medicine (AIMN) [and] the International Association of Radiopharmacology (IAR), vol. 47, no. 3, 2015. https://pubmed.ncbi.nlm.nih.gov/12897710
13. M. Dietzel et al., “Fusion of dynamic contrast-enhanced magnetic resonance mammography at 3.0T with X-ray mammograms: Pilot study evaluation using dedicated semi-automatic registration software,” Aug. 2011, doi: https://doi.org/10.1016/j.ejrad.2011.04.017.
14. P. A. Segura Chávez et al., “Love Wave Sensor with High Penetration Depth for Potential Application in Cell Monitoring,” Biosensors, vol. 12, no. 2, p. 61, Feb. 2022, doi: https://doi.org/10.3390/bios12020061.
15. P. Kumar and S.-H. Lee, “Security Issues in Healthcare Applications Using Wireless Medical Sensor Networks: A Survey,” vol. 12, no. 1, pp. 55–91, Dec. 2011, doi: https://doi.org/10.3390/s120100055.
16. Sharma K, Agrawal A, Pandey D, et al. (2022) RSA based encryption approach for preserving confidentiality of big data. Journal of King Saud University - Computer and Information Sciences 34:2088–2097. https://doi.org/10.1016/j.jksuci.2019.10.006
17. K. Anusudha, “A Theoretical Approach to Secure Medical Images by Combining Cryptography and Watermarking Techniques,” vol. 7, no. 3, pp. 69–77, Jul. 2020, doi: https://doi.org/10.30726/esij/v7.i3.2020.73014.
18. W. Zhang, C. A. Gunter, D. Liebovitz, J. Tian, and B. Malin, “Role prediction using Electronic Medical Record system audits,” AMIA Annual Symposium Proceedings, vol. 2011, pp. 858–867, 2011. Accessed: Jul. 05, 2023. [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3243238/
19. R. Sathya Prabha, K. Kanagasabapathi, K. Sajeeth, and M. Aishwarya, “Health Information Sharing in Cloud Environment Using Modular Encryption Standard,” Jan. 2023, doi: https://doi.org/10.3233/atde221238.
20. M. Paul, L. A. Maglaras, Mohamed Amine Ferrag, and Iman Almomani, “Digitization of healthcare sector: A study on privacy and security concerns,” Feb. 2023, doi: https://doi.org/10.1016/j.icte.2023.02.007.
21. K. Batko and Andrzej Ślęzak, “The use of Big Data Analytics in healthcare,” vol. 9, no. 1, Jan. 2022, doi: https://doi.org/10.1186/s40537-021-00553-4.
22. M. Shabbir et al., “Enhancing Security of Health Information Using Modular Encryption Standard in Mobile Cloud Computing,” vol. 9, pp. 8820–8834, Jan. 2021, doi: https://doi.org/10.1109/access.2021.3049564.
23. Ihsen Alouani, “Breaking (and Fixing) Channel-based Cryptographic Key Generation: A Machine Learning Approach,” Aug. 2022, doi: https://doi.org/10.1109/dsd57027.2022.00058.
24. Y. Sain, “Review on Compression of Medical Images using Various Techniques,” International Journal of Engineering Research & Technology, vol. 3, no. 8, Sep. 2014, doi: https://doi.org/10.17577/IJERTV3IS080880.
25. R. Kumar et al., “An Integration of blockchain and AI for secure data sharing and detection of CT images for the hospitals,” vol. 87, pp. 101812–101812, Jan. 2021, doi: https://doi.org/10.1016/j.compmedimag.2020.101812.
26. “Development of modified RSA algorithm using fixed Mersenne prime numbers for medical ultrasound imaging instrumentation,” Computer Assisted Surgery, 2019. https://www.tandfonline.com/doi/full/10.1080/24699322.2019.1649070
27. Q. N. Natsheh, B. Li, and A. G. Gale, “Security of Multi-frame DICOM Images Using XOR Encryption Approach,” vol. 90, pp. 175–181, Jan. 2016, doi: https://doi.org/10.1016/j.procs.2016.07.018.
28. Ranjith Kumar M. and M. K. Viswanath, “A symmetric medical image encryption scheme based on irrational numbers,” Jan. 2018, doi: https://doi.org/10.4066/biomedicalresearch.29-17-1317.
29. Dorgham O, Al-Rahamneh B, Almomani A, Khatatneh KF (2018) Enhancing the security of exchanging and storing DICOM medical images on the cloud. Int J Cloud Appl Comput (IJCAC) 8(1):154–172. https://doi.org/10.4018/IJCAC.2018010108
30. Massoud Sokouti, A. Zakerolhosseini, and Babak Sokouti, “Medical Image Encryption: An Application for Improved Padding Based GGH Encryption Algorithm,” vol. 10, no. 1, pp. 11–22, Oct. 2016, doi: https://doi.org/10.2174/1874431101610010011.

31. A. Al-Haj, G. A. Abandah, and N. Hussein, “Crypto-based algorithms for secured medical image transmission,” vol. 9, no. 6, pp. 365–373, Nov. 2015, doi: https://doi.org/10.1049/iet-ifs.2014.0245.
32. B. Zhang, Bahbibi Rahmatullah, Shir Li Wang, A. A. Zaidan, B. B. Zaidan, and P. Liu, “A review of research on medical image confidentiality related technology coherent taxonomy, motivations, open challenges and recommendations,” vol. 82, no. 14, pp. 21867–21906, Aug. 2020, doi: https://doi.org/10.1007/s11042-020-09629-4.
33. R. Hamza, Z. Yan, Sung Wook Baik, Paolo Bellavista, and Faiza Titouna, “A privacy-preserving cryptosystem for IoT E-healthcare,” vol. 527, pp. 493–510, Jul. 2020, doi: https://doi.org/10.1016/j.ins.2019.01.070.
34. Cortés IE, Venegas O, Gómez HW (2022) A Symmetric/Asymmetric Bimodal Extension Based on the Logistic Distribution: Properties, Simulation and Applications. Mathematics 10:1968. https://doi.org/10.3390/math10121968
35. Hua Z, Zhou Y, Pun CM, Chen CP (2015) 2D sine logistic modulation map for image encryption. Inf Sci 297:80–94. https://doi.org/10.1016/j.ins.2014.11.018
36. Liu J, Ma Y, Li S, Lian J, Zhang X (2018) A new simple chaotic system and its application in medical image encryption. Multimed Tools Appl 77(17):22787–22808. https://doi.org/10.1007/s11042-017-5534-8
37. Hua Z, Yi S, Zhou Y (2018) Medical image encryption using high-speed scrambling and pixel adaptive diffusion. Signal Process 144:134–144. https://doi.org/10.1016/j.sigpro.2017.10.004
38. “Confidential storage of medical images – a chaos-based encryption approach,” International Journal of Cloud Computing, 2018.
39. Chuman T, Hitoshi Kiya (2022) Security Evaluation of Block-based Image Encryption for Vision Transformer against Jigsaw Puzzle Solver Attack. 2022 IEEE 4th Global Conference on Life Sciences and Technologies (LifeTech). https://doi.org/10.1109/lifetech53646.2022.9754937
40. Wang W, Si M, Pang Y, Ran P, Wang H, Jiang X, Liu Y, Wu J, Wu W, Chilamkurti N, Jeon G (2018) An encryption algorithm based on combined chaos in body area networks. Comput Electric Eng 65:282–291. https://doi.org/10.1016/j.compeleceng.2017.07.026
41. Cao W, Zhou Y, Chen CP, Xia L (2017) Medical image encryption using edge maps. Signal Process 132:96–109. https://doi.org/10.1016/j.sigpro.2016.10.003
42. M. Parvees, J. Abdul Samath, and B. Parameswaran Bose, “Protecting Large Size Medical Images with Logistic Map Using Dynamic Parameters and Key Image,” International Journal of Network Security, vol. 19, no. 6, pp. 984–994, 2016, doi: https://doi.org/10.6633/IJNS.201711.19(6).15.
43. Xu L, Gou X, Li Z, Li J (2017) A novel chaotic image encryption algorithm using block scrambling and dynamic index based diffusion. Opt Lasers Eng 91:41–52. https://doi.org/10.1016/j.optlaseng.2016.10.012
44. L. Zhang, Z. Zhu, B. Yang, W. Liu, H. Zhu, and M. Zou, “Cryptanalysis and Improvement of an Efficient and Secure Medical Image Protection Scheme,” vol. 2015, pp. 1–11, Jan. 2015, doi: https://doi.org/10.1155/2015/913476.
45. K. Jain, Aravind Aji, and P. Krishnan, “Medical Image Encryption Scheme Using Multiple Chaotic Maps,” vol. 152, pp. 356–364, Dec. 2021, doi: https://doi.org/10.1016/j.patrec.2021.10.033.
46. K. Kiran, H. L. Gururaj, Meshari Almeshari, Yasser Alzamil, R. Vinayakumar, and K. V. Sudeesh, “Efficient SCAN and Chaotic Map Encryption System for Securing E-Healthcare Images,” vol. 14, no. 1, pp. 47–47, Jan. 2023, doi: https://doi.org/10.3390/info14010047.
47. J. Zheng, Z. Luo, and Z. Tang, “An Image Encryption Algorithm Based on Multichaotic System and DNA Coding,” vol. 2020, pp. 1–16, Sep. 2020, doi: https://doi.org/10.1155/2020/5982743.
48. Farah MB, Farah A, Farah T (2019) An image encryption scheme based on a new hybrid chaotic map and optimized substitution box. Nonlinear Dynamics:1–24. https://doi.org/10.1007/s11071-019-05413-8
49. Farah MB, Guesmi R, Kachouri A, Samet M (2020) A new design of cryptosystem based on S-box and chaotic permutation. Multimed Tools Appl:1–22. https://doi.org/10.1007/s11042-020-08718-8
50. Shahzadi R, Anwar SM, Qamar F, Ali M, Rodrigues JJ (2019) Chaos based enhanced RC5 algorithm for security and integrity of clinical images in remote health monitoring. IEEE Access 7:52858–52870. https://doi.org/10.1109/ACCESS.2019.2909554
51. H. Liu, A. Kadir, and Y. Li, “Asymmetric color pathological image encryption scheme based on complex hyper chaotic system,” ResearchGate, Apr. 2016.
52. Meiliana Sumagita, Imam Riadi (2018) Analysis of Secure Hash Algorithm (SHA) 512 for Encryption Process on Web Based Application. International Journal of Cyber-Security and Digital Forensics 7:373–382.
53. Rahul Rahul, K. Kuppusamy, and A. Senthilrajan, “Bio Metric Based Colour Image Encryption using Multi Chaotic Dynamical Systems and SHA 256 Hash Algorithm,” ResearchGate, Jun. 29, 2023.
54. S. Banerjee and A. Patil, “ECC Based Encryption Algorithm for Lightweight Cryptography,” ResearchGate, 2020.
55. S. Kanwal et al., “An Effective Color Image Encryption Based on Henon Map, Tent Chaotic Map, and Orthogonal Matrices,” vol. 22, no. 12, pp. 4359–4359, Jun. 2022, doi: https://doi.org/10.3390/s22124359.
56. A. Kanso, M. Ghebleh, and M. Bou Khuzam, “A Probabilistic Chaotic Image Encryption Scheme,” Mathematics, vol. 10, no. 11, p. 1910, Jun. 2022, doi: https://doi.org/10.3390/math10111910.
57. X. Huang, L. Liu, X. Li, M. Yu, and Z. Wu, “A New Two-Dimensional Mutual Coupled Logistic Map and Its Application for Pseudorandom Number Generator,” vol. 2019, pp. 1–10, May 2019, doi: https://doi.org/10.1155/2019/7685359.

58. Laiphrakpam Dolendro Singh, Rohit Thingbaijam, Kh Motilal, and Moatsum Alawida, “Encrypting Multiple Images With an Enhanced Chaotic Map,” ResearchGate, 2022.
59. B. Zhang and L. Liu, “Chaos-Based Image Encryption: Review, Application, and Challenges,” vol. 11, no. 11, pp. 2585–2585, Jun. 2023, doi: https://doi.org/10.3390/math11112585.
60. Y. Mao, “Algorithm of Encrypting Digital Image Using Chaos Neural Network,” vol. 2022, pp. 1–10, Sep. 2022, doi: https://doi.org/10.1155/2022/4160083.
61. B. Zhang and L. Liu, “Chaos-Based Image Encryption: Review, Application, and Challenges,” Mathematics, vol. 11, no. 11, p. 2585, Jan. 2023, doi: https://doi.org/10.3390/math11112585.
62. U. Erkan, Abdurrahim Toktas, Feyza Toktas, and Fayadh Alenezi, “2D eπ-map for image encryption,” vol. 589, pp. 770–789, Apr. 2022, doi: https://doi.org/10.1016/j.ins.2021.12.126.
63. Montassar Aidi Sharif, Keasar Sabah Khalaf, and Mahmoud Shakir Wahhab, “Digital Communication Based on Image Security using Grasshopper Optimization and Chaotic Map,” ResearchGate, Jul. 05, 2022.
64. Bassem Abd-El-Atty, Mohammed Ahmed El-Affendi, and Fathi, “A novel image cryptosystem using Gray code, quantum walks, and Henon map for cloud applications,” vol. 9, no. 1, pp. 609–624, Jul. 2022, doi: https://doi.org/10.1007/s40747-022-00829-z.
65. B. T. Geetha, P. Mohan, A. V. R. Mayuri, T. Jackulin, J. L. Aldo Stalin, and Varagantham Anitha, “Pigeon Inspired Optimization with Encryption Based Secure Medical Image Management System,” vol. 2022, pp. 1–13, Aug. 2022, doi: https://doi.org/10.1155/2022/2243827.
66. V. S. Lima, F. Madeiro, and J. Lima, “Encryption of 3D medical images based on a novel multiparameter cosine number transform,” ResearchGate, Apr. 2020.
67. J. Fei et al., “DuMLP-Pin: A Dual-MLP-Dot-Product Permutation-Invariant Network for Set Feature Extraction,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 1, pp. 598–606, Jun. 2022, doi: https://doi.org/10.1609/aaai.v36i1.19939.
68. B. Zhang, Bahbibi Rahmatullah, Shir Li Wang, and Z. Liu, “A plain-image correlative semi-selective medical image encryption algorithm using enhanced 2D-logistic map,” vol. 82, no. 10, pp. 15735–15762, Sep. 2022, doi: https://doi.org/10.1007/s11042-022-13744-9.
69. M. Luis, L. Daniel, A. Isabel, and Alvarado Deicy, “A new multimedia cryptosystem using chaos, quaternion theory and modular arithmetic,” Mar. 2023, doi: https://doi.org/10.1007/s11042-023-14475-1.
70. P. Wang, Y. Wang, J. Xiang, and X. Xiao, “Fast Image Encryption Algorithm for Logistics-Sine-Cosine Mapping,” Sensors, vol. 22, no. 24, p. 9929, Jan. 2022, doi: https://doi.org/10.3390/s22249929.
71. L. Si, X. Hu, and B. Liu, “Image Matching Algorithm Based on the Pattern Recognition Genetic Algorithm,” ResearchGate, Mar. 09, 2022.
72. D. Li et al., “Hybrid Encrypted Watermarking Algorithm for Medical Images Based on DCT and Improved DarkNet53,” Electronics, vol. 12, no. 7, p. 1554, Jan. 2023, doi: https://doi.org/10.3390/electronics12071554.
73. P. Prasad, Bethel, N. Singh, Vinit Kumar Gunjan, Samad Baseer, and S. Miah, “Blockchain-Based Privacy Access Control Mechanism and Collaborative Analysis for Medical Images,” vol. 2022, pp. 1–7, Jun. 2022, doi: https://doi.org/10.1155/2022/9579611.
74. Rothblum RD (2011) Homomorphic Encryption: From Private-Key to Public-Key. Lecture Notes in Computer Science 219–234. https://doi.org/10.1007/978-3-642-19571-6_14
75. Saci Medileh et al., “A Multi-Key with Partially Homomorphic Encryption Scheme for Low-End Devices Ensuring Data Integrity,” vol. 14, no. 5, pp. 263–263, Apr. 2023, doi: https://doi.org/10.3390/info14050263.
76. F. Yu, X. Gong, H. Li, and S. Wang, “Differential cryptanalysis of image cipher using block-based scrambling and image filtering,” vol. 554, pp. 145–156, Apr. 2021, doi: https://doi.org/10.1016/j.ins.2020.12.037.
77. S. Zhu and C. Zhu, “An Efficient Chosen-Plaintext Attack on an Image Fusion Encryption Algorithm Based on DNA Operation and Hyperchaos,” vol. 23, no. 7, pp. 804–804, Jun. 2021, doi: https://doi.org/10.3390/e23070804.
78. Mbarek Marwan, Feda Alshahwan, F. Sifou, and H. Ouahmane, “Improving the security of cloud-based medical image storage,” ResearchGate, Feb. 2019.
79. E. Salih, “A Simple and Secure Secret Sharing Scheme for IoT,” doi: https://doi.org/10.1109/CSCI58124.2022.00272.
80. A. Patel and A. Bakshi, “Secure telemedicine using RONI halftoned visual cryptography without pixel expansion,” ResearchGate, Apr. 05, 2019.
Note: All the figures and tables in this chapter were designed by
the author.

39. Enhancing Dyslexia Detection and Intervention through Deep Learning: A Comprehensive Review and Future Directions

Pavan Kumar Varma Kothapalli1, Cheepurupalli Raghuram2


Assistant Professor, Department of Computer Science and Engineering,
Sagi Rama Krishnam Raju Engineering College, Bhimavaram
Boddu LV Siva Rama Krishna3
Assistant Professor, Department of Computer Science and Engineering
SRM University, Andhra Pradesh

Abstract: Dyslexia, a neurodevelopmental condition impacting reading and language abilities, presents notable difficulties
in promptly identifying and implementing effective interventions. Traditional methods for diagnosing dyslexia often rely on
subjective assessments and standardized tests, leading to delays in recognition and support. This paper offers an extensive
examination of how deep learning techniques are applied in the domain of detecting and intervening in dyslexia. The integration
of deep learning algorithms into dyslexia research offers promising avenues for more accurate and timely identification of
individuals at risk. By leveraging neural networks and advanced machine learning models, researchers have begun to explore
novel approaches that analyze linguistic patterns, eye-tracking data, brain imaging, and behavioral markers associated with
dyslexia. Furthermore, this paper discusses the potential of deep learning in tailoring personalized interventions for individuals
with dyslexia. These interventions aim to adapt to the specific learning needs of each individual, providing targeted support
and enhancing the effectiveness of remediation strategies. While highlighting the advancements made in utilizing deep learning
for dyslexia, this review also addresses challenges, including data scarcity, model interpretability, and ethical considerations.
Additionally, it proposes future research directions that emphasize collaborative efforts among researchers, educators, and
technology developers to foster the development of robust and accessible tools for dyslexia assessment and intervention.
Keywords: Dyslexia, Deep Learning, Eye-tracking, Neural Networks

1. Introduction

Dyslexia[1], a neurodevelopmental disorder affecting language and reading abilities, remains a persistent challenge in educational settings and beyond. Its multifaceted nature, characterized by difficulties in decoding words, recognizing sounds, and understanding language, presents hurdles in both early identification and effective intervention. Despite concerted efforts in research and educational practices, dyslexia[33] continues to impact millions worldwide, highlighting the pressing need for innovative approaches to address its complexities.

Fig. 39.1 Brain Image with Dyslexia

In parallel, the rapid advancements in deep learning, a subset of artificial intelligence (AI) that mimics the workings of
1kdvpkvarma@gmail.com; 2cheepurupalliraghuram@gmail.com; 3sivaramakrishna.b@srmap.edu.in, krishna2928@gmail.com

DOI: 10.1201/9781003529231-39

the human brain through complex neural networks, offer promising avenues for tackling intricate and nuanced problems. The ability of deep learning models[32] to discern intricate patterns and extract features from vast datasets has sparked considerable interest in their potential application to dyslexia[2], particularly in revolutionizing detection methods and tailoring interventions to individual needs.

The paper presents a comprehensive review of the intersection between dyslexia and deep learning, aiming to explore the potential of these cutting-edge technologies in revolutionizing dyslexia[1] detection and intervention. Through an in-depth examination of existing literature and methodologies, this review seeks to elucidate how deep learning models can enhance our understanding, detection, and remediation strategies for dyslexia.

The introductory section provides an overview of dyslexia, delineating its defining characteristics, prevalence, and the persisting challenges encountered in timely identification and effective intervention. Subsequently, the introduction highlights the fundamental principles of deep learning, elucidating the mechanisms by which neural networks process data and showcasing their potential to address intricate problems such as dyslexia.

This paper endeavors to synthesize existing knowledge, identify gaps, and propose future directions in leveraging deep learning to advance dyslexia research. By amalgamating expertise from the fields of neuroscience, education, and AI, this exploration aims to contribute to the evolving landscape of dyslexia studies and ultimately pave the way for more inclusive and effective approaches in detecting and supporting individuals with dyslexia.

1.1 Defining Characteristics of Dyslexia

The hallmark of dyslexia[32] encompasses challenges in decoding words, accurately recognizing sounds and letters, and efficiently comprehending written text. People with dyslexia frequently face challenges in phonological awareness, the skill of manipulating and recognizing sounds in spoken words. This ability is essential for connecting sounds to letters during the process of learning to read. Furthermore, difficulties in rapid automatized naming, orthographic processing[3], and working memory exacerbate the obstacles related to reading and understanding language, as shown in Fig. 39.2 below.

Fig. 39.2 Characteristics of Dyslexia (reading and writing difficulties, phonological difficulties, short-term memory, orientation in time and space, visual perception)

2. Literature Survey

Various approaches for predicting dyslexia include neuroimaging techniques to identify brain abnormalities, behavioral assessments focusing on reading-related skills, and machine learning models that analyze linguistic patterns and cognitive features. Combining these diverse methods offers a comprehensive approach to early detection and intervention for individuals at risk of dyslexia, enhancing the effectiveness of tailored educational strategies. Earlier works on these lines by various authors are shown in Table 39.1.

3. Limitations and Challenges

The research on dyslexia collectively faces several challenges and limitations. Firstly, the heterogeneity of dyslexia poses difficulties in generalizing findings across diverse populations and severity levels. Interdisciplinary approaches, while valuable, demand expertise in neuroimaging, linguistics, and psychology, potentially limiting accessibility. Resource-intensiveness is a recurring challenge, particularly in studies involving neuroimaging and longitudinal approaches, affecting the feasibility of large-scale research. Subjectivity, observer bias, and reliance on self-reported information introduce potential inaccuracies.

Ethical considerations, especially in emerging areas like AI applications[5] for dyslexia, present complex challenges in balancing ethical practices with empirical evidence. Some studies oversimplify dyslexia, focusing on specific aspects or proposing theoretical models, limiting the comprehensive understanding of the condition. Generalizability issues arise from small sample sizes or specific participant characteristics. Translating neurobiological findings into practical interventions remains a complex challenge, highlighting the gap between neuroscience and real-world applications. A lack of longitudinal data limits the understanding of the developmental trajectory of dyslexia over time. Dependency on advanced technologies, such as neuroimaging or AI, introduces challenges related to the rapid evolution of these technologies. Genetic studies may oversimplify the complex interplay between genetics and environmental factors in dyslexia. Some studies have a narrow scope, focusing on specific aspects like reading fluency or genetic connectivity, potentially overlooking other crucial factors.



Table 39.1 Examining diverse methods for predicting Dyslexia

Study | Methodology | Key Findings | Pros | Cons
Shaywitz & Shaywitz et al. [1] | Standardized Assessments | Identified phonological deficits in dyslexia | Widely used for comparison | Relies on subjective interpretation
Vandermosten et al. [10] (2012) | Neuroimaging | Altered white matter microstructure in dyslexics | Offers detailed brain structure analysis | Expensive equipment and data processing needed
Koyama et al. [6] (2013) | Behavioral Analysis | Observed distinct gaze patterns in dyslexic readers | Direct observation of behavior | Subject to observer bias
Norton et al. [11] (2014) | Neural Signature Study | Discovered specific neural signatures in reading disorders | Provides precise neurobiological data | Complexity in interpreting neural signatures
Zou & Schiebinger [8] (2018) | Ethical Considerations | Emphasized fairness in AI for dyslexia | Addresses ethical implications in AI | May lack empirical data to support ethical claims
Ramus [4] (2014) | Neuroimaging & Phonological Analysis | Investigated phonological deficits in dyslexia | Combines neuroimaging with linguistic analysis | Requires expertise in multiple domains
Hoeft et al. [13] (2011) | Longitudinal Neuroimaging | Found brain connectivity differences in dyslexia | Captures developmental changes over time | Long-term studies can be resource-intensive
Galaburda et al. [30] (2006) | Cognitive Model Approach | Proposed diagnostic model based on reading components | Theoretical framework for diagnosis | May oversimplify multifaceted dyslexia
Ahissar et al. [31] (2001) | Perceptual Anchoring | Suggested failure to form perceptual anchors in dyslexia | Novel approach to perceptual mechanisms | Limited applicability beyond perceptual theories
Lefly & Pennington [29] (2000) | Reading History Questionnaire | Validated adult reading history questionnaire | Easy and quick data collection | Relies on self-reported information, may be subjective
Lyytinen et al. [26] (2015) | Early Predictors Study | Identified predictors of emergent literacy in at-risk children | Aids early intervention strategies | Predictors may not universally apply across different groups
Skeide et al. [28] (2017) | Genetic Connectivity Study | Established neural connectivity patterns related to dyslexia risk | Provides insights into genetic factors | Genetic studies may oversimplify complex conditions
Norton & Wolf [25] (2012) | Reading Fluency Study | Implications of rapid automatized naming (RAN) in reading disabilities | Relates to reading speed and efficiency | Limited scope to broader dyslexia aspects
Vandermosten et al. [27] (2013) | Investigation into Patterns of Brain Activity | Unconventional phonemic representations in novice readers with familial predisposition | Identifies early markers of dyslexia | Limited generalization, small sample size
Hoeft et al. [13] (2011) | Naming & Reading Deficits | Shared neurological foundation for naming and reading challenges in dyslexia | Reveals shared neural pathways | Difficulty in isolating specific deficits
Richlan et al. [12] (2011) | Connectivity Analyses | Triple deficit hypothesis confirmation via connectivity analyses | Explores interrelation of deficits | Relies on theoretical framework, may oversimplify
Pugh et al. [14] (2000) | Studies on the Neurobiology | Neurobiological foundations of reading and reading disabilities | Offers neuroscientific insight | Complexity in translating findings to practical applications

4. Considering Deep Learning-Based Techniques for Dyslexia

4.1 Deep Learning Fundamentals

Deep learning, a subset of machine learning, utilizes artificial neural networks[7] to mimic the cognitive processes of the human brain when handling data. At its core are artificial neurons, basic units that receive inputs, apply weights, and use activation functions to produce outputs, mirroring biological neurons' signal transmission. Neural networks consist of layers (input, hidden, and output), allowing hierarchical feature learning. Convolutional Neural Networks (CNNs) [19] excel in image and pattern recognition, making them relevant for analyzing visual aspects in dyslexia research. Recurrent Neural Networks (RNNs) [19] handle sequential data, potentially useful for language-related patterns associated with dyslexia. Training deep learning models[32] involves backpropagation, adjusting parameters during training, and gradient descent optimization[9] to minimize errors and enhance accuracy. Deep learning's strengths include extracting intricate patterns from vast datasets and solving complex problems, but challenges like overfitting, interpretability, and data requirements persist. Beyond dyslexia, deep learning finds applications in computer vision, natural language processing, healthcare, and various domains, highlighting its versatility and broad impact, represented in Fig. 39.3.

Fig. 39.3 Traditional way and deep learning-based diagnosis of a dyslexia patient
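To ground these fundamentals, below is a minimal NumPy sketch of a single-hidden-layer network performing one forward pass and one backpropagation update by gradient descent. The layer sizes, learning rate, and toy data are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: 4 samples of 8 features each, with binary labels.
X = rng.normal(size=(4, 8))
y = np.array([[0.], [1.], [1.], [0.]])

# Weights for an 8 -> 5 -> 1 network (input, hidden, output layers).
W1, b1 = rng.normal(scale=0.1, size=(8, 5)), np.zeros(5)
W2, b2 = rng.normal(scale=0.1, size=(5, 1)), np.zeros(1)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Forward pass: weighted sums followed by activation functions.
h = sigmoid(X @ W1 + b1)          # hidden-layer activations
p = sigmoid(h @ W2 + b2)          # predicted probability for each sample

# Backpropagation of the cross-entropy error, then one gradient-descent step.
lr = 0.1
d_out = p - y                     # gradient at the output pre-activation
d_hid = (d_out @ W2.T) * h * (1 - h)
W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
W1 -= lr * X.T @ d_hid; b1 -= lr * d_hid.sum(0)
print("predictions after one update:", sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).ravel())
```

Frameworks such as TensorFlow and PyTorch automate exactly these weight, activation, and gradient computations at scale.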
4.2 Application of Deep Learning in Dyslexia Detection

Utilizing Deep Learning Models for Dyslexia Detection
Deep learning models have emerged as powerful tools in analyzing complex patterns and data sets, showcasing potential applications in dyslexia detection[15]. Various studies have employed deep learning techniques to scrutinize multiple data types associated with dyslexia, including linguistic patterns, eye-tracking data, brain imaging, and behavioral markers.

Data Types and Features Analyzed
Linguistic Patterns: Explain how deep learning models analyze linguistic patterns, including language structure, syntax, and semantic relations, to identify potential dyslexia-related irregularities in text comprehension and production.
Eye-Tracking Data: Discuss how deep learning models process eye movement data [16] to understand reading behaviors and identify distinctive gaze patterns characteristic of individuals with dyslexia.
Brain Imaging: Focus on research that utilizes neuroimaging data to identify structural or functional distinctions in the brains of individuals with dyslexia, offering valuable insights into neural correlations [17].

Efficacy and Efficiency of Models Utilizing Deep Learning
Assessing the effectiveness of deep learning models for dyslexia detection in comparison to traditional assessment approaches: explore studies showcasing the accuracy, sensitivity, and specificity of these models in identifying dyslexia based on diverse datasets and features.

Identified Diagnostic Markers or Patterns
Discuss potential diagnostic markers or patterns identified by deep learning models that distinguish individuals with dyslexia. Highlight any specific linguistic, visual, or neurological features recognized as reliable indicators of dyslexia through deep learning analyses.

Challenges and Limitations
Data Availability and Quality: Address challenges related to the availability, size, and quality of datasets used for training deep learning models for dyslexia detection, emphasizing the need for comprehensive and diverse datasets.
Interpretability and Explainability: Discuss the challenge of interpretability in deep learning models, emphasizing the importance of understanding how these models arrive at their conclusions for clinical adoption.

Deep Learning in Personalized Dyslexia Interventions
Adaptive Intervention Strategies Using Deep Learning
Deep learning presents an opportunity to develop adaptive intervention strategies tailored to the individual needs and learning profiles of those with dyslexia. These models have the capacity to dynamically adjust and personalize interventions based on an individual's response and progress.
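As a toy illustration of the pipeline this section describes, features derived from one data modality followed by a learned classifier, the sketch below computes crude fixation-style statistics from synthetic gaze traces and fits a scikit-learn SVM. The features, the simulation, and the labels are invented for illustration and are not drawn from any cited study.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(7)

def gaze_features(xs):
    """Crude reading-behaviour features from a 1-D sequence of horizontal
    gaze positions: regression rate (leftward moves) and mean saccade size."""
    moves = np.diff(xs)
    return [np.mean(moves < 0), np.mean(np.abs(moves))]

# Synthetic cohort: "typical" readers drift steadily rightward, while the
# "at-risk" group shows more back-tracking (a purely illustrative premise).
X, y = [], []
for label, backtrack in [(0, 0.1), (1, 0.4)]:
    for _ in range(50):
        steps = rng.choice([-1, 2], size=200, p=[backtrack, 1 - backtrack])
        X.append(gaze_features(np.cumsum(steps) + rng.normal(0, .2, 200)))
        y.append(label)

clf = SVC(kernel="rbf").fit(X[::2], y[::2])        # train on half the cohort
print("held-out accuracy:", clf.score(X[1::2], y[1::2]))
```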

Table 39.2 Comparison of various datasets of Dyslexia using ML & DL approaches

Approach | Trained data size | Tested data size | Model Predicted | Accuracy (%) | Findings
fMRI | 640 | - | Probabilistic NN model | 91 | Participants in the study are younger than 20 years old
 | 296 | - | L2-Logistic regression model | 74 | Participants with matched IQ levels and similar ages
 | 240 | - | Support Vector Machine | 78 | The selected samples fall within the age range of 12 to 19 years
 | 872 | - | Support Vector Classification | 69 | -
 | 889 | 220 | Support Vector Machine | 63 | -
 | 201 | 50 | Random Forest | 93 | A portion of the given dataset
 | 92±40 | - | KNN / Support Vector Machine (SVM) | 73 | -
sMRI | 47±31 | - | Random Forest | 81±8 | -
 | 130 | - | Support Vector Machine | 78 | -
 | 63 | - | Support Vector Machine | 69 | Two locations sourced from the given dataset
 | 651 | - | KNN / Support Vector Machine (SVM) | 54±7 | Participants under the age of 10 have been omitted
 | 43 | - | Support Vector Machine | 78 | -
 | 140 | 45 | Projection-based learning | 71 | Dataset affiliated with New York University
 | 85 | - | Random Forest | 80±2 | -
 | 40 | - | Projection-based learning | 97±2 | In this context, the samples consist of adult females
sMRI+fMRI | 872 | - | Graph-based CN model | 71 | The samples extracted from the NDAR dataset span the age range of 5 to 10 years
 | 805 | 310 | Support Vector Machine | 63 | -
 | 49 | - | Fully Convolutional Network | 93 | -
 | 185 | - | Deep Belief Network | 66 | -
 | 815 | - | Multilayer Perceptron | 84 | -
 | 810 | - | Multi-channel ANN | 74 | -

Individualized Support and Adaptive Learning Models
Discuss how deep learning models can facilitate the creation of personalized intervention strategies. Explore the potential for these models to adaptively adjust learning materials, pacing, and methodologies to suit the unique strengths and weaknesses of each dyslexic individual.

Enhancing Intervention Effectiveness
Highlight the potential impact of personalized interventions derived from deep learning models in enhancing the effectiveness of dyslexia interventions. This could include improved engagement, better learning outcomes, and increased retention rates compared to standardized interventions.

Ethical Considerations
Ethical Implications: Address ethical concerns surrounding personalized interventions, such as data privacy, informed consent, and the responsible use of individualized data for intervention development.
Model Bias and Fairness: Discuss the importance of ensuring that deep learning models used to personalize interventions are fair and free from biases that might perpetuate disparities or discrimination.

Potential Impact and Future Directions
Educational Transformation: Explore the potential transformative effect of personalized interventions derived from deep learning on educational practices for individuals with dyslexia.

Future Research Avenues: Propose future research directions focusing on refining deep learning models for personalized interventions, addressing ethical considerations, and conducting longitudinal studies to assess the long-term impact of personalized interventions.

5. Reviews on Performance Metrics

Evaluating the performance of the integrated deep learning methodologies in dyslexia research and intervention requires careful consideration of relevant metrics. The following are pivotal performance metrics [19]:

Accuracy: Accuracy evaluates the overall correctness of models for dyslexia detection and intervention. It quantifies the proportion of accurately predicted instances relative to the total instances.
Relevance: A high accuracy score indicates the effectiveness of the models in correctly identifying dyslexia and implementing personalized interventions.

Accuracy = (TP + TN) / (TP + FN + FP + TN) * 100%   (1)

Precision and Recall: Precision is a metric that gauges the accuracy of positive predictions by assessing the ratio of true positive instances to the sum of true positives and false positives. Recall, also known as sensitivity or true positive rate, measures the effectiveness of a model in capturing and correctly identifying all relevant instances. It calculates the ratio of true positive instances to the sum of true positives and false negatives.

Precision = TP / (TP + FP)   (2)
Recall = TP / (TP + FN)   (3)

Relevance: Precision is crucial to avoid false positives, while recall ensures that dyslexic cases are not overlooked.

F1 Score: The F1 score, a harmonic mean of precision and recall, offers a balanced evaluation, particularly useful in scenarios with class imbalance.
Relevance: A high F1 score suggests a model that achieves precision while effectively capturing relevant instances.

F1 score = 2 * (Precision * Recall) / (Precision + Recall)   (4)

Area Under the ROC Curve (AUC-ROC): AUC-ROC assesses the model's capability to differentiate between dyslexic and non-dyslexic cases under various threshold settings.
Relevance: A high AUC-ROC score signifies a robust model capable of effective discrimination.

AUC-ROC = Σ_{i=1}^{n-1} (1/2) * (TPR_i + TPR_{i+1}) * (FPR_{i+1} - FPR_i)   (5)

RMSE, portrayed as the standard deviation of residuals (prediction errors), quantifies the accuracy of predictions across quantitative data.

RMSE = sqrt( (1/N) * Σ_{i=1}^{N} (x_i - x̂_i)^2 )   (6)

The coefficient of determination (R²) represents the proportion of variation in the dependent variables explained by the independent variables.

R² = 1 - RSS / TSS   (7)

In this context, RSS denotes the sum of residual squares, while TSS signifies the total sum of squares.

When comparing the performance of existing models for dyslexia prediction, several performance metrics are commonly used to assess the effectiveness of the models. The key metrics and some existing models often employed in this context are summarized in Table 39.3; a worked example of computing these metrics follows below.
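A minimal sketch of how the metrics in equations (1) to (7) can be computed from a vector of true labels and model scores; the arrays are placeholders, and the AUC is the trapezoidal sum of equation (5) taken over a threshold sweep.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])            # placeholder labels
scores = np.array([.9, .3, .7, .6, .4, .2, .8, .55])   # model probabilities
y_pred = (scores >= 0.5).astype(int)

tp = np.sum((y_pred == 1) & (y_true == 1)); fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1)); tn = np.sum((y_pred == 0) & (y_true == 0))

accuracy  = (tp + tn) / (tp + fn + fp + tn)                 # Eq. (1)
precision = tp / (tp + fp)                                  # Eq. (2)
recall    = tp / (tp + fn)                                  # Eq. (3)
f1 = 2 * precision * recall / (precision + recall)          # Eq. (4)

# Sweep thresholds from high to low, then apply the trapezoidal rule of Eq. (5).
thr = np.sort(np.unique(np.concatenate([[0.0, 1.01], scores])))[::-1]
tpr = np.array([np.mean(scores[y_true == 1] >= t) for t in thr])
fpr = np.array([np.mean(scores[y_true == 0] >= t) for t in thr])
auc = np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2)

rmse = np.sqrt(np.mean((y_true - scores) ** 2))             # Eq. (6)
r2 = 1 - np.sum((y_true - scores) ** 2) / np.sum((y_true - y_true.mean()) ** 2)  # Eq. (7)
print(f"acc={accuracy:.2f} prec={precision:.2f} rec={recall:.2f} "
      f"F1={f1:.2f} AUC={auc:.2f} RMSE={rmse:.2f} R2={r2:.2f}")
```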

Table 39.3 Evaluation metrics for various models

Model | Accuracy | Precision | Recall | F1 score | AUC | ROC | RMSE
Ensemble modeling | 93 | 92.12 | 95.89 | 94.24 | 0.987 | 0.86 | 0.077
Linear Support Vector Machine | 77.8 | 78.9 | 77.1 | 88.25 | 0.85 | 0.855 | 0.089
Hybrid Support Vector Machine-Particle Swarm Optimization | 73.1 | 61.4 | 74.9 | 66.8 | 0.86 | 0.85 | 1.04
Random Forest | 85.1 | 84.5 | 75.8 | 71.7 | 0.76 | 0.85 | 1.56
Naive Bayes | 84.01 | 83.1 | 74.7 | 78.2 | 0.86 | 0.79 | 1.63

6. Future Directions and Recommendations

Refining Deep Learning Models for Dyslexia
Enhanced Data Collection: Propose the collection of diverse and comprehensive datasets, including longitudinal data,

enhancing the resilience and applicability of deep learning models in the realm of dyslexia detection and intervention.
Multi-modal Approach: Suggest exploring multi-modal approaches that combine different data types (linguistic, imaging, behavioral) to develop more comprehensive and accurate models.

Interpretability and Explainability
Advancing Model Interpretability: Highlight the significance of improving the interpretability and explainability of deep learning models in the context of dyslexia, fostering a clearer understanding and acceptance among clinicians and educators.

Ethical Guidelines and Standards
Ethical Frameworks: Advocate for the establishment of ethical guidelines and standards governing the use of deep learning in dyslexia research and intervention to ensure responsible and equitable practices.

Longitudinal Studies and Real-world Application
Long-term Impact Assessment: Recommend conducting longitudinal studies to assess the long-term effectiveness and impact of personalized interventions derived from deep learning models.
Real-world Implementation: Promote the application of research discoveries in practical educational settings, fostering partnerships between researchers and practitioners for real-world impact.

Collaborative Efforts and Knowledge Exchange
Interdisciplinary Collaboration: Stress the significance of interdisciplinary collaboration between researchers, educators, clinicians, and technologists to address the multifaceted challenges of dyslexia.
Knowledge Exchange Platforms: Propose the development of platforms or networks facilitating the exchange of knowledge and best practices among stakeholders in dyslexia research and intervention.

Empowering Stakeholders and Education Systems
Teacher Training and Support: Advocate for training and support programs to empower educators with the knowledge and tools necessary to implement personalized interventions in classrooms effectively.
Policy Implementation: Urge the integration of personalized dyslexia interventions based on deep learning into educational policies and frameworks to ensure widespread accessibility.

7. Conclusion

The integration of deep learning methodologies into dyslexia research and intervention holds immense promise for transforming our approach to this neurodevelopmental condition. This exploration has illuminated the intricate characteristics of dyslexia, its prevalence, and the persistent challenges in early detection and tailored interventions. The introduction of deep learning, with its capacity to decipher complex patterns within dyslexia-related datasets, marks a significant step forward. The application of deep learning in dyslexia detection exhibits encouraging progress, offering insights into potential diagnostic markers through the analysis of linguistic patterns, eye-tracking data, and brain imaging. However, this promising trajectory is accompanied by challenges, including the need for diverse datasets, ensuring model interpretability, and addressing ethical considerations. The imperative of refining models and establishing ethical frameworks is underscored to ensure responsible and equitable implementation. The potential of deep learning in crafting personalized interventions, tailoring support to individual learning profiles, heralds a new era in dyslexia intervention strategies.

As we conclude this exploration, it is crucial to recognize that our journey does not end here. Continuous collaboration, interdisciplinary efforts, and the translation of research findings into practical applications are paramount. Researchers, educators, policymakers, and practitioners are called upon to embrace these advancements, refine models, and implement ethical guidelines. The fusion of deep learning and dyslexia research signifies not just a scientific endeavor but a societal commitment toward inclusivity, equity, and personalized support. Together, let us embark on this ongoing journey, striving for a future where every individual, irrespective of their challenges, receives tailored and effective support, fostering a world of learning and opportunity for all.

Acknowledgement

The authors gratefully acknowledge the students, staff, and authority of the Physics department for their cooperation in the research.

References

1. S. E. Shaywitz and B. A. Shaywitz, “Dyslexia (specific reading disability),” Biological Psychiatry, vol. 57, no. 11, pp. 1301–1309, 2005.
2. J. D. Gabrieli, “Dyslexia: a new synergy between education and cognitive neuroscience,” Science, vol. 325, no. 5938, pp. 280–283, 2009.
3. S. Dehaene and L. Cohen, “The unique role of the visual word form area in reading,” Trends in Cognitive Sciences, vol. 15, no. 6, pp. 254–262, 2011.
4. F. Ramus, “Neuroimaging sheds new light on the phonological deficit in dyslexia,” Trends in Cognitive Sciences, vol. 18, no. 6, pp. 274–275, 2014.

5. F. Hoeft et al., “Neural systems predicting long-term outcome in dyslexia,” Proceedings of the National Academy of Sciences, vol. 108, no. 1, pp. 361–366, 2011.
6. M. S. Koyama et al., “The semantic organization of words in the brain: evidence from category- and modality-specific deficits,” Frontiers in Psychology, vol. 4, p. 690, 2013.
7. G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
8. L. Zou and L. Schiebinger, “AI can be sexist and racist – it’s time to make it fair,” Nature, vol. 559, no. 7714, pp. 324–326, 2018.
9. N. Langer et al., “White matter alterations in dyslexia: a DTI tract-based spatial statistics study,” Brain Structure and Function, vol. 220, no. 4, pp. 1905–1916, 2015.
10. M. Vandermosten et al., “A tractography study in dyslexia: neuroanatomic correlates of orthographic, phonological and speech processing,” Brain, vol. 135, no. 3, pp. 935–948, 2012.
11. E. S. Norton et al., “An investigation of the neural signature of primary and secondary reading disorders,” Frontiers in Human Neuroscience, vol. 8, p. 904, 2014.
12. F. Richlan et al., “Structural abnormalities in the dyslexic brain: A meta-analysis of voxel-based morphometry studies,” Human Brain Mapping, vol. 30, no. 10, pp. 3299–3308, 2009.
13. F. Hoeft et al., “Neural basis of dyslexia: A comparison between dyslexic and nondyslexic children equated for reading ability,” Journal of Neuroscience, vol. 27, no. 37, pp. 9878–9882, 2007.
14. K. R. Pugh et al., “Neuroimaging studies of reading development and reading disability,” Learning Disabilities Research & Practice, vol. 15, no. 1, pp. 55–66, 2000.
15. G. F. Eden et al., “Neural changes following remediation in adult developmental dyslexia,” Neuron, vol. 44, no. 3, pp. 411–422, 2004.
16. I. Altarelli et al., “Letter and speech sound association in emerging readers with familial risk for dyslexia,” Brain, vol. 136, no. 10, pp. 3403–3417, 2013.
17. N. M. Raschle et al., “Investigating the neural correlates of voice versus speech-sound directed information in pre-school children,” PloS One, vol. 6, no. 10, p. e25803, 2011.
18. P. E. Turkeltaub et al., “The neural basis of aphasia: evidence from functional neuroimaging,” Aphasiology, vol. 17, no. 4, pp. 327–350, 2003.
19. C. Raghuram and M. Thenmozhi, “Short Review on Contrastive Learning-based Segmentation Techniques for Medical Image Processing,” 2023 International Conference on Advances in Power, Signal, and Information Technology (APSIT), Bhubaneswar, India, 2023, pp. 290–296, doi: 10.1109/APSIT58554.2023.10201707.
20. B. Boets et al., “Intact but less accessible phonetic representations in adults with dyslexia,” Science, vol. 342, no. 6163, pp. 1251–1254, 2013.
21. D. Froyen et al., “Atypical structural asymmetry of the planum temporale is related to family history of dyslexia,” Cerebral Cortex, vol. 19, no. 10, pp. 2641–2649, 2009.
22. E. L. Grigorenko and A. J. Naples, “Dyslexia genetics: Integrating genetics, neuropsychology, neurobiology, and genomics,” Journal of Developmental and Behavioral Pediatrics, vol. 30, no. 1, pp. 6–22, 2009.
23. S. Mascheretti et al., “Neurogenetics of developmental dyslexia: from genes to behavior through brain neuroimaging and cognitive and sensorial mechanisms,” Translational Psychiatry, vol. 7, no. 1, p. e987, 2017.
24. F. Richlan et al., “A common left occipito-temporal dysfunction in developmental dyslexia and acquired letter-by-letter reading?” PloS One, vol. 8, no. 9, p. e78959, 2013.
25. E. S. Norton and M. Wolf, “Rapid automatized naming (RAN) and reading fluency: Implications for understanding and treatment of reading disabilities,” Annual Review of Psychology, vol. 63, pp. 427–452, 2012.
26. H. Lyytinen et al., “A longitudinal study of the early predictors of poor emergent literacy in children at familial risk of dyslexia,” Journal of Experimental Child Psychology, vol. 137, pp. 157–177, 2015.
27. M. Vandermosten et al., “Brain activity patterns of phonemic representations are atypical in beginning readers with family risk for dyslexia,” Developmental Science, vol. 16, no. 4, pp. 678–692, 2013.
28. M. A. Skeide et al., “Genetic dyslexia risk variant is related to neural connectivity patterns underlying phonological awareness in children,” NeuroImage, vol. 146, pp. 526–533, 2017.
29. D. L. Lefly and B. F. Pennington, “Reliability and validity of the adult reading history questionnaire,” Journal of Learning Disabilities, vol. 33, no. 3, pp. 286–296, 2000.
30. A. M. Galaburda et al., “Developmental dyslexia: a diagnostic approach based on the componential model of reading,” Brain, vol. 123, no. 12, pp. 2373–2399, 2006.
31. M. Ahissar et al., “Dyslexia and the failure to form a perceptual anchor,” Nature Neuroscience, vol. 4, no. 7, pp. 732–734, 2001.
32. Kothapalli, Pavan Kumar Varma, V. Rathikarani, and Gopala Krishna Murthy Nookala. “A Comprehensive Survey on Predicting Dyslexia and ADHD Using Machine Learning Approaches.” Inventive Systems and Control: Proceedings of ICISC 2022 (2022): 105–121.
33. Kothapalli, Pavan Kumar Varma, V. Rathikarani, and Gopala Krishna Murthy Nookala. “Prediction of dyslexia and attention deficit and hyperactivity disorder prediction using ensemble classifier model.” International Journal of System Assurance Engineering and Management (2022): 1–12.

Note: All the figures and tables in this chapter were designed by the author.

40. A Study of YOLO (You Only Look Once) to YOLOv8

Immidisetty V. Prakash1
Research Scholar, Dept. of Electronics and Communication Engineering,
Anna University, Chennai
M. Palanivelan2
Professor, Dept. of Electronics and Communication Engineering,
Rajalakshmi Engineering College, Thandalam, Chennai

Abstract: YOLO, which stands for “You Only Look Once,” is an object detection algorithm that revolutionized real-time computer vision tasks by enabling fast and accurate object detection in images or videos. Traditional object detection algorithms involve multiple stages and are computationally expensive; YOLO, on the other hand, approaches object detection as a regression problem, predicting class probabilities and bounding boxes in a single pass straight from the unprocessed image pixels. To forecast bounding boxes, objectness scores, and class probabilities for the objects present within every grid cell, the YOLO algorithm divides the input image into a grid. This grid-based approach allows YOLO to detect multiple objects of different classes in a single forward pass. By predicting bounding boxes and class probabilities together, YOLO achieves real-time processing speeds, making it highly suitable for applications such as autonomous driving, surveillance, and robotics. YOLO is a groundbreaking object detection algorithm that employs a grid-based approach to predict bounding boxes and class probabilities directly from input images, enabling real-time and efficient object detection for a wide range of applications.

Keywords: YOLO, Object detection, Bounding boxes, Regression problems

1. Introduction

The introduction of the You Only Look Once (YOLO) algorithm marks a significant advancement in the area of computer vision and object detection. Traditional object detection methods often involve multi-stage pipelines, which can be computationally expensive and challenging to optimize. YOLO, presented in the paper “You Only Look Once: Unified, Real-Time Object Detection” by Joseph Redmon et al. (2016), revolutionizes this paradigm by offering a real-time, single-pass solution for detecting objects within images and videos.

At its core, YOLO approaches object detection as a regression problem. Unlike conventional methods that separately address region proposal and object classification, YOLO re-frames the task by directly predicting bounding box coordinates together with class probabilities within one neural network architecture. This unified approach not only reduces the complexity of object detection but also significantly accelerates the processing speed, making it well suited for applications where real-time analysis is crucial, such as autonomous vehicles, surveillance systems, and interactive robotics.

The fundamental innovation of YOLO lies in its grid-based prediction strategy. A grid is created from the input image, and every grid cell bears the responsibility of forecasting bounding boxes for objects located within that cell. This grid structure enables YOLO to detect multiple objects

1
erivprakash@gmail.com, 2velan.research@gmail.com

DOI: 10.1201/9781003529231-40
258 Algorithms in Advanced Artificial Intelligence

Fig. 40.1 Example of object detection images

2. Objective

The primary objective of YOLO (You Only Look Once) is to provide an efficient and real-time solution for object detection in images and videos. YOLO aims to achieve this objective through several key goals:
1. Real-time Processing: YOLO's foremost objective is to enable real-time object detection, where objects in images and video frames can be detected, localized, and classified in a single pass through the neural network. This real-time capability is crucial for applications like autonomous driving, surveillance, and robotics, where timely decision-making is essential.
2. Unified Detection: YOLO seeks to unify object detection into a single process, as opposed to traditional methods that involve separate steps for region proposal and object classification. By predicting object classes and bounding box coordinates together, YOLO simplifies the detection pipeline and reduces computational overhead.
3. Efficiency: YOLO aims to be computationally efficient by avoiding redundant calculations. The grid-based method splits the input image into cells, and the task of each cell is to predict what is within its boundaries. This efficient division of labor allows YOLO to process large images quickly and predict objects accurately.
4. Multi-Object Detection: YOLO's objective is to detect multiple objects of different classes within a single image or video frame. The grid cells and anchor boxes enable YOLO to simultaneously identify and locate multiple objects, making it highly suitable for scenarios where there may be various objects in the scene.
5. Handling Object Variability: YOLO aims to handle objects with varying sizes, scales, and aspect ratios effectively. The introduction of anchor boxes allows YOLO to adjust predictions based on these variations, improving the accuracy of bounding box localization.
6. Generalization: YOLO seeks to generalize well to different types of objects, scenes, and environments. This objective is crucial for deploying YOLO in diverse real-world applications, where the algorithm should possess the ability to detect a large variety of objects and adapt to different visual conditions.
7. Accessibility: Another objective of YOLO is to offer a relatively simple architecture that can be easily understood and implemented by scholars and professionals in the computer vision field. This accessibility encourages wider adoption and experimentation.

Fig. 40.2 Grid-based approach

3. Functions

YOLO (You Only Look Once) performs several key functions within the context of object detection in videos and images. These functions are designed to enable efficient and accurate detection of objects in real time:
1. Object Localization: YOLO's primary function is to accurately localize objects within an image or video frame. It achieves this by predicting bounding box coordinates (x, y, width, and height) that enclose the detected objects. This localization information allows users to precisely determine the location of objects in the scene.
2. Object Classification: YOLO is responsible for classifying the detected objects into different predefined categories or classes. Each object is assigned a class label, indicating what type of object it is, such as "car," "pedestrian," "dog," etc. This function enables users to understand the content of the scene by identifying the objects present.
3. Real-time Processing: YOLO's architecture is designed to obtain real-time detection of objects by using a neural network to process both images and video frames in one pass. This real-time capability is essential for applications requiring instant decision-making, such as autonomous vehicles and surveillance systems.
4. Grid-based Division: YOLO splits an input image into a grid of cells, and every cell is accountable for detecting objects within its boundaries. This grid-based division allows YOLO to efficiently process large images and detect objects across different regions of the scene.
5. Anchor Boxes: YOLO uses anchor boxes to accommodate various aspect ratios and object sizes. These anchor boxes function as preset reference shapes, and the algorithm adjusts them to better fit the shape of the detected objects. This function enhances YOLO's ability to accurately predict bounding box coordinates.
6. Multi-Object Detection: YOLO's architecture enables it to detect multiple objects of different classes within a single image or frame. This function is crucial for scenarios where there are several objects present in the scene, allowing YOLO to provide comprehensive information about the visual content.
7. Non-maximum Suppression: After generating multiple bounding box predictions for different objects, YOLO uses non-maximum suppression to get rid of overlapping and superfluous bounding boxes. This function makes sure that one precise bounding box represents every single object.
8. Efficiency and Simplicity: YOLO is designed to be efficient and relatively simple compared to multi-phase techniques for object detection. By estimating bounding boxes as well as class probabilities in a single pass, YOLO reduces computational complexity and simplifies the detection process.
9. Generalization: YOLO's function of generalization involves adapting to various object types, scenes, lighting conditions, and environments. This function ensures that YOLO can perform well in a wide range of real-world applications without extensive fine-tuning for each specific scenario.

Fig. 40.3 Functioning of YOLO

4. Applications

The YOLO algorithm has found a wide range of applications across various fields due to its real-time and efficient object detection capabilities. Some notable applications of YOLO include:
1. Autonomous Vehicles: YOLO is extensively used in autonomous driving systems to identify and monitor bicycles, cars, pedestrians, traffic signs, as well as other objects around the vehicle in real time. This information is crucial for making informed decisions and ensuring the safety of passengers and pedestrians.
2. Surveillance and Security: YOLO is employed in surveillance cameras and security systems to monitor and detect unauthorized activities, intruders, and suspicious objects. Its real-time processing enables rapid response to potential security threats.
3. Retail and Inventory Management: YOLO is used in retail settings for tracking products on shelves and monitoring inventory levels. It can help automate stock management, preventing out-of-stock situations and optimizing supply chain operations.
4. Healthcare: YOLO is applied in medical imaging for detecting anatomical structures and anomalies in X-rays, MRIs, and CT scans. It assists radiologists in identifying specific regions of interest and potential health issues.
5. Industrial Automation: YOLO can be used in industrial settings for object detection in manufacturing processes. It can help identify defects, inspect product quality, and ensure proper assembly of components.
6. Robotics: YOLO is integrated into robots and robotic systems for object recognition and manipulation. Robots equipped with YOLO can identify objects in their environment, enhancing their interaction with the world.
7. Agriculture: YOLO is employed in precision agriculture to monitor crop health, identify pests, and assess plant growth. It enables farmers to make informed decisions to optimize crop yield and reduce resource wastage.
8. Sports Analytics: YOLO can track players and objects in sports events, providing valuable data for performance analysis, player tracking, and generating statistics that enhance coaching and strategic decisions.
9. Augmented Reality (AR) and Virtual Reality (VR): YOLO can be used in AR and VR applications to enhance the user experience by recognizing and interacting with real-world objects and environments in real time.
10. Environmental Monitoring: YOLO is used for monitoring wildlife, tracking endangered species, and studying ecological patterns. It aids researchers in understanding and protecting biodiversity.
11. Retail Analytics: YOLO helps retailers analyze customer behavior in stores, such as tracking foot traffic, monitoring customer interactions with products, and optimizing store layouts for improved shopping experiences.
12. Textile Industry: YOLO can be used to inspect textile quality, identifying defects or inconsistencies in fabrics during the production process.

5. Different versions of YOLO

The YOLO algorithm has evolved over the years, resulting in several versions and variations. Here are some notable methods and versions of YOLO:

Fig. 40.4 Timeline of YOLO variants

1. YOLO v1 (You Only Look Once version 1): The original YOLO algorithm presented the idea of using regression to solve the object detection problem. The input image was divided into a grid; for each grid cell, the model predicted bounding boxes and class probabilities, and non-maximum suppression was applied to refine detections.
2. YOLO v2 (YOLO9000): YOLO v2 introduced improvements such as anchor boxes, which enabled the model to manage objects with different aspect ratios and sizes. It also incorporated "Darknet-19," a 19-layer architecture that improved detection accuracy.

Fig. 40.5 YOLO v2 results in comparison to the original version and other modern models [1]

3. YOLO v3: YOLO v3 further improved object detection by introducing a feature pyramid network, which allows the model to identify objects at various scales within the image. It also introduced multiple detection scales, providing a balance between speed and accuracy.

Fig. 40.6 YOLO v3
4. YOLO v4: YOLO v4 brought several advancements, including the CSPDarknet53 backbone architecture for improved feature extraction, PANet (Path Aggregation Network) for more effective multi-scale feature fusion, and CIoU loss for better bounding box regression.

Fig. 40.7 A comparison between the YOLO v4 and other cutting-edge object detectors [3]

5. YOLO v5: YOLO v5 introduced a lightweight and efficient architecture, aiming for even faster and more accurate object detection. It employed the "CSPDarknet53" backbone with PANet and introduced several optimization techniques for improved speed. YOLO v5 is an iteration of the You Only Look Once (YOLO) object detection algorithm that focuses on achieving high accuracy and speed simultaneously, introduced as a response to the need for an efficient yet accurate object detection model. Developed by Ultralytics, YOLO v5 builds upon the YOLO architecture while introducing new features and optimizations. Model Variants: YOLO v5 offers different model variants, each with varying trade-offs between speed and accuracy: YOLO v5s (small), with faster inference but slightly lower accuracy; YOLO v5m (medium), a balance between speed and accuracy; YOLO v5l (large), with improved accuracy at the cost of slightly slower inference; and YOLO v5x (extra-large), with the highest accuracy but requiring more computation. Backbone Architecture: YOLO v5 uses the CSPDarknet53 backbone architecture. This architecture, inspired by the Cross Stage Partial design, enhances feature extraction capabilities, allowing the model to capture more complex patterns. These different versions and methods of YOLO represent ongoing efforts to improve real-time object detection by enhancing accuracy, speed, and adaptability to different use cases and hardware. Each version has introduced innovations to address challenges and improve the overall performance of the algorithm.
6. YOLOX: YOLOX is another variant of YOLO that focuses on achieving higher accuracy and faster speed simultaneously. It introduced the "YOLOX-Nano," "YOLOX-Tiny," and "YOLOX-Large" models with different trade-offs between speed and accuracy.
7. Scaled-YOLO v4: This approach scales the input image resolution and adjusts hyperparameters to achieve a good equilibrium between computational efficiency and detection accuracy.
8. YOLO v6: Li et al. proposed YOLO v6 in 2022 as an improvement over the earlier iterations. The CNN architecture was the primary distinction between YOLO v5 and v6: YOLO v6 made use of the EfficientNet-L2 variant of the EfficientNet architecture. With fewer parameters and greater computational efficiency, it is a far more efficient architecture than the one used in YOLO v5, and it can help achieve the most advanced outcomes across the different object detection benchmarks relative to earlier iterations of YOLO.

A brand-new anchor box generation technique known as "dense anchor boxes" was also introduced by YOLO v6. The outcomes of YOLO v6's comparison with other cutting-edge object detectors are displayed below.

Fig. 40.8 A comparison between the YOLO v6 and other cutting-edge object detectors [4]

9. YOLO v7: YOLO v7 is an additional YOLO version that has several enhancements over the previous iterations. The utilization of anchor boxes is the primary enhancement: anchor boxes, a collection of pre-defined boxes with various aspect ratios and sizes, are used to identify objects with various shapes. As opposed to the previous iterations, this one employs nine anchor boxes, which enable it to recognize a greater variety of object forms and sizes, assisting in lowering the quantity of false positives. Another primary enhancement in version 7 is the application of a novel loss function known as "focal loss." Previous iterations of YOLO employ a conventional cross-entropy loss function, which has been demonstrated to be less successful in identifying minuscule objects. By down-weighting the loss for well-classified examples and concentrating on the hard examples (the objects that are difficult to detect), the focal loss combats this problem. When it comes to resolution, v7 processes images at 608 x 608 pixels, which is higher than the 416 x 416 resolution used by YOLO v3; thanks to its higher resolution, v7 has better overall accuracy and can detect very small objects. Compared to existing detection algorithms, v7 processes images at a rate of 155 frames per second, which is a significant speed increase. Because of this, it is appropriate for sensitive real-time applications where faster processing speeds are essential, like self-driving cars and surveillance. When it comes to accuracy, v7 outperforms other object detection algorithms: on the widely used COCO dataset, v7 achieves an average precision of 37.2% at an intersection over union (IoU) threshold of 0.5, which is comparable to earlier detection algorithms, as demonstrated below.

Fig. 40.9 A comparison between the YOLO v7 and other cutting-edge object detectors [5]

10. YOLO v8: The team released YOLO v8 in 2023. It was created using the original YOLO algorithm and is kept up to date by Ultralytics. YOLO v8 incorporates several enhancements and new features, building on the popularity of earlier iterations. The YOLO

v8 model is designed to be accurate, fast, and easy to implement, making it an excellent option for a greater variety of tasks related to object detection and image segmentation. It is capable of training on sizable datasets (e.g., the COCO dataset), and it is capable of running on different types of hardware platforms, from CPUs to GPUs. YOLO v8 has five versions, ranging from YOLO v8n (the smallest model, with a 37.3 mAP score on the COCO dataset) to YOLO v8x (the largest model, scoring a 53.9 mAP on the COCO dataset).

Fig. 40.10 YOLO v8's performance in comparison to other cutting-edge models [6]

6. How to Use

There are multiple steps involved in using YOLO (You Only Look Once) for object detection: data preparation, model configuration, training, and inference. This is a broad guide on how to use YOLO:
1. Data Preparation:
(a) Dataset Collection: Gather a dataset that includes images or videos relevant to your application. This dataset should be annotated with the coordinates of the bounding boxes and the matching class labels of the objects you want to detect.
(b) Annotation: Annotate your dataset with bounding box coordinates and class labels using tools like LabelImg, VoTT, or RectLabel. Each annotation should specify the object's location and category.
(c) Data Split: Divide your dataset into training, validation, and test sets, if applicable, to evaluate the model's performance.
2. Model Configuration:
(a) Choose YOLO Version: Decide which YOLO version is suitable for your application based on factors like accuracy, speed, and available resources.
(b) Model Architecture: Download the architecture configuration file (usually in the Darknet format) corresponding to the chosen YOLO version. These files define the network's architecture, layer configurations, and hyperparameters.
(c) Class Names: Create a file containing the names of the classes present in your dataset. This file will be used to map class indices to class names during inference.
(d) Anchor Boxes (if applicable): If using a YOLO version that employs anchor boxes, generate or select anchor box dimensions based on the statistics of your dataset. These anchor box dimensions help the model adapt to object scales.
3. Training:
(a) Pretrained Weights: Download pre-trained weights for the chosen YOLO architecture to initialize your model's weights. These weights are typically trained on large datasets like ImageNet and help the model converge faster.
(b) Train the Model: Use the annotated dataset and the pre-trained weights to train the YOLO model. Train for multiple epochs, monitoring loss and validation performance. You can use tools like Darknet or YOLOX's training scripts to train the model.
(c) Hyperparameter Tuning: Modify variables like batch size, learning rate, and anchor box dimensions based on the model's performance on the validation data set.
4. Inference:
(a) Load Trained Weights: Once training is complete, load the trained weights into your YOLO model.
(b) Image/Video Processing: Pre-process the input image or video frame by resizing it to the model's input size and normalizing pixel values.
(c) Object Detection: Feed the preprocessed input through the YOLO model. For each detected object, the model will forecast the bounding box coordinates and the class probabilities.
(d) Post-processing: To get rid of duplicate and overlapping detections, use non-maximum suppression. This step guarantees that a single bounding box will represent each object.
(e) Visualize Results: Draw bounding boxes and labels around the detected objects on the input image or video frame. Optionally, you can display confidence scores for each detection.
5. Fine-Tuning (Optional): Depending on the performance of your model, you might need to fine-tune it further by adjusting hyperparameters, collecting more data, or exploring data augmentation techniques.
It's important to note that the specifics of using YOLO can vary based on the version you choose and the tools or frameworks you use for implementation; always refer to the official documentation and guides for the specific YOLO version you're working with. A minimal end-to-end inference sketch follows below.
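As a concrete illustration of the inference workflow above, the following minimal sketch uses the Ultralytics Python package mentioned in the YOLO v8 discussion (assumed installed via pip install ultralytics). The weight file yolov8n.pt is a standard pretrained checkpoint; the image name street.jpg and the confidence threshold are placeholder assumptions, not values taken from this chapter.

# Minimal YOLOv8 inference sketch using the Ultralytics package.
# Assumes: pip install ultralytics; "street.jpg" is a placeholder image path.
from ultralytics import YOLO

# Load a pretrained model (yolov8n = the small "nano" variant).
model = YOLO("yolov8n.pt")

# Run detection; conf sets the minimum confidence threshold to keep a box.
results = model.predict("street.jpg", conf=0.25)

# Each result holds the boxes predicted for one input image.
for result in results:
    for box in result.boxes:
        cls_id = int(box.cls[0])                 # predicted class index
        score = float(box.conf[0])               # confidence score
        x1, y1, x2, y2 = box.xyxy[0].tolist()    # bounding box corners
        print(f"{result.names[cls_id]}: {score:.2f} "
              f"at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")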
7. Challenges and Limitations

While YOLO has been a ground-breaking advancement in object detection, it also comes with its set of challenges and limitations:
1. Challenges:
(a) Accuracy vs. Speed Trade-off: Achieving real-time speed often comes at the cost of detection accuracy. Optimizing for one aspect might lead to a compromise in the other, making it a challenge to find the right balance based on the application's requirements.
(b) Small Object Detection: YOLO may have trouble correctly identifying small objects, especially when they appear in cluttered or complex scenes. The model's grid-based approach might not effectively capture these objects' details.
(c) Object Occlusion: Objects that are partially occluded by other objects can be challenging for YOLO to detect accurately, as it predicts based on individual grid cells without considering the entire object's context.
(d) Unusual Object Poses: YOLO can struggle when faced with objects in uncommon orientations or poses that deviate significantly from the training data. It might misinterpret these objects or fail to detect them.
(e) Class Imbalance: If the dataset contains a significant class imbalance (one class has many more samples than others), the model might prioritize the dominant class over others, leading to biased predictions.
(f) Generalization to New Domains: YOLO's performance might degrade when applied to new domains or environments that differ from the training data distribution. Fine-tuning or domain adaptation might be necessary.
2. Limitations:
(a) Limited Context: YOLO's grid-based approach can limit its understanding of context and relationships between objects. It doesn't capture global context as effectively as some other object detection methods.
(b) Lack of Instance Segmentation: YOLO provides bounding box predictions, but it doesn't offer pixel-level instance segmentation information, which could be useful in applications requiring precise object boundaries.
(c) Arbitrary Object Count: YOLO is designed for fixed grid sizes, making it less suitable for scenarios where the number of objects in an image greatly exceeds the grid capacity.
(d) Complex Scenes: In scenes with numerous overlapping objects, YOLO might struggle to accurately distinguish and localize each object due to the model's single-shot approach.
(e) Specific Hardware Requirements: Achieving real-time performance with YOLO might require specialized hardware like GPUs or dedicated inference accelerators, limiting its deployment in resource-constrained environments.
(f) Dependence on Anchor Boxes: While anchor boxes help adapt to object scales, selecting appropriate anchor box dimensions is a manual process that might not cover all possible object variations.
(g) Limited to 2D Detection: YOLO is focused on 2D object detection and doesn't inherently provide depth information, making it less suitable for tasks that require 3D object detection or understanding.
Understanding these challenges and limitations is essential when deciding whether YOLO is the right choice for a specific application or when considering potential workarounds to mitigate these issues.

8. Methodology

The YOLO methodology for object detection involves a series of steps that together enable efficient and accurate real-time object recognition in videos and images. Here's an overview of the YOLO methodology:
1. Input Processing: The YOLO model receives the input image. YOLO processes the entire image as a single entity in a single pass, differentiating it from multi-stage methods.
2. Feature Extraction: A convolutional neural network (CNN) processes the input image and extracts features from it at various levels of abstraction. These features capture different patterns, textures, and contextual details within the image.
3. Bounding Box Prediction: Every grid cell projects one or more bounding boxes around the objects that are inside the cell. These bounding boxes are defined by their center coordinates (x, y), width (w), height (h), and an objectness score.
4. Class Probability Prediction: For each bounding box, the model predicts class probabilities for different object categories. The class with the biggest likelihood is assigned to the object within the bounding box.
5. Output Generation: The retained bounding boxes, the class labels that correspond with them, and confidence scores (the product of objectness score and class probability) make up the final output.
6. Post-processing and Visualization: The output bounding boxes are drawn on the original image to visualize the detected objects. Optionally, confidence scores can be displayed to indicate the model's certainty about each detection (a minimal decoding and suppression sketch follows this list).
7. Iterative Training: YOLO is trained iteratively using annotated datasets. During training, the model learns to estimate the bounding box coordinates and class probabilities that match the ground truth annotations.
8. Hyperparameter Tuning: Hyperparameters, such as learning rate, batch size, anchor box dimensions, and architecture choices, are fine-tuned to attain the intended equilibrium between speed and accuracy.
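To make steps 5 and 6 concrete, the sketch below computes the confidence score as objectness times class probability and applies greedy non-maximum suppression in plain NumPy. This is an illustrative simplification, not the exact implementation used by any YOLO release; the thresholds and the toy boxes are assumptions.

# Illustrative NumPy sketch of YOLO-style post-processing:
# confidence = objectness * class probability, followed by greedy NMS.
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Keep the highest-scoring box, drop heavily overlapping ones, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) < iou_thresh]
    return keep

# Toy example: two overlapping detections of one object plus one distinct box.
boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], float)
scores = np.array([0.9, 0.75, 0.8])        # objectness * class probability
print(non_max_suppression(boxes, scores))  # -> [0, 2]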
YOLO methodology’s unique aspect lies in its single- FPN is used to detect items of various sizes. It combines
pass approach, grid-based detection, and prediction of elements from different layers of the CNN to enhance
bounding boxes and class probabilities together. This the model’s proficiency in managing objects of various
methodology has paved the way for object detection sizes.
in real-time in various applications, making YOLO a 4. Darknet (Framework): Darknet is a custom neural
foundational algorithm within the domain of computer network framework developed for YOLO. It provides
vision. Other than methodologies The YOLO algorithm the architecture configurations, layer implementations,
involves several key components and steps that and training pipeline for YOLO models.
collectively enable real-time object detection. While
5. Data Augmentation (Training): YOLO uses data
YOLO itself is a single-shot object detection algorithm,
augmentation techniques during training to introduce
it utilizes various techniques and algorithms to achieve
variability enhancing the model’s robustness and
its functionality.
generalization in the training set of data.
Here’s an overview of the algorithms and techniques used in
YOLO:
9. Conclusion
1. Loss Functions: YOLO employs several loss functions
during training to guide the model’s learning process: These algorithms and techniques collectively form the
(a) Objectness Loss: Measures the inequalities foundation of the YOLO algorithm, allowing it to achieve
between the predicted and genuine objectness real-time object recognition through one pass through the
scores. network using a direct bounding box and class probability
(b) Classification Loss: calculates the variation in prediction. Using these YOLO versions there is a future scope
the actual class probabilities compared to the in the accuracy improvement, speed optimization, handling
predictions. challenging scenarios, domain specific adaption, multi model
integration, transfer learning and few-short learning, ethical
and fair AI.

Fig. 40.11 YOLO architecture



References
1. Joseph Redmon and Ali Farhadi, "YOLO9000: Better, Faster, Stronger," CoRR abs/1612.08242, 2016.
2. Joseph Redmon and Ali Farhadi, "YOLOv3: An Incremental Improvement," CoRR abs/1804.02767, 2018.
3. Alexey Bochkovskiy, Chien-Yao Wang and Hong-Yuan Mark Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," CoRR abs/2004.10934, 2020.
4. Li, Chuyi, et al., "YOLOv6: A single-stage object detection framework for industrial applications," arXiv preprint arXiv:2209.02976, 2022.
5. Wang, Chien-Yao, Alexey Bochkovskiy, and Hong-Yuan Mark Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023.
6. Dr. Ramya, Nikhil, Pavan R, Prabhu Nagappa Chinagudi and Vishal, "Real-Time Object Detection and Tracking," Volume 9, Issue 8, 2021.
7. Abdul Vahab, Maruti S. Naik, Prasanna G. Raikar and Prasad S. R., "Object Detection and Its Implementations and Uses," International Research Journal of Engineering and Technology (IRJET), Volume 06, Issue 04, Apr 2019.
8. Licheng Jiao, Fan Zhang, Fang Liu, Shuyuan Yang, Lingling Li, Zhixi Feng, and Rong Qu, "A Survey of Deep Learning-based Object Detection," arXiv:1907.09408v2 [cs.CV], 10 Oct 2019.
9. N. Hassan and C. S. Woo, "Machine Learning Application in Water Quality Using Satellite Data," Earth Environ. Sci. 842, 2018.
10. Mukesh Tiwari and Dr. Rakesh Singhai, "A Review of Detection and Tracking of Object from Image and Video Sequences," International Journal of Research and Management, Volume 13, Number 5, 2017.
11. Chandrajit, Girisha, and Vasudev, "Multiple Objects Tracking in Surveillance Video Using Color and Hu Moments," Signal & Image Processing: An International Journal (SIPIJ), Vol. 7, No. 3, June 2016.
12. Jamal Raiyn, "Detection of Objects in Motion—A Survey of Video Surveillance," Advances in Internet of Things, (2013) 3, 73–78, http://dx.doi.org/10.4236/ait.2013.34010
13. Zdenek Kalal, Krystian Mikolajczyk, and Jiri Matas, "Tracking-Learning-Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 6, no. 1, January 2010.
14. Weiming Hu, Tieniu Tan, Liang Wang, and Steve Maybank, "A Survey on Visual Surveillance of Object Motion and Behaviors," IEEE Transactions on Systems, Man, and Cybernetics, Vol. 34, No. 3, August 2004.
15. Lorenzo Favalli, Alessandro Mecocci, and Fulvio Moschetti, "Object Tracking for Retrieval Applications in MPEG-2," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 10, No. 3, April 2000.
16. Weiming Hu, Tieniu Tan, Liang Wang, and Steve Maybank, "A Survey on Visual Surveillance of Object Motion and Behaviors," IEEE Transactions on Systems, Man, and Cybernetics, Vol. 34, No. 3, August 2004.
17. Z. Akata, F. Perronnin, Z. Harchaoui, and C. Schmid, "Label-Embedding for Attribute-Based Classification," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 2013, pages 819–826.
18. Z. Akata, S. Reed, D. Walter, H. Lee, and B. Schiele, "Evaluation of Output Embeddings for Fine-Grained Image Classification," In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pages 2927–2936.
19. Reddy Navya, Ramisetty Upendra, "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms," Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3, 2023.
20. K. Barnard, P. Duygulu, D. Forsyth, N. De Freitas, D. M. Blei, and M. I. Jordan, "Matching Words and Pictures," The Journal of Machine Learning Research, 3, 2003, pages 1107–1135.
21. J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio, "Theano: A CPU and GPU Math Expression Compiler," In Proceedings of the Python for Scientific Computing Conference (SciPy), volume 4, Austin, TX, 2010, page 3.
22. C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender, "Learning to Rank Using Gradient Descent," In Proceedings of the 22nd International Conference on Machine Learning, ACM, 2005, pages 89–96.
23. Prakash, Immidisetty V., Valiki Vijayabhasker, and Srinivas Gadari, "Multiplexers, Demultiplexers, Current Progress and Algorithms of Wavelength Assignment in WDM Network," Research Review International Journal of Multidisciplinary, 2019/5, pages 2615–2619.
Note: Source for all the figures in this chapter: https://www.researchgate.net/figure/Timeline-of-You-Only-Look-Once-YOLO-variants_fig1_370153499

41. Prediction of Endangered Species Using Artificial Intelligence

Yallamati Prakasa Rao1


Assistant Professor, Computer Science and Engineering, KL University
M. V. V. S. Subrahmanyam2
Assistant Professor, Computer Science and Engineering, SRKR Engineering College
T. Venkata Ramana3
Professor, Computer Science and Engineering, Jain University

Abstract: The loss of many plant and animal species as a result of various climatic shifts is one of the world's most pressing challenges today. The extinction of many species is due to a complex interplay of many different causes: poaching, climate change, and the loss of natural habitats are all contributing factors. American pikas, Adélie penguins, koalas, and ringed seals are among the many species that are at risk from the effects of climate change. Predicting the likelihood of extinction is crucial for preserving ecological harmony. Researchers can accurately observe animals in their native environment, and there has been a dramatic increase in the usage of automated hidden cameras to monitor wildlife more efficiently and without operator intervention. These cameras are handy and reliable for collecting vast volumes of data about animals; however, manual data collection and analysis from camera traps is a difficult and time-consuming process. Using the Random Forest and Convolutional Neural Network algorithms, our goal is to create a model that can forecast which species are in danger of extinction.
Keywords: Species, Disappearance, Natural habitat, Intervention, Endangered

1. Introduction

Not only do humans call Earth home, but tens of thousands of other species do as well. Even if numerous species compete with one another, the extinction of even a single one can have far-reaching consequences. Yet developments in technology and the economy, pollution of the air and water, and shifts in population dynamics all pose significant risks to the world's biodiversity. An alarming number of species are becoming endangered: the International Union for Conservation of Nature has noted an increase in the number of species classified as threatened, from 1,102 in 1996 to 14,360 in 2019. There are 1,197 plant species and 13,868 animal species among them.
Currently, deep neural networks can train models to verify the presence of extinct species, analyze their statistics, and even predict which species will be considered endangered in the future. Data collection on species traits and the environmental variables that affect their survival is part of the research; position in space, kind of habitat, and other variables may all play a role. Reputable conservation groups and databases, such as the IUCN Red List and the World Wildlife Fund (WWF), provide the information. Next, we prepare the data for analysis through preprocessing, which involves cleaning and structuring the data. To find the most relevant characteristics for species endangerment prediction, we apply feature selection approaches. After that, we construct a prediction model that takes these features into account using the Random Forest algorithm.

1prakashlnr@gmail.com, yprakasarao@kluniversity.in; 2subramanyam.mavuri@gmail.com; 3Venkataramana.t@gmail.com

2. Literature Survey

[1] Researchers in 2018 used a dataset of 1,600 images taken in the real world. The researchers used the SLIC segmentation VGGNet method in this study; the architecture they propose consists of three interconnected layers. October 2019 saw the proposal of a study [7] that compared deep learning and machine learning approaches to the problem of animal species identification using camera trap images; they focused on SVM, RF, deep learning, Inception v3, and other machine learning algorithms. Research [2] highlights the importance of monitoring animals in their native environments to aid in decision-making on conservation efforts. Camera traps or covert cameras can be useful tools for this purpose; however, it can be difficult and time-consuming to edit all of these films and photographs, which is why the authors propose a mechanism to keep tabs on animals without human intervention, using a convolutional neural network. Researchers [6] created a novel approach, published in 2018, for detecting animals and avoiding collisions using object identification technology; the suggested method for animal detection utilizes neural network architectures such as SSD and Faster R-CNN. To monitor global ecosystems and animal populations, [20] employs two different datasets, since analysing camera trap images takes a lot of time and money. One can get a general idea of the literature review from Table 41.1.

3. Proposed Work and Procedure of Model Design

3.1 Existing Solution
The system "trains" by analyzing historical data and identifying patterns that could indicate poacher behavior before an assault. To keep up with the latest happenings in underground markets, AI can swiftly scour the web for relevant information. One study sought to assess the public's attention to the various mammals and birds reported, based on an examination of Twitter text messages [2]. Another study catalogues bird species in order to ensure their survival: the authors developed an automated, robust deep neural learning method for bird species identification using image files, which reduced the need for human intervention and saved time. Included in this compilation are more than 11,788 images, representing 200 distinct species. A pre-trained RCNN was used to extract the ROI from the picture before putting the bird's ROI into a neural network that was trained using a transfer learning approach and fine-tuned with the provided dataset.

3.2 Proposed Solution
We offer a hybrid approach to endangered species prediction using Random Forest and convolutional neural networks (CNNs). The Random Forest algorithm categorizes endangered species based on factors such as habitat, nutrition, behavior, and conservation status, while the CNN approach is used for species detection in photos. Scientists use information about endangered species to teach the Random Forest algorithm their traits, and the system predicts how endangered a species is by using those traits; one way the trained model might help with conservation efforts is by estimating the conservation status of a species based on its attributes. A dataset containing images of endangered animals is used to train the CNN algorithm, which learns to recognize species in images; the trained model can facilitate species monitoring by detecting species in images. The integration of these two systems can make it easier to predict and track endangered species. By locating species in danger of extinction and acting accordingly, this might help conservation efforts. The project consists of the following steps.

Table 41.1 Comparison of various existing works on different datasets

Author | Dataset | Algorithm | Objective
[1] Mauro dos Santos de Arruda | ImageNet | SLIC and VGGNet | Combines RGB and thermal images to accurately identify animals even if images are taken in rough conditions.
[5] Mohammad Sadegh Norouzzadeh | Snapshot Serengeti | AlexNet, NiN, VGG, GoogLeNet and ResNet | VGGNet has the highest accuracy for identification, counting, and description of wild animals.
[7] Rajasekaran Thangarasu | KTH, which has 19 different species | Inception v3 | Inception v3 has the highest accuracy in animal classification.
[9] Alexander Loos | Snapshot Serengeti | YOLO and SSD | For animal detection, they combined YOLO and SSD to achieve higher precision.
[2] Hung Nguyen | Wildlife Spotter Project | CNN | Classified 3 common species from the set of animal images taken in South-central Victoria, Australia.
[4] Sazida B. Islam | Camera-trapped images from Texas | CNN | Detected snakes, lizards, and frogs from camera trap images collected from Bastrop County, Texas.
[10] Ashvini V. Sayagavi | UAV images, Kuzikus Wildlife Reserve park | YOLO | Animals captured in UAV images are tracked using RFID and classified using YOLO.

3.3 Data Selection and Loading

The pandas library is used to load the data from CSV files. The data includes details about various species and their characteristics, and it is stored in pandas data frames once the CSV files are read using the read_csv() method. After the data is loaded, the scikit-learn library's train_test_split() method is used to divide it into training and testing sets: given the features and the target variable, this function splits the data into a training set and a testing set. Our datasets are 'Observations' and 'Species'. A minimal loading sketch follows.

Fig. 41.1 Screenshot for various observations
Fig. 41.2 Screenshot for various species
Fig. 41.3 Screenshot for combinations of various observations and species
3.4 Data Preprocessing

In this project, the preparation of the data is carried out in five steps.
1. Data Cleaning: The data we collect could include noisy, duplicate, or missing values. The mean value of the corresponding feature is used to impute missing values.
2. Data Normalization: The scikit-learn library's StandardScaler is used to normalize the data. Using the fit_transform() method, which determines the data's mean and standard deviation and scales it accordingly, standardization is applied to the training data.
3. Data Splitting: Next, training and testing sets of the preprocessed data are created. Training the machine learning model involves using the training set, while evaluating its performance requires the testing data.
4. Data Reshaping: We reshape the data to match the input format required by the CNN model.
5. Encoding the Target Variable: One-hot encoding is used to encode the target variable, the species labels. This converts the categorical labels into a numerical representation that machine learning algorithms can employ, improving their suitability for mathematical computation. Following encoding, an integer in the range of 0 to N-1 represents each category, where N is the number of distinct categories in the variable. (A sketch of these steps follows below.)

3.5 Label and Feature Preparation

We use the Random Forest technique to extract the labels. We derive the features using the spectral and spatial information from the satellite photos, while extracting the label as the species identification number from the file. In order to extract the spectral information from the satellite image, we use its bands, and in order to recover the spatial information, we use the pixels' shape and size. We use this labelled and feature-extracted dataset to train the Random Forest algorithm, which forecasts the species for new samples. In order to facilitate supervised learning techniques, we generate two data frames, X and Y. The predictors in 'X' consist of the variables 'scientific name,' 'park name,' 'observations,' 'common name,' and 'conservation status'; 'Y' contains the response variable, 'category'.

4. Model Training and Validation

The train_test_split function divides the data into an 80:20 ratio for training and testing. The fit function trains the model by passing it the training data; it accepts validation data, batch size, and the number of epochs as inputs. The optimizer calculates gradients during backpropagation and updates the model during training. The evaluate function tests the trained model on the testing data and returns two measures of the model's performance: loss and accuracy.

4.1 Prediction

We use the trained CNN and Random Forest models to make the predictions. Making predictions on the test dataset is the next step after training the models: we run the models on the test data and then compare their predictions to the real labels to see how well they did.
Fig. 41.4 Screenshot of proposed work model

4.2 Deep Learning Algorithms

Convolutional Neural Network (CNN)
Using the observation's scientific name, park name, common name, and conservation status, a model is built using convolutional neural networks to predict the observation's category. Image identification and other data with spatial correlations are typical applications of convolutional neural networks (CNNs), a kind of neural network. The input data is reshaped into a 3D tensor of shape (n_samples, n_timesteps, n_features), where n_samples denotes the number of samples, n_timesteps the number of time steps, and n_features the number of input features. Next, we fed the 3D tensor into a CNN model with a 1D convolutional layer, a max pooling layer, a dense layer, and a single output unit; the model was trained using this 3D tensor. A hedged sketch of such a model follows.
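The Keras sketch below mirrors the structure just described (1D convolution, max pooling, dense layer, single output unit, MAE loss as in Section 4.4). The filter count, kernel size, and input shape are illustrative assumptions; only the overall layer sequence follows the text.

# Hedged Keras sketch of the 1D-CNN described above; layer sizes are
# illustrative assumptions, only the overall structure follows the text.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

n_timesteps, n_features = 5, 1      # assumed input shape (see Section 3.4)

model = Sequential([
    Conv1D(64, kernel_size=2, activation="relu",
           input_shape=(n_timesteps, n_features)),   # 1D convolutional layer
    MaxPooling1D(pool_size=2),                       # max pooling layer
    Flatten(),
    Dense(32, activation="relu"),                    # dense layer
    Dense(1),                                        # single output unit
])
model.compile(optimizer="adam", loss="mae")

# Training mirrors Section 4, e.g.:
# model.fit(X_train, y_train, epochs=50, batch_size=16,
#           validation_data=(X_test, y_test))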
4.3 Random Forest Algorithm

Species are categorised according to their conservation status using the Random Forest algorithm. Scientific names, park names, observations, common names, and conservation status were among the features included in the dataset. The fit method was used to teach the algorithm to anticipate the conservation status of a species from the other characteristics, and it was applied to a subset of the data. By setting the n_estimators argument to 100, we formed the random forest by combining 100 decision trees. The predict method generated predictions on the test dataset from the fitted model, and the model's accuracy was evaluated using the accuracy_score function from the scikit-learn library. The classification_report() method can generate a report that details the precision, recall, f1-score, and support for every class in the test data.
Additionally, the confusion matrix is used to assess the performance of the model. The columns in this table represent the predicted values, which are compared against the actual values to assess the performance of a classification model. The diagonal elements represent the number of accurate predictions, while the off-diagonal elements represent the number of inaccurate predictions. (A minimal sketch of this step follows.)

Fig. 41.5 Confusion matrix
Fig. 41.6 Values obtained for random forest
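The scikit-learn sketch below follows the settings named above (100 trees, accuracy_score, classification_report, confusion matrix). It assumes X_train, X_test, y_train, and y_test are the splits from Section 3.3 with categorical columns already encoded as in Section 3.4.

# Sketch of the Random Forest classification step using scikit-learn,
# following the settings named above (100 trees). Assumes encoded features.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)                 # train on the 80% split

y_pred = rf.predict(X_test)              # predict conservation categories
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))   # precision, recall, f1, support
print(confusion_matrix(y_test, y_pred))        # diagonal = correct predictions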
4.4 Loss Calculation

The loss function 'mae', which stands for Mean Absolute Error, measures the absolute difference between the expected and observed values; it is frequently applied to regression problems. The optimizer 'adam' is an algorithm used for gradient-based optimization. Since the model is predicting a continuous variable, 'accuracy' is not an appropriate metric; instead, we can use the Mean Absolute Error to evaluate the model's performance on the test set.

loss, accuracy = model.evaluate(X_test, y_test)
161/161 - 0s - loss: 0.8562 - acc: 0.862 - 301ms/epoch - 2ms/step

Fig. 41.7 Model accuracy for training and testing

5. System Architecture

Species Identification Model: Create a model that uses recurrent neural networks (RNNs) or convolutional neural networks (CNNs) to identify various species from pictures or audio recordings.
Habitat Analysis Model: Create a model to analyze and predict suitable habitats for various species using environmental variables and spatial data.
Population Trend Prediction: Develop models to predict population trends of species based on historical data, taking into account factors like climate change and habitat loss.

Fig. 41.8 Work flow of proposed model

6. Conclusion

Using machine learning techniques to predict when species may become endangered was the primary objective of the study. After collecting the data from various sources, it was preprocessed to extract relevant features. Random Forests and convolutional neural networks were the methods used for categorization. We trained the data using the CNN and Random Forest techniques, and the model achieved an accuracy rate of 86.2%. This experiment demonstrated the feasibility of using machine learning algorithms for species extinction prediction: combining convolutional neural networks with Random Forests allows for accurate species classification based on their properties. Future research is required to improve the models' predictive abilities by increasing their accuracy and including additional features. Possible additions to the research include the ability to forecast behaviour and real-time monitoring of endangered species.

References
1. Mauro dos Santos de Arruda, Gabriel Spadon, Wesley Nunes Goncalves, and Bruno Brandoli Machado, "Recognition of Endangered Pantanal Animal Species using Deep Learning Methods," IJCNN, 2018.
2. Hung Nguyen, Sarah J. Maclagan, Tu Dinh Nguyen, Thin Nguyen, Paul Flemons, Kylie Andrews, Euan G. Ritchie, and Dinh Phung, "Animal Recognition and Identification with Deep Convolutional Neural Networks for Automated Wildlife Monitoring," Deakin University, Geelong, Australia, 2017.
3. N. Banupriya, S. Saraya, Rashi Swaminathan, Sachinthaa Harikumar, and Sukhita Palanisamy, "Animal Detection using Deep Learning Algorithm," Journal of Critical Reviews, 2019.
4. Sazida B. Islam and Damian Valles, "Identification of Wild Species in Texas from Camera-trap Images using Deep Neural Network for Conservation Monitoring," CCWC, 2020.
5. Mohammad Sadegh Norouzzadeh, Anh Nguyen, Margaret Kosmala, Alexandra Swanson, Meredith S. Palmer, Craig Packer, and Jeff Clune, "Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning," PNAS, 2018.
6. Atri Saxena, Deepak Kumar Gupta, and Samayveer Singh, "An Animal Detection and Collision Avoidance System Using Deep Learning," SpringerLink, 2020.
7. Rajasekaran Thangarasu, Vishnu Kumar Kaliappan, Raguvaran Surendran, Kandasamy Sellamuthu, and Jayasheelan Palanisamy, "Recognition of Animal Species on Camera Trap Images Using Machine Learning and Deep Learning Models," International Journal of Scientific & Technology Research, 2019.
8. Zhongqi Miao, Kaitlyn M. Gaynor, Jiayun Wang, Ziwei Liu, Oliver Muellerklein, Mohammad Sadegh Norouzzadeh, Alex McInturff, Rauri C. K. Bowie, Ran Nathan, Stella X. Yu, Wayne M. Getz, et al., "Insights and approaches using deep learning to classify wildlife," Scientific Reports, 2019.
9. Alexander Loos, Christian Weigel, and Mona Koehler, "Towards Automatic Detection of Animals in Camera-Trap Images," European Signal Processing Conference (EUSIPCO), 2018.
Note: All the figures and table in this chapter were designed by the author.
Species in Texas from Camera-trap Images using Deep Neural Note: All the figures and table in this chapter were designed by the
Network for Conservation Monitoring,” CCWC, 2020. author.

42. Early Detection of Alzheimer's Disease through Tau-PET Image Analysis Using CNN

M. Janakidevi1
Assistant Professor, Department of Computer Science and Engineering,
Sagi Ramakrishnam Raju Engineering College
Ramalinga Swamy Cheruku2
Assistant Professor, Department of Computer Science and Engineering,
NIT Warangal
Ch. Rami Naidu
Assistant Professor, Department of Computer Science and Engineering,
Sagi Ramakrishnam Raju Engineering College

Abstract: Image processing faces challenges such as noise, occlusion, and blocked elements. AI systems employ effective algorithms but still face issues like darkness, rain, snow, smoke, and reflections. The fusion approach focuses on picture enhancement, particularly for medical applications such as Alzheimer's disease. One of the most prevalent neurodegenerative disorders, Alzheimer's disease causes a progressive loss of memory and independence. Amyloid plaques and tau tangles, two forms of neurotoxic protein buildup in the brain, are its defining features. Because the pathology develops silently over decades, it is critical to diagnose patients as early in the illness process as possible in order to take appropriate action. This study used imaging to identify tau protein levels in the brain as a predictor of the cognitive decline associated with early Alzheimer's disease. The study investigates the effectiveness of various imaging techniques in identifying individual variations associated with Alzheimer's disease using convolutional neural networks: the method detects the tau protein in images and forecasts cognitive decline in order to facilitate an early diagnosis. In graph accuracy measures, the convolutional neural network outperforms the RGB and DCT baselines.
Keywords: Alzheimer, Convolutional neural networks, Early diagnosis, Neurodegenerative diseases

1. Introduction

A common neurodegenerative condition called Alzheimer's disease is marked by a progressive loss of memory and independence. Amyloid plaques and tau tangles, two forms of neurotoxic protein buildup in the brain, are its defining features [1]. Because the pathology develops silently over decades, it is critical to get a diagnosis as early in the illness process as possible in order to take appropriate action [2]. Studies show that tau PET, a novel imaging method for observing the tau protein, may predict patients' cognitive deterioration significantly more accurately than standard imaging methods [3]. The prompt incorporation of tau PET into clinical practice will provide patients with individualized, timely treatment. PET is an essential diagnostic tool for Alzheimer's disease because it uses low-level radioactive tracers to visualize brain degenerative processes. Even though accurate tracers for glucose and amyloid metabolism exist, these methods fall short of fully comprehending the intricate nature of Alzheimer's disease [4].

1mjd@srkrec.ac.in, 2rmlswamy@nitw.ac.in, 3crn@srkrec.ac.in

Beyond amyloid PET, tau PET is a helpful diagnostic tool that helps diagnose Alzheimer's disease more precisely and enhances our capacity to stage the illness. Research has demonstrated that, in contrast to β-amyloid, postmortem NFT load in cortical regions is connected with clinical symptoms [5]. Tau PET can be used to stage patients and determine which ones are in the preclinical stages of the illness because tau pathology endures during these stages [6]. By using tau PET to place participants at various points along the Alzheimer's disease continuum, it was possible to show that, even in the absence of prior knowledge about the locations, in vivo tau PET can accurately replicate the Braak spreading pattern of NFT pathology. Nevertheless, the absence of postmortem neuropathological validation places limitations on both studies [7]. When a person develops Alzheimer's disease in the preclinical stages, tau PET has been shown to be a reliable indicator of cognitive deterioration. Researchers discovered that in all patient groups, including Aβ-positive cognitively normal individuals, tracer absorption in the temporal cortex by Flortaucipir and the second-generation tracer RO-948 predicted cognitive impairment; compared to volumetric MRI and Aβ PET, its predictive performance was better [8]. It has been discovered that preclinical Alzheimer's disease can be detected more effectively in brain regions with higher Flortaucipir absorption. When it comes to identifying Alzheimer's disease-like pathology and permitting ante-mortem biological staging, tau PET is similar to amyloid PET [9].

Fig. 42.1 Three samples of Tau-PET images

2. Proposed Work

Alzheimer's disease, a neurodegenerative disorder causing loss of memory and autonomy, is a significant cause of cognitive decline, and early diagnosis is critical [10]. This study employs imaging to identify tau protein presence and predict cognitive impairment in Alzheimer's disease, utilizing convolutional neural networks for successful detection. This work proposes two objectives, as follows:
The Comparison of Various Imaging Techniques: To find the imaging technique that accurately predicts future cognitive deterioration from Alzheimer's disease, using flortaucipir, a radiotracer that binds to the tau protein, in the preclinical phases of Alzheimer's disease [11].
Identifying Individual Differences: Amyloid plaques and tau are linked to clinical symptoms, with tau's absence or presence determining a patient's condition. Imaging techniques for tau are challenging due to its complex structure. Recent drugs targeting amyloid and tau proteins show promising results [12]. Understanding tau distribution and its impact on symptoms is crucial for better Alzheimer's disease management. Incorporating tau PET into clinical evaluations can help assess individual prognosis and select the most appropriate therapeutic strategy.

3. Alzheimer Disease Detection by Convolutional Neural Network

Even if the size of the disease object database has a big impact on how accurately disease objects are identified, the quality of the classification technique is crucial. Deep learning is a part of machine learning [14]. Because the properties are automatically extracted, deep learning works better than traditional machine learning techniques. Moreover, when using deep learning for "end-to-end learning," only the raw picture data and the task need to be supplied to the network. Convolutional neural networks are typically employed in Alzheimer's disease studies to enhance visual aspects [16].

Fig. 42.2 Alzheimer disease detection using CNN

CNN Algorithm: An array of pixel values is taken from the Alzheimer's disease image for the purpose of feature extraction.
1. To extract an image feature map related to Alzheimer's disease, use a convolutional neural network.
(a) ReLU convolution on the Alzheimer's image:
(i) Select a 4x4 kernel whose depth corresponds to the input array of the Alzheimer image.
(ii) Convolutional processing is employed to obtain the disease features of the Alzheimer picture.
(b) Pooling (max pooling) of the Alzheimer's disease picture:
(i) Utilizing the dimensionality reduction procedure, shrink the feature map's spatial size using a 2x2 pooling window over the Alzheimer's picture.

2. Extraction of low-level characteristics from the Alzheimer's disease image: Follow the previous stages up to the fourth layer, changing the channel size to 16, 32, 64, or 128.

Classification:
1. A feed-forward neural network with back propagation receives the smoothed output at the end of each training iteration for Alzheimer disease detection.
2. A trained model is used to classify images such as illness object images by detecting their dominating properties using the Alzheimer's SoftMax classification technique.
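As a quick illustration, the following is a minimal sketch (in Keras) of the kind of network described above: 4x4 convolution kernels with ReLU, 2x2 max pooling, channel sizes growing from 16 to 128, and a SoftMax classifier on top. The input shape, layer count, and training settings here are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (not the authors' exact network): a small CNN with 4x4
# convolutions + ReLU, 2x2 max pooling, channels 16/32/64/128, and SoftMax.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_alzheimer_cnn(input_shape=(128, 128, 1), num_classes=2):
    model = models.Sequential([layers.Input(shape=input_shape)])
    # Four convolution blocks; the channel size grows 16 -> 32 -> 64 -> 128.
    for channels in (16, 32, 64, 128):
        model.add(layers.Conv2D(channels, kernel_size=4, padding="same",
                                activation="relu"))  # feature extraction + ReLU
        model.add(layers.MaxPooling2D(pool_size=2))  # 2x2 spatial down-sampling
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation="relu"))   # feed-forward classifier head
    model.add(layers.Dense(num_classes, activation="softmax"))  # SoftMax output
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_alzheimer_cnn()
model.summary()
```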
CNN is a potent neural network method for classifying images and identifying Alzheimer's disease [15]. Its layers include pooling, convolution, activation, and classifier layers. Activation functions such as sigmoid, tanh, and ReLU are employed in conjunction with the feature maps extracted by the convolution layer. The class with the highest probability is chosen by the classifier layer. CNNs must handle large datasets and transfer learning models, which may be adapted for different tasks. CNN is used by the Alzheimer disease object algorithm to classify images [16].
4. Discrete Cosine Transform (DCT)

The DCT has been used in numerous research studies on illness diagnosis as a feature extraction step. Whether used comprehensively or based only on local appearance, spatial information has historically been largely disregarded in the application of discrete cosine transforms (DCTs). Specific neural network types are fed with local DCT coefficients, or mimic them statistically during the classification stage. Since its launch, the DCT has grown in popularity and been proposed with a number of modifications [13].

$$y(k) = w(k)\sum_{n=1}^{N} x(n)\cos\!\left(\frac{\pi(2n-1)(k-1)}{2N}\right),\quad k = 1,\ldots,N \qquad (1)$$

where

$$w(k) = \begin{cases} \dfrac{1}{\sqrt{N}}, & k = 1 \\[4pt] \sqrt{\dfrac{2}{N}}, & 2 \le k \le N \end{cases} \qquad (2)$$

The two matrices x and y have the same length, indicated by N, and the same size. The DCT transforms the columns of the x matrix. Since vectors go from 1 to N rather than 0 to N-1, the series is indexed from n = 1 and k = 1, as opposed to the typical n = 0 and k = 0.

5. Experimental Result

The study compares the RGB, DCT, and CNN models for accuracy in early Alzheimer disease diagnosis, focusing on CNN's performance on massive image datasets and its accuracy in predicting true positives.

Accuracy: A two-dimensional classification test's accuracy is a statistical indicator of its capacity to recognize or rule out a condition based on a comparison of pre- and post-test probability estimates.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (3)$$

where TP = true positive; FP = false positive; TN = true negative; FN = false negative.

Despite an 82% accuracy rate in the early identification of Alzheimer's disease, the RGB method fails to correctly identify 18 out of 100 tumors in class-imbalanced data with significant positive/negative label fluctuations:

Accuracy = (80 + 2) / (80 + 2 + 9 + 9) = 0.82

The DCT method, with a 91% accuracy rate in early Alzheimer's detection, correctly identifies only 91 tumors out of 100, leaving 9 undiagnosed, indicating its limitations in class-imbalanced data with significant positive/negative label differences:

Accuracy = (90 + 1) / (90 + 1 + 1 + 8) = 0.91

With a 98% accuracy rate in the early identification of Alzheimer's disease, the CNN approach correctly identifies 98 tumors out of 100 in class-imbalanced data with considerable positive/negative label variances, leaving only 2 unidentified:

Accuracy = (96 + 2) / (96 + 2 + 1 + 1) = 0.98

Table 42.1 Comparison of the performance of the RGB, DCT, and CNN algorithms

Comparative Methods | Accuracy
RGB | 0.82
DCT | 0.91
CNN | 0.98

The convolutional neural network performs best among the accuracy measurements for RGB, DCT, and CNN in the graphs. Although measurement is essential to comprehending the outside world, it also introduces error, or ambiguity. Accuracy is a crucial factor to take into account when taking measurements, since it indicates how closely a measurement resembles an established value. The accuracy of a collection of measurements indicates their proximity to the true value and is a measure of observational error.
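The three accuracy figures of this section follow directly from Eq. (3) and the confusion counts stated above; a small script to verify them:

```python
# Reproducing the accuracies of Section 5 from the stated confusion counts,
# using Eq. (3): accuracy = (TP + TN) / (TP + TN + FP + FN).
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

counts = {
    "RGB": (80, 2, 9, 9),   # 18 of 100 cases misclassified
    "DCT": (90, 1, 1, 8),   # 9 of 100 cases misclassified
    "CNN": (96, 2, 1, 1),   # 2 of 100 cases misclassified
}
for method, (tp, tn, fp, fn) in counts.items():
    print(f"{method}: {accuracy(tp, tn, fp, fn):.2f}")
# Prints 0.82, 0.91, 0.98, matching Table 42.1.
```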

Fig. 42.3 The bar chart compares the performance of RGB, DCT, and CNN

6. Conclusion

The study uses image processing to enhance medical applications, particularly for Alzheimer's disease. It uses convolutional neural networks to detect tau protein levels in the brain and predict cognitive decline. In the graph accuracy measures, the method reaches 98% with CNN, against 82% for RGB and 91% for DCT, allowing early diagnosis and appropriate action. The study highlights the importance of early diagnosis in limiting neurodegenerative disorders like Alzheimer's.

References
1. Albert, M. S. et al.: The diagnosis of mild cognitive impairment due to Alzheimer's disease: Recommendations from the National Institute on Aging-Alzheimer's Association workgroups on diagnostic guidelines for Alzheimer's disease, Alzheimer's Dementia, 7, 270-279, https://doi.org/10.1016/j.jalz.2011.03.008, 2011.
2. Bera G., Migliaccio R., Michelin T., et al.: Parietal involvement in the semantic variant of primary progressive aphasia with Alzheimer's disease cerebrospinal fluid profile, Journal of Alzheimer's Disease, 66(1):271-280, doi: 10.3233/jad-180087, 2018.
3. Erickson B. J.: Magician's corner: how to start learning about deep learning, Radiology: Artificial Intelligence, 1:e190072, doi: 10.1148/ryai.2019190072, 2019.
4. Jainesh Rathod, Vishal Waghmode, Aniruddh Sodha, and Prasenjit Bhavathankar: Diagnosis of skin diseases using Convolutional Neural Networks, IEEE Xplore, DOI: 10.1109/ICECA.2018.8474593, 2018.
5. Jo T., Nho K., Saykin A. J.: Deep learning in Alzheimer's disease: diagnostic classification and prognostic prediction using neuroimaging data, Frontiers in Aging Neuroscience, 11:220, 2019.
6. Jong Bin Bae, Subin Lee, Wonmo Jung, Sejin Park, Weonjin Kim, Hyunwoo Oh, Ji Won Han, Grace Eun Kim, Jun Sung Kim, Jae Hyoung Kim & Ki Woong Kim: Identification of Alzheimer's disease using a convolutional neural network model based on T1-weighted magnetic resonance imaging, Scientific Reports, volume 10, Article number: 22252, 2020.
7. Li, H.: Deep learning model for early prediction of Alzheimer's disease dementia based on hippocampus magnetic resonance imaging data, Alzheimer's Dementia, 15, 1059-1070, https://doi.org/10.1016/j.jalz.2019.02.007, 2019.
8. Lacor, P. N.: Synaptic targeting by Alzheimer's-related amyloid β oligomers, Journal of Neuroscience, 24, 10191-10200, 2004.
9. Mofrad S. A., Lundervold A. J., Vik A., Lundervold A. S.: Cognitive and MRI trajectories for prediction of Alzheimer's disease, Scientific Reports, 11(1), doi: 10.1038/s41598-020-78095-7, 2021.
10. Marwa Zaabi, Nadia Smaoui, Houda Derbel, Walid Hariri: Alzheimer's disease detection using convolutional neural networks and transfer learning based methods, IEEE Xplore, DOI: 10.1109/SSD49366.2020.9364155, 20-23 July 2020.
11. Morteza Amini, Mir Mohsen Pedram, AliReza Moradi, Mahdieh Jamshidi, Mahshad Ouchani: GC-CNNnet: Diagnosis of Alzheimer's Disease with PET Images Using Genetic and Convolutional Neural Network, Computational Intelligence and Neuroscience, 2022:7413081, published online 2022 Aug 9, doi: 10.1155/2022/7413081, 2022.
12. R. Mufidah, I. Wasito, N. Hanifah and M. Faturrahman: Structural MRI classification for Alzheimer's disease detection using deep belief network, vol. 17, pp. 37-42, 2017.
13. R. N. V. Jagan Mohan: Fuzzy Cluster Index: An Angle Oriented Face Recognition Using RSA, Mathematical Sciences International Research Journal, ISSN: 2278-8697, ISBN: 978-93-81583-57-9, Volume 1, Number 3, Page No: 1058-1067, Sep 13th-14th, 2012.
14. Taeho Jo, Kwangsik Nho, Shannon L. Risacher, Andrew J. Saykin: Deep learning detection of informative features in tau PET for Alzheimer's disease classification, BMC Bioinformatics, 21(Suppl 21):496, doi: 10.1186/s12859-020-03848-0, 2020.
15. Y. N. Fu'adah: Convolutional Neural Network (CNN) for Automatic Skin, IOPscience, https://iopscience.iop.org, 2020.
16. Shen L., Kim S., Risacher S. L., et al.: Whole genome association study of brain-wide imaging phenotypes for identifying quantitative trait loci in MCI and AD: a study of the ADNI cohort, NeuroImage, 53(3):1051-1063, DOI: 10.1016/j.neuroimage.2010.01.042, 2010.

Note: All the figures and tables in this chapter were designed by the authors.

Computational Analysis and Identification


of Specific MMP Targets in Tumours at
Multiple Stages
43

G. Nirmala, Deepak Nedunuri*


Associate Professor, Department of CSE,
Sir C R Reddy College of Engineering, Eluru, India
K. Satyanarayana
Associate Professor, Department of IT,
Sir C R Reddy College of Engineering, Eluru, India
Ch. Madhava Rao
Associate Professor, Department of CSA,
K L E F, Vaddeswaram, India
Y. Butchi Raju
Professor, Department of EEE,
Sir C R Reddy College of Engineering, Eluru, India

Abstract: There is growing consensus that matrix metalloproteinase (MMP) inhibitors, both naturally occurring and man-made, can be effective cytostatic and anti-angiogenic therapeutic agents in the fight against cancer. Because of their significance in cancer, many inhibitors are currently undergoing clinical studies. The analysis produces computational dock scores, which are then compared to experimental values and used to generate further graphs, charts, and observations. LOO (leave-one-out) is the basis for the model's cross-validation. We will use the r2 value (correlation coefficient) and the RMSE (root mean square error) to assess the QSAR model's quality.

Keywords: MMP inhibitors, PRESS, Leave-one-out (LOO), QSAR, Root mean square error etc.

1. Introduction

Matrix metalloproteinases (MMPs) belong to the metzincin superfamily. These MMPs bind zinc at the catalytic site and have a conserved "met-turn" motif. Whether tissues are healthy or sick, matrix metalloproteinases (MMPs) are vital for tissue modeling and extracellular matrix modification. Consequently, they play a crucial role in the maturation of tumors. The family of enzymes called matrix metalloproteinases (MMPs) hydrolyzes the extracellular matrix. There are six groups of matrix metalloproteinases (MMPs) based on how they recognise substrates and cut them up. These are collagenases, matrilysins, stromelysins, gelatinases, membrane-associated MMPs, and other MMPs that aren't listed above.

1.1 Computational Analysis - QSAR

In silico computational drug discovery aimed at a target macromolecular molecule (like a protein or nucleic acid) includes both improving existing leads and creating completely new ones from scratch. The term "lead" refers to a certain type of ligand molecule that has an established level of activity against targets after binding to them [1]. Computational technologies can be utilised to create drugs and digitally seek out better

*Corresponding author: nedunurideepak@gmail.com

DOI: 10.1201/9781003529231-43

ligands using ligand- or structure-based approaches. Ligand-based computational strategies, namely Quantitative Structure Activity Relationship (QSAR) methods, are employed when there is limited basic information available for a therapeutic project but the arrangement of active ligand atoms in the macromolecular target is known. To find quantitative structure-activity relationships (QSARs), one has to look at the characteristics or characterizations of a group of atoms in a quantitative way. These quantitative models are built to predict the activity of further compounds towards the target. The pharmaceutical industry has extensively used the method for managing medicinal chemistry efforts for a long time [2].

Fig. 43.1 Activation of MMP by cysteine switch mechanisms

2. Literature Review

The secretion of matrix metalloproteinase (MMP) into the bloodstream occurs in a wide variety of pro-inflammatory cell types and connective tissues. Proteolytic enzymes such as the serine proteases are produced as inactive precursors known as zymogens. One possible target for cancer treatment could be the matrix metalloproteinases, due to their significant involvement in the pathological circumstances that cause cancer. Strategies to decrease MMP levels may be helpful in the battle against cancer, according to promising results from animal and human studies on tumor models of MMP suppression. Unfortunately, realistic simulation of MMP-inhibitor complexes is challenging due to the intrinsic flexibility of the MMP active site. Researchers are increasingly recognizing the role of matrix metalloproteinase (MMP) inhibitors, both naturally occurring and man-made, as cytostatic and anti-angiogenic medications, and considering MMPs as potential targets for cancer treatment. A plethora of inhibitors are now participating in clinical trials due to their connection to cancer. The results of the preclinical studies were encouraging, but there has been a steady stream of disappointing results and/or limited achievements reported in recent years. Based on these and other published results, future research aims to enhance target binding and improve effectiveness to thoroughly reevaluate MMP-inhibition strategies. We will obtain the MMP-13 inhibitors from databases and published literature. Using computer-aided analysis, we will search databases and literature for molecules that are either very close to or quite different from the target. The analysis produces computerized dock scores, which are then compared to experimental values. From these results, graphs, charts, and observations are derived. Computational statistics will be applied to a set of inhibitors retrieved from databases or the literature before docking investigations are conducted. Multiple linear regression with F-to-leave and F-to-enter, cross-validation, PRESS (Predicted Residual Error Sum of Squares), s value, F value, internal and external validations, r2 (q2), and so on are all examples of the parameters used.

3. Materials and Methods

3.1 Data Set

To create a trustworthy and solid QSAR model, biological data on 72 chemicals [3, 4, 5] published in the literature were used. The bioactivities and structures of these derivatives, as well as their IC50 (half maximal inhibitory concentration) values, are provided.

3.2 Multiple Variable Analysis

We implemented the QSAR model on both the training and complete sets. To round out the validation process, we utilized the leave-one-out strategy and predicted external activities for the test set. The linear MLR method was used to establish the link between the independent terms and the dependent parameter (log 1/IC50). The examination of statistical data led to the establishment of noteworthy descriptors. The coefficient of correlation (r), estimate of standard error (s), F value, and cross-validated r2 (q2) were used to judge the created equation. Over the course of two independent trials, we randomly applied the LOO approach, which allows parameters to enter and leave the equation two at a time using F-stepping [6].

3.3 Cross-validation

Through the process of cross-validation, one may determine the QSAR model's reliability. For this research, we generated several altered datasets by removing one row at a time and making a value prediction for it with the remaining data, following the leave-one-out (LOO) method. The purpose of leaving out each row is to estimate its value using the values of the other rows.
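As a sketch of the leave-one-out procedure just described, the following scikit-learn snippet computes the cross-validated q2 and PRESS for a linear model. The descriptor matrix X and activity vector y below are placeholder data; the chapter's actual descriptors are not reproduced here.

```python
# Leave-one-out cross-validation of a linear QSAR model:
# q2 = 1 - PRESS / SS_total, with PRESS the sum of squared LOO residuals.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

def loo_q2_press(X, y):
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for train_idx, test_idx in LeaveOneOut().split(X):
        model = LinearRegression().fit(X[train_idx], y[train_idx])
        preds[test_idx] = model.predict(X[test_idx])
    press = np.sum((y - preds) ** 2)      # Predicted Residual Error Sum of Squares
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss_total, press  # (q2, PRESS)

rng = np.random.default_rng(0)            # dummy data, for illustration only
X_demo = rng.normal(size=(51, 6))
y_demo = X_demo @ rng.normal(size=6) + rng.normal(scale=0.3, size=51)
q2, press = loo_q2_press(X_demo, y_demo)
print(f"q2 = {q2:.3f}, PRESS = {press:.3f}")
```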

Table 43.1 Molecular descriptor data and statistical values of the newly proposed model equations

Descriptors | Coefficients (Model-1 ... Model-5)
Total Lipole: -0.0144, -0.0277, -0.0287, +0.0651
Lipole Z Component: -0.041, -0.047, -0.046, -0.042
KierChiV2 (path) index: -0.701, -, -
KierChiV3 (cluster) index: +1.845, -, -
Balaban Topological index: -3.984, -, -
Number of Cl Atoms: -0.341, -, -
6-Membered Aliphatic rings: -0.390, -, -
H-bond Donors: +0.376, -, -, +0.223
KAlpha2 index: -, -0.408, -0.404
6-membered aromatic rings: -, +0.548, +0.594, +0.208, +0.234
Rotatable Bonds: -, +0.153, +0.147, +0.249, +0.221
LUMO: -, -2.84, -3.017, -3.735, -3.560

4. Results and Discussions

4.1 Complete Data Set

The most important features were found using multiple regression analysis with F-stepping and single-row cross-validation. These included inertia moments, the lipole component, shape flexibility, and six-membered rings. The linear QSAR model includes all 72 inhibitors, as demonstrated in Equation 5:

log (1/IC50) = +0.78127909 * Inertia Moment 1 Size
              +1.0273278 * Inertia Moment 1 Length
              -0.19020687 * Total Lipole
              -0.12550831 * Lipole X Component
              -0.22834534 * Lipole Z Component
              -0.81364067 * Shape Flexibility
              -0.67144702 * Randic Topological Index
              -0.1585072 * 6-Membered Aliphatic Rings
              -0.83902165                                              (5)

r = 0.8399, r2 = 0.7051, q2 = 0.601, F = 18.7988, n = 72, s = 0.3981
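Written out as code, Equation (5) is a weighted sum of the eight descriptors plus an intercept. The descriptor names below are shorthand for the terms in Eq. (5), and the example input is hypothetical; real values must come from the same descriptor software and scaling the authors used.

```python
# Eq. (5) as a function: predicted log(1/IC50) from the eight descriptors.
COEFFS = {
    "inertia_moment_1_size":        +0.78127909,
    "inertia_moment_1_length":      +1.0273278,
    "total_lipole":                 -0.19020687,
    "lipole_x_component":           -0.12550831,
    "lipole_z_component":           -0.22834534,
    "shape_flexibility":            -0.81364067,
    "randic_topological_index":     -0.67144702,
    "six_membered_aliphatic_rings": -0.1585072,
}
INTERCEPT = -0.83902165

def predict_log_inv_ic50(descriptors):
    """Apply the linear QSAR model of Eq. (5) to one molecule."""
    return INTERCEPT + sum(COEFFS[name] * value
                           for name, value in descriptors.items())

example = {name: 0.0 for name in COEFFS}   # placeholder descriptor values
print(predict_log_inv_ic50(example))       # intercept only: -0.83902165
```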
4.2 QSAR Model

We split the collection into two parts: one with 51 molecules for training and the other with 6 molecules for validation, in order to build a new QSAR model. Researchers choose molecules for the training set based on their biological activity and molecular structure, aiming to include examples of various structures with different substituents and activities [7]. Hierarchical categorization and the removal of outliers from the data set form the basis of this selection procedure. The distribution of activity levels in the validation set is comparable to that in the training set. Below, we present the results and statistics of the multiple linear regression method for various descriptors.

Table 43.2 Statistics for equation numbers 7-8

Statistic | Eq. 7 | Eq. 8
R | 0.858 | 0.868
r2 | 0.737 | 0.755
q2 | 0.602 | 0.798
F | 14.75 | 22.61
N | 51 | 51
PRESS | 4.307 | 4.035
s | 0.663 | 0.302
No. of Descriptors | 8 | 6

Table 43.3 Test set data - Eq. 7 (left half: regression with predicted values on actual; right half: the reverse)

Actual | Predicted || Predicted | Actual
-0.431 | -0.52458 || -0.52458 | -0.431
-0.903 | -0.89253 || -0.89253 | -0.903
-1.528 | -1.37933 || -1.37933 | -1.528
-0.954 | -0.77173 || -0.77173 | -0.954
-1.258 | -1.29071 || -1.29071 | -1.258
-1.845 | -1.78846 || -1.78846 | -1.845
Summation: -6.919 | -6.64734 || -6.64734 | -6.919
Actual x Predicted = 45.99293 || 45.99293
Predicted x Predicted = 44.1871 || Actual x Actual = 47.87256
k = (Actual x Predicted)/(Predicted)^2 = 1.040868 || k' = (Actual x Predicted)/(Actual)^2 = 0.960737
R2 = 0.9583 || R2 = 0.9583
R0^2 = 0.955 || R0^2 = 0.958
(R2 - R0^2)/R2 = 0.003444 || 0.000313

Fig. 43.2 Observed vs Predicted Activity of validation set obtained for equation number 7

Table 43.4 Test set data - Eq. 8 (left half: regression with predicted values on actual; right half: the reverse)

Actual | Predicted || Predicted | Actual
-0.23 | -0.3947 || -0.3947 | -0.23
-0.431 | -0.4724 || -0.4724 | -0.431
-0.886 | -0.98175 || -0.98175 | -0.886
-1.459 | -1.41182 || -1.41182 | -1.459
-0.954 | -1.06386 || -1.06386 | -0.954
-1.389 | -1.0818 || -1.0818 | -1.389
Summation: -5.349 | -5.40632 || -5.40632 | -5.349
Actual x Predicted = 28.91843 || 28.91843
Predicted x Predicted = 29.22834 || Actual x Actual = 28.6118
k = (Actual x Predicted)/(Predicted)^2 = 0.989397 || k' = (Actual x Predicted)/(Actual)^2 = 1.010717
R2 = 0.9062 || R2 = 0.9062
R0^2 = 0.8223 || R0^2 = 0.8816
(R2 - R0^2)/R2 = 0.092584 || 0.027146
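The validation quantities in Tables 43.3 and 43.4 can be reproduced from the column totals: the tables compute k as (sum of actual x sum of predicted) / (sum of predicted)^2, and k' symmetrically. The FIT values reported in Table 43.5 (below) are consistent with the standard Kubinyi function, FIT = r^2 (n - k - 1) / ((n + k^2)(1 - r^2)), with k the number of descriptors; that formula is an assumption here, since the chapter does not spell it out. A sketch using the Eq. 7 test-set values:

```python
import numpy as np

# Eq. 7 test-set values from Table 43.3.
actual    = np.array([-0.431, -0.903, -1.528, -0.954, -1.258, -1.845])
predicted = np.array([-0.52458, -0.89253, -1.37933, -0.77173, -1.29071, -1.78846])

# k and k' exactly as computed in the tables, from the column totals.
sum_a, sum_p = actual.sum(), predicted.sum()
k       = (sum_a * sum_p) / sum_p ** 2     # -> 1.040868 (Table 43.3)
k_prime = (sum_a * sum_p) / sum_a ** 2     # -> 0.960737 (Table 43.3)

r2 = np.corrcoef(actual, predicted)[0, 1] ** 2   # cf. R2 = 0.9583

def kubinyi_fit(r2, n, n_desc):
    # Kubinyi function; n = compounds, n_desc = number of descriptors.
    return r2 * (n - n_desc - 1) / ((n + n_desc ** 2) * (1.0 - r2))

print(f"k = {k:.6f}, k' = {k_prime:.6f}, r2 = {r2:.4f}")
print(f"FIT Eq.7 ~ {kubinyi_fit(0.736, 51, 8):.2f}")   # ~1.03, cf. 1.027
print(f"FIT Eq.8 ~ {kubinyi_fit(0.755, 51, 6):.2f}")   # ~1.56, cf. 1.559
```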

2. Selassie CD. (2003) History of Quantitative structure-activity


Relationships. Burger’s Medicinal Chemistry and Drug
Discovery, A John Wiley and Sons, Inc., Publication 6th Ed.,
Vol1, Edited by Donald J. Abraham, 1–3.
3. Christian K. Engel et al. (2005) Structural Basis for the Highly
Selective Inhibition of MMP-13, Chemistry & Biology, 12,
181–189.
4. Matter, H., and Schudok, M. (2004). Recent advances in the
design of matrix metalloproteinase inhibitors. Curr.Opin.
Drug Disc. Devel. 7, 513–535.
5. Springman, E.B.; Angleton, E.L.; Birkedalhansen, H.;
Vanwart, H.E. (1990) Multiple-modes of activation of latent
Fig. 43.3 Observed vs Predicted Activity of validation set human fibroblast collagenase - evidence for the role of a Cys­
obtained for equation number.8 73 active-site zinc complex in latency and a cysteine switch
mechanism for activation. Proc. Natl. Acad. Sci. U. S. A.,
Table 43.5 FIT Kubinyi data acquired all five QSAR model 87(1), 364–368.
6. G.Nirmala, Yesubabu Adimulam, P.Seetharamaiah (2016) “
Eq. No r2 k n FIT Computational Molecular docking and structural specificity
7 0.736 8 51 1.027 of bipyrazoles as non-zinc chelating inhibitors of MMP-13”,
8 0.753 6 51 1.559
International Journal of Computational Biology and Drug
Design, Vol. 9, No.1/2 pp. 162–171.
7. G. Nirmala, Yesubabu Adimulam, P. Seetharamaiah (2015)
spin molecules inside the active site area and have been the “In silico Multivariate Regression Analysis and Validation
focus of most investigations. There is a positive correlation Studies on Selective MMP-13 Inhibitors”, IJCA ISBN: 973­
between the two concepts, suggesting that molecules with 9380890-12-9,Vol.130, No.6. In International journal of
more rotatable bond groups would be more active. computer applications (IJCA), New York, included in DBLP.
Hall LH, Mohney B, Kier LB (1991) The Electro topological
State: Structure Information at the Atomic Level for Molecular
References Graphs. J. Chem. Inf. Computer Science 31:76–82.
1. Joseph-McCarthy D. (2002) An overview of in silico design Note: All the figures and table in this chapter were designed by the
and screening: Toward efficient drug discovery. Curr. Drug author.
Discov, 20–23.

Exploring the Rise of Cryptocurrencies with


Blockchain Technology 44

V. Priyadarshini1, R. Shiva Shankar2, P. Neelima3


Department of Computer Science and Engineering,
SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, India
N. Deshai4
Department of Information Technology,
SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, India
D. Ravibabu5
Department of Computer Science and Engineering,
SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, India

Abstract: A blockchain can be considered a group of records, or an accessible history, shared among the people involved in a transaction. All parties in that process validate every transaction accepted for incorporation. Once the Blockchain has stored a piece of data, it can never be rewritten or altered in any way. As a result, the Blockchain can be considered a digital ledger that includes all of the transactions that have ever occurred. It is also the blockchain technology used by cryptocurrency networks such as the decentralized Bitcoin or Ethereum, which can be considered computerized peer-to-peer cash. This paper incorporates a history of Bitcoin, a few literature evaluations, an explanation of how the Blockchain works, and an implementation of the network.

Keywords: Block cypher, Bitcoin, IoT, Blockchain etc.

1. Introduction

In 1991, Stuart Haber and W. Scott Stornetta introduced the idea of a chain of blocks (a set of data) that could be safeguarded. "Satoshi Nakamoto" is the pseudonym of the individual or group that developed and executed blockchain technology in late 2008. Hashing was incorporated into the blockchain system to ensure no one could alter or erase already saved records, and Bitcoin uses the blockchain concept as its foundational technology [1]. The fundamental definition of a blockchain is a distributed set of records, or a public ledger of all recent events, shared among the individuals participating in the Blockchain. Every transaction in the public network record is validated by the agreement of the majority of individuals participating in the system. In the same way, once data has been entered, it cannot be removed. The Blockchain provides a plain and transparent record of every exchange since its formation [2]. Taking a cookie from a cookie jar kept in a secluded corner is far less challenging than taking one from a jar kept in a busy marketplace, visible to many people. Bitcoin is the most widely cited example associated with the advancement of the Blockchain. It is also the most contentious one, because it enables a multibillion-dollar market of largely anonymous exchanges without oversight.
1priyavoosala@gmail.com, 2shiva.csesrkr@gmail.com, 3neelima.p47@gmail.com, 4desaij4@gmail.com, 5ravibabu.devareddi@gmail.com

DOI: 10.1201/9781003529231-44

As a result, it has to deal with a variety of regulatory concerns involving national governments and financial institutions.

A decentralized network handles a protected sequence of time-stamped information saved in a database managed by a group of users [3]. Every device on the network gets access to the information or records in a blockchain, making it a decentralized or distributed database. Cryptographic methods are employed to encode all of the Blockchain's data entries, which assures that the data in the Blockchain is secure. The advantages of Blockchain innovation outweigh the disadvantages, which are primarily concentrated on regulatory issues and difficulties. One of the most critical events in developing blockchain technology is the incorporation of "smart contracts." Smart contracts are essentially computer programs capable of carrying out the provisions of an agreement automatically [4]. Smart Property is a related concept, concerned with controlling the ownership of relevant properties or assets using blockchain-based techniques that rely on smart contracts. The asset can be physical (for example, an automobile, a telephone, or a house) or non-physical (for example, shares of a company); the property type is essential. It must be stated here that Bitcoin is not typically considered money at all; Bitcoin is all about controlling the ownership of a particular piece of data [5]. The transaction record includes the transaction date, time, and amount between two parties. As seen in Fig. 44.1 [6], blocks consist of a header and body.

Fig. 44.1 Block structure

2. Literature Review

The written survey underlying this assessment explains blockchain creation and product distribution throughout geographical locations. The review illustrates that subjects such as Blockchain as an organizational advancement, intelligent contracts, implementation plans, entrepreneurial possibilities and difficulties, and Blockchain as a globally beneficial advancement are not adequately covered in detail; the topics are discussed only in general terms [7]. As a result, the authors notice that blockchain writing, for the most part, is of a speculative sort, in which the great promise of the advancement is typically taken for granted. However, the conversation on how Blockchain will enhance value creation within organizations is still lacking. Much of the attention is focused on what might happen if most of the population accepts Blockchain, and on the most fundamental possible use cases, rather than on the value-creating processes of Blockchain. Instead, they will examine why companies use blockchain technology to solve problems and its benefits [8]. Blockchain's entrepreneurial difficulty is a coordination problem, like other new monetary technologies, because it necessitates non-price coordination over the complementary nature of uses and opportunities [9]. Satoshi Nakamoto's pseudonymous work released "Bitcoin: A Peer-to-Peer Electronic Cash System" in 2008. The distributed kind of electronic money suggested in this research enables online payments to be moved directly between parties without a financial firm. Bitcoin was the most important validation of this line of thinking [10]. Properly speaking, the term cryptocurrency refers to all structures and mediums of exchange that use cryptography to secure trade, instead of those frameworks where transactions are routed through a centrally trusted party [11].

Educational institutions evaluate student input to improve teaching and learning. Online processes simplify feedback collection, summarization, and abstraction. There are various internet approaches; finding a practical one is the central issue. One study presents a sentiment analysis methodology for analyzing student comments utilizing long short-term memory; the suggested model is compared to Naive Bayes, decision trees, and random forests [12]. Blockchain can revolutionize the world with ease, transparency, accuracy, speed, and affordability. Increased familiarity and confidence in Blockchain in finance come from successful use cases, testimonies, and suitable legal reforms [13]. To use science and technology, one must gather, analyze, and interpret health, family, nutritional, and blood data. Naturally, these data are massive, possibly high-dimensional, diverse, and architecturally complicated. A practical data mining approach is needed to collect and analyze such data and categorize rural women. Researchers offer an intelligent data categorization system (IDCS) to do this. The IDCS has four phases based on a thorough categorization process. IDCS begins by methodically collecting data from diverse sources on rural women's nutritional knowledge, attitudes, and dietary patterns [14]. Attrition is the gradual loss of firm personnel without replacement. High attrition rates cause talent loss, inadequate research, and wasted training expenditures [15].

The first part of this quantitative research used a novel questionnaire to assess Malaysian blockchain communities' awareness, acceptance, and confidence in blockchain technology applications. The questionnaire asks about demographics, FinTech awareness, trust, and acceptance, notably of Blockchain and cryptocurrencies. A 304-person pilot study validated the revised questionnaire in the second phase. The reliability test uses a Cronbach's alpha of 0.908. This phase included a validated questionnaire survey with 304 online responses. The final step of the research employed descriptive statistics to show that blockchain and cryptocurrency knowledge is intermediate [16]. Twitter is the most popular microblogging service and a growing social network. Social media contains much data in tweets, forums, status updates, comments, etc. Applications may use sentiment analysis to process and evaluate this data automatically. Twitter sentiment analysis uses tweets to determine user thoughts and attitudes. The Natural Language Toolkit (NLTK) is a Python machine learning and sentiment analysis toolkit that underpins text processing and classification. The study presented a machine learning-based classifier to extract election tweets and assess tweeples' opinions. Tweets about a politician might be positive, negative, or neutral [17].

Stock market projection predicts the future value of equities exchanged within a financial system. The present study thoroughly explains machine learning stock prediction. Machine learning and AI are being used to anticipate stock values, and researchers spend more time each day developing methods to increase stock prediction model accuracy. This research focuses on the best stock market prediction model [18]. Graders struggle to provide consistent feedback with a consistent interface, mindset, and deadline. Words, sentences, word count, average length, structure, and arrangement of an essay are used for accurate grading. The sequential forward feature selection approach compares accuracy and picks the best subset. It is easy to build an efficient subset from an empty set, and it works well on small data sets [19].

Unlike traditional money, digital money is a block of data validated by a hash. All Bitcoin users in the environment get the information, and data mining occurs when a user transacts. Cryptocurrencies have pros and cons as money, and there is no legal framework for their circulation. Government recognition is needed for the public to accept digital money as payment. Because Bitcoin is unfamiliar to many Indonesians, the government has not recognized it as a currency. Technology is advancing rapidly in the Industry 4.0 age, and digital money will replace physical money in the following years due to its ease [20]. Employee attrition prediction is a serious issue in enterprises today. Organizations struggle with employee attrition when skilled, technical, and essential people depart for better opportunities. Replacing qualified staff costs money. Thus, they examine current and previous employee data to determine prevalent attrition factors [21]. News media informs the public about current happenings. Today, social media like Twitter delivers user-generated news information. Clustering the data and providing just the important information makes this resource valuable. For data filtering, they employed density-based k-means and graph clustering. After filtering, we rank the data by keyword frequency, relevant key terms, and fundamental term similarity within the dataset. They may also cover science, technology, sports, and trends besides news [22].

Technology enthusiasts are excited about decentralized digital cryptocurrency and "Blockchain" technologies. Blockchain protocols, or distributed-ledger technology, have great promise in financial technology. This technology allows the creation of safe, trustworthy, and decentralized autonomous ecosystems for numerous situations, including better use of old devices, infrastructure, and resources [23]. According to a systematic study, cryptocurrencies are digital currencies that perform blockchain-based transactions. Control is held by an algorithm and its users in this decentralized financial system. As a financial tool, blockchain may boost global growth [24]. Feedback is obtained using qualitative scoring. Recent feedback mining techniques mainly concentrate on qualitative remarks, involve manual procedures, and cannot be evaluated by further examination. A student feedback mining system (SFMS) uses text analytics and sentiment analysis to give educators quantifiable and in-depth analysis of qualitative student input, enhancing learning experiences [25].

Bitcoin is a peer-to-peer digital currency maintained by open-source software with cheaper transaction costs, higher security and scalability than fiat money, and no central bank. Scientific interest is growing despite concerns about unlawful usage and societal effects. This study defines and evaluates bitcoin sustainability literature, including environmental, social, and economic factors. According to studies, Bitcoin is a niche currency because mining new bitcoins and running the virtual monetary system need too much energy. Blockchain, a distributed and democratically maintained public transaction record, may provide new and challenging opportunities [26]. Evaluation is the primary way schools assess students' learning capabilities, and one of the main tests is essay writing. Currently, this assessment is done manually, which takes time and effort. For this project, this assessment method is automated. We use keywords as features since machines cannot grasp our rating metrics. First, the student essay is matched against admin keywords. The essay is rated poor if the similarity is under 20% [27].

3. Working on Blockchain

Blockchain technology applies to every publicly traded asset transferred over the Internet today. However, because a third party verifies and intervenes in any electronic transaction, online business is entirely tied to the financial institution handling the paperwork [28]. Moreover, the approval and protection of transfers is the responsibility of a particular third party. Therefore, an unavoidable level of fraud exists in online trades, which requires the intervention of financial intermediaries to be handled; as a result, transaction costs are very high.

An online exchange between two willing participants is executed over the Internet using Bitcoin, which relies on cryptographic proof rather than confidence in a third party. Every trade is protected by a verified digital signature. Every exchange is disseminated to the "public key" of the receiver and signed with the "private key" of the sender. To spend the currency, the owner of the electronic money must demonstrate ownership of the "private key". The party receiving the money verifies the digital signature (and thus the ownership of the corresponding "private key") on the transfer using the "public key" of the sender [29]. Each exchange is broadcast to every Bitcoin node and documented in a public record after verification.
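The sign-and-verify flow just described can be illustrated with a few lines of Python using the third-party 'ecdsa' package (Bitcoin itself uses ECDSA on the secp256k1 curve). The message below is a stand-in for a serialized transaction, not Bitcoin's actual wire format:

```python
# Illustration of signing with a private key and verifying with the
# matching public key, using the 'ecdsa' package (pip install ecdsa).
from ecdsa import SigningKey, SECP256k1

private_key = SigningKey.generate(curve=SECP256k1)  # sender's private key
public_key = private_key.verifying_key              # shared with everyone

transaction = b"A pays B 1.5 BTC"
signature = private_key.sign(transaction)           # signed with the private key

# Any node can check the signature using only the sender's public key.
assert public_key.verify(signature, transaction)
print("signature valid")
```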
5. Blockchain with IoT
Fig. 44.2 Send process with blockchain: a transaction ("send money from A to B") is represented as a block; the block is sent to all nodes and approved by the network; the block is added to the chain; and the money moves from A to B
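The linking described in the next paragraph, in which each block carries the hash of the previous block, can be sketched in a few lines. The field names are illustrative, not Bitcoin's actual block format:

```python
# Minimal hash-chain sketch: each block records the hash of the previous
# block, so altering any earlier block changes every hash after it.
import hashlib, json, time

def make_block(transactions, prev_hash):
    header = {"time": time.time(), "prev_hash": prev_hash,
              "transactions": transactions}
    block_hash = hashlib.sha256(
        json.dumps(header, sort_keys=True).encode()).hexdigest()
    return {"header": header, "hash": block_hash}

genesis = make_block(["genesis"], prev_hash="0" * 64)
block1 = make_block(["A pays B 1.5 BTC"], prev_hash=genesis["hash"])
block2 = make_block(["B pays C 0.7 BTC"], prev_hash=block1["hash"])
# The chain is valid while each stored prev_hash matches the recomputed
# hash of the block before it; tampering anywhere breaks that match.
```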
The Bitcoin project addressed this problem by establishing today's recognized Blockchain design rules. As in Fig. 44.1, the Bitcoin system orders transactions by placing them in groups called blocks and linking these blocks through the Blockchain; the transactions in one block are considered to have happened at the same time. These blocks are connected (like a chain) in a linear, chronological order, and every block holds the hash of the previous block [30].

There is another difficulty that needs to be addressed. When a node creates a block from unverified transactions and recommends it to the rest of the network as the next one in the Blockchain, how will the network decide which block should be added to the Blockchain? Meanwhile, nodes at various positions could generate different, completely distinct blocks, and without an agreed ordering, blocks may arrive in completely different sequences at various points within the network, making it impossible to rely on arrival order alone.

4. Corporate Financing with Bitcoin

Incorporating business governance into the Bitcoin and Blockchain structures is generating interest and enthusiasm in several areas. The information system is robust, and blockchain improvements are being made to produce a safer, more profitable strategy for stock trading. DocuSign, a leading provider of e-contracts that makes significant investments in them, has unveiled a proposed method with Visa for using Blockchain to track motorcar rentals and reduce the amount of paper used [31]. Microsoft may turn information about its organization into "smart contracts" that take advantage of blockchain innovation. Even though blockchain technology is still young, corporations are researching ways to construct many tiny, "private blockchains" inside their operational environments due to their growing interest in blockchain development.

5. Blockchain with IoT

IoT has the potential to become a mainstream innovation in both the consumer and corporate sectors. This requirement has prompted attempts to establish IoT platforms in specific locations [32]. Among the benefits of blockchain growth is that it energizes decentralized IoT platforms, with secured and reliable information interchange and, most importantly, record keeping. In such a scenario, the Blockchain serves as the public record, preserving an accurate record of the various messages sent back and forth between devices inside a decentralized Internet of Things architecture. In collaboration with Samsung, ADEPT is a platform created by IBM that uses elements from Bitcoin's underlying design to create a distributed arrangement of devices, sometimes known as a decentralized network of things [33]. Inside the platform, ADEPT makes use of three protocols: BitTorrent (file sharing), Ethereum (smart contracts), and TeleHash (peer-to-peer communication).

6. The Benefits of BITCOIN

Table 44.1 demonstrates the different types of Blockchain together with their merits and demerits.

Table 44.1 Different types of blockchain

Block Type | Advantages | Disadvantages | Use Cases
Public | High independence, high transparency, high trust | Low performance, low scalability, low security | Cryptocurrency, document validation
Private | High access control, high performance | Low trust, low auditability | Supply chain, asset ownership
Hybrid | High access control, high performance, high scalability | Low transparency, low upgradability | Medical records, real estate
Association | High access control, high scalability, high security | Low transparency | Banking, research, supply chain

With a decentralized money system, the government or banks have no connection to the currency. This could be useful if a country is struggling financially (like during the "Great Recession" in the United States). Transaction fees are usually minimal or free. Trading money to any zone of the world is simple, and in reality it takes very little time at all. If an individual has saved bitcoins, banks cannot use them; this implies that monetary measures imposed by governments will not affect Bitcoin's valuation. Blockchain growth considerably reduces the need for the existing intermediaries that bridge the trust gap. However, Bitcoin and other digital currencies are volatile. This means a bitcoin's value might fluctuate without warning, and there is no way to predict or explain why. Their value fluctuates because bitcoins are not tied to a single association, nation, or bank. Unlawful goods and activities (illegal narcotics, guns, etc.) could be paid for using bitcoins, which are more challenging to track. Bitcoins are stored in virtual online wallets; a skilled software engineer could break into these virtual wallets, and it has been done before. Many customers struggle to grasp Bitcoin's complicated blockchain.

7. Operation of Blockchain Outside Cryptocurrency

Bitcoin is a great Blockchain application, but the technology enables countless other applications [34, 35]. We can protect and verify definitive records such as deeds and certifications, medical administration data, IoT data, and Cloud data. Tapscott claims that Blockchain will be the "World Wide Ledger," enabling smart deeds, decentralized and autonomous organizations or citizen-led groupings, and more [36, 37]. Cloud information management includes 'Data Provenance,' which maintains the historical background of every cloud data object and the subsequent operations performed on it in the Cloud. For the foreseeable future, it will be essential to grant the strongest security to the information's provenance to ensure data protection, privacy, and accountability [38]. For example, Liang proposes a Blockchain-based Cloud data provenance scheme, 'ProvChain,' which is specific to cloud data provenance. Along with improved transparency and data accountability, a cloud-based Blockchain program will defend against tampered records [39, 40, 41]. The provenance information becomes more accessible, reliable, trustworthy, and valuable [42].

8. Conclusion

The Blockchain's decentralized technology and peer-to-peer characteristics are highly regarded. However, Bitcoin overshadows several other kinds of blockchain research, and Blockchain has applications well beyond Bitcoin. Blockchain has changed traditional businesses through decentralization, consistency, anonymity, and auditability. To summarise, Blockchain is the developing backbone of the Bitcoin currency. The importance of distributed data and Blockchain's safety make it an exciting breakthrough for addressing current financial and non-financial company issues. Depending on your viewpoint, this digital, finance-driven, primarily technological growth is either overhyped or underwhelming. Our efforts to enhance blockchain technology allow us to utilize it for business transactions. Thus, its security, assurance, traceability, knowledge of origin, and timestamping features have moved beyond its core application zones. Regarding trading, the Blockchain and its variants are relevant to any reasonable transaction, whether it is human-to-human or automated. Moreover, it creates the sense that it is secure, which is particularly significant considering the overall development of the Internet of Things. As a direct consequence of this, the Blockchain has garnered a lot of interest. This seems to be the case particularly when discussing emerging nations, where establishing trust is one of the most important goals.

References
1. Akins, B.W., Chapman, J.L. and Gordon, J.M. (2013) A Whole New World: Income Tax Considerations of the Bitcoin Economy; Antshares (2016) Antshares Digital Assets for Everyone, https://www.antshares.org.
2. Atzori, L., Iera, A. and Morabito, G. (2010) 'The internet of things: a survey', Computer Networks, Vol. 54, No. 15, pp. 2787-2805.
3. Bentov, I., Lee, C., Mizrahi, A. and Rosenfeld, M. (2014) 'Proof of activity: extending Bitcoin's proof of work via proof of stake [extended abstract]', ACM SIGMETRICS Performance Evaluation Review, Vol. 42, No. 3, pp. 34-37.
4. Eyal, I. and Sirer, E.G. (2018) 'Majority is not enough: Bitcoin mining is vulnerable', Communications of the ACM, 61(7):95-102.
5. Billah, S. (2015) One Weird Trick to Stop Selfish Miners: Fresh Bitcoins, A Solution for the Honest Miner.
6. Di Battista, G., Di Donato, V., Patrignani, M., Pizzonia, M., Roselli, V. and Tamassia, R. (2015) 'Bitconeview: visualization of flows in the bitcoin transaction graph', 2015 IEEE Symposium on Visualization for Cyber Security (VizSec), pp. 1-8. IEEE.
7. Biryukov, A., Khovratovich, D. and Pustogarov, I. (2014) 'Deanonymisation of clients in Bitcoin P2P network', Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, New York, NY, USA, pp. 15-29.
8. Enoksen, F.A., Landsnes, C.J., Lučivjanská, K. and Molnár, P. (2020) 'Understanding risk of bubbles in cryptocurrencies', Journal of Economic Behavior & Organization, 176:129-44.
9. Rao, V.V., Silpa, N., Gadiraju, M., Shankar, R.S. and Vijaya, K. (2023) 'An Optimal Machine Learning Model Based on Selective Reinforced Markov Decision to Predict Web Browsing Patterns', Journal of Theoretical and Applied Information Technology, 101(2):859-73.
10. Ghosh, A., Gupta, S., Dua, A. and Kumar, N. (2020) 'Security of cryptocurrencies in blockchain technology: state-of-art, challenges and future prospects', Journal of Network and Computer Applications, 163:102635.
11. Maheswara Rao, V.V.R., Silpa, N., Gadiraju, M., Reddy, S.S., Bonthu, S. and Kurada, R.R. (2023) 'A Plausible RNN-LSTM based Profession Recommendation System by Predicting Human Personality Types on Social Media Forums', 2023 7th International Conference on Computing Methodologies and Communication (ICCMC), pp. 850-855. IEEE.
12. Reddy, S.S., Gadiraju, M. and Maheswara Rao, V.V. (2022) 'Analyzing Student Reviews on Teacher Performance Using Long Short-Term Memory', Innovative Data Communication Technologies and Application: Proceedings of ICIDCA 2021, pp. 539-553. Singapore: Springer Nature Singapore.
13. Hashemi Joo, M., Nishikawa, Y. and Dandapani, K. (2020) 'Cryptocurrency, a successful application of blockchain technology', Managerial Finance, 46(6):715-33.
14. Maheswara Rao, V.V., Silpa, N., Mahesh, G. and Reddy, S.S. (2022) 'An Enhanced Machine Learning Classification System to Investigate the Status of Micronutrients in Rural Women', Proceedings of International Conference on Recent Trends in Computing: ICRTC 2021, pp. 51-60. Springer Singapore.
15. Shankar, R.S., Priyadarshini, V., Neelima, P. and Raminaidu, C.H. (2021) 'Analyzing Attrition and Performance of an Employee using Machine Learning Techniques', 2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA), pp. 1601-1608. IEEE.
16. Ku-Mahamud, K.R., Omar, M., Bakar, N.A. and Muraina, I.D. (2019) 'Awareness, trust, and adoption of blockchain technology and cryptocurrency among blockchain communities in Malaysia', International Journal on Advanced Science, Engineering & Information Technology, 9(4):1217-22.
17. Kameswari, K.K., Raghaveni, J., Shankar, R.S. and Rao, C.S. (2019) 'Predicting Election Results using NLTK', International Journal of Innovative Technology and Exploring Engineering, 9:4519-29.
18. Jyothirmayee, S., Kumar, V.D., Rao, C.S. and Shankar, R.S. (2019) 'Predicting stock exchange using supervised learning algorithms', International Journal of Innovative Technology and Exploring Engineering, 9(1):4081-90.
19. Shiva Shankar, R. and Ravibabu, D. (2019) 'Digital Report Grading Using NLP Feature Selection', Soft Computing in Data Analytics: Proceedings of International Conference on SCDA 2018, pp. 615-623. Springer Singapore.
20. Faturahman, A., Agarwal, V. and Lukita, C. (2021) 'Blockchain technology - the use of cryptocurrencies in digital revolution', IAIC Transactions on Sustainable Digital Innovation (ITSDI), 3(1):53-9.
21. Shankar, R.S., Rajanikanth, J., Sivaramaraju, V.V. and Murthy, K.V. (2018) 'Prediction of employee attrition using data mining', 2018 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), pp. 1-8. IEEE.
22. Sebastião, H.M., Cunha, P.J. and Godinho, P.M. (2021) 'Cryptocurrencies and blockchain: overview and future perspectives', International Journal of Economics and Business Research, 21(3):305-42.
23. Shankar, R.S., Murthy, K.V., Rao, C.S. and Gupta, V.M. (2018) 'An approach for extracting tweets from social media factors', 2018 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), pp. 1-7. IEEE.
24. Hameed, B.I. (2019) 'Blockchain and cryptocurrencies technology: a survey', JOIV: International Journal on Informatics Visualization, 3(4):355-60.
25. Shankar, R.S., Srinivas, L.V., Ravibabu, D. and Raminaidu, C. (2018) 'Novice Retroaction Report', ARPN Journal of Engineering and Applied Sciences, 13.
26. Giungato, P., Rana, R., Tarabella, A. and Tricase, C. (2017) 'Current trends in sustainability of bitcoins and related blockchain technology', Sustainability, 9(12):2214.
27. Shankar, R.S., Babu, D.R., Murthy, K.V. and Gupta, V. (2017) 'An approach for essay evaluation using system tools', 2017 International Conference on Innovative Research in Electrical Sciences (IICIRES), pp. 1-9. IEEE.
28. Bonneau, J., Narayanan, A., Miller, A., Clark, J., Kroll, J.A. and Felten, E.W. (2014) 'Mixcoin: Anonymity for bitcoin with accountable mixes', Proceedings of International Conference on Financial Cryptography and Data Security, Berlin, Heidelberg, pp. 486-504.
29. Marchesi, M. (2018) 'Why blockchain is important for software developers, and why software engineering is important for blockchain software (Keynote)', 2018 International Workshop on Blockchain Oriented Software Engineering (IWBOSE), Campobasso, pp. 1-1.
30. Dinh, T.N. and Thai, M.T. (2018) 'AI and Blockchain: A Disruptive Integration', Computer, vol. 51, no. 9, pp. 48-53.
31. Fiaidhi, J., Mohammed, S. and Mohammed, S. (2018) 'EDI with Blockchain as an Enabler for Extreme Automation', IT Professional, vol. 20, no. 4, pp. 66-72, Jul./Aug. 2018.
32. Gatteschi, V., Lamberti, F., Demartini, C., Pranteda, C. and Santamaría, V. (2018) 'To Blockchain or Not to Blockchain: That Is the Question', IT Professional, vol. 20, no. 2, pp. 62-74, Mar./Apr. 2018.
33. Dinh, T.T.A., Liu, R., Zhang, M., Chen, G., Ooi, B.C. and Wang, J. (2018) 'Untangling Blockchain: A Data Processing View of Blockchain Systems', IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 7, pp. 1366-1385, July 2018.
34. Kshetri, N. 'Can Blockchain Strengthen the Internet of Things?', IT Professional; and Kan, L., Wei, Y., Hafiz Muhammad, A., Siyuan, W., Linchao, G. and Kai, H. (2018) 'A Multiple Blockchains Architecture on Inter-Blockchain Communication', 2018 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C), Lisbon, pp. 139-145.
35. Decker, C., Seidel, J. and Wattenhofer, R. (2016) 'Bitcoin meets strong consistency', Proceedings of the 17th International Conference on Distributed Computing and Networking (ICDCN), ACM, Singapore, p. 13.
36. Dennis, R. and Owen, G. (2015) 'Rep on the block: A next generation reputation system based on the blockchain', 2015 10th International Conference for Internet Technology and Secured Transactions (ICITST), IEEE, pp. 131-138.
37. Eyal, I., Gencer, A.E., Sirer, E.G. and Van Renesse, R. (2016) 'Bitcoin-NG: a scalable blockchain protocol', Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16), Santa Clara, CA, USA, pp. 45-59.
38. Fan, Z., Kulkarni, P., Gormus, S., Efthymiou, C., Kalogridis, G., Sooriyabandara, M., Zhu, Z., Lambotharan, S. and Chin, W.H. (2013) 'Smart grid communications: overview of research challenges, solutions, and standardisation activities', IEEE Communications Surveys and Tutorials, Vol. 15, No. 1, pp. 21-38.
39. Miorandi, D., Sicari, S., Pellegrini, F.D. and Chlamtac, I. (2012) 'Internet of things: vision, applications and research challenges', Ad Hoc Networks, Vol. 10, No. 7, pp. 1497-1516.
40. Garay, J., Kiayias, A. and Leonardos, N. (2015) 'The bitcoin backbone protocol: Analysis and applications', Annual International Conference on the Theory and Applications of Cryptographic Techniques, pp. 281-310. Berlin, Heidelberg: Springer.
41. Gervais, A., Karame, G.O., Wüst, K., Glykantzis, V., Ritzdorf, H. and Capkun, S. (2016) 'On the security and performance of proof of work blockchains', Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 3-16.
42. Huckle, S., Bhattacharya, R., White, M. and Beloff, N. (2016) 'Internet of things, blockchain and shared economy applications', Procedia Computer Science, 98:461-6.

Note: All the figures and tables in this chapter were designed by the authors.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4

Mitigating Misinformation: An Advanced Analytics Framework for Proactive Detection of Fake News to Minimize Misrepresentation Risks 45

R. Shiva Shankar1, G. Mahesh2


Department of Computer Science and Engineering,
SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, India
V. Maheswararao3, N. Silpa4
Department of Computer Science and Engineering,
Shri Vishnu Engineering College for Women (A), Bhimavaram, Andhra Pradesh, India
K V S Murthy5
Department of Computer Science and Engineering,
SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, India.

Abstract: Fake news spreads deception by changing people's perspectives and knowledge. Social media and online forums have helped spread fake news by mixing it with actual news. This paper presents novel text mining techniques and strategies for detecting fake news and misinformation to decrease the hazards associated with their spread. We begin by describing the structure of the suggested approach and the underlying conceptual method, providing implementations and verification using newspaper data. We gathered genuine and bogus information and then translated it into a subject- and event-based description from a manuscript database. Fake news is identified using a two-layered technique that includes identifying fake topics and false events. The reliability of the proposed approach is proved by creating and validating an innovative False E-News Detector (FEND) technology. Based on a threshold level of 0.6, the suggested methodology obtains 92.49 percent classification accuracy and 94.16 percent recall.
Keywords: Fake news, Misinformation, Disinformation, Text mining techniques, Hazard reduction, Social networks, Internet forums

1. Introduction

"The online article of purposefully or deliberately false claims of reality" describes false information [1]. The emphasis is on publications or comments posted on social media in the hope of going "viral." Fake news feeds on the spread of fabricated stories, frauds, exaggeration, and outrage in news items posted on the internet [2]. Although deliberate harm is debatable, numerous motives (economic, societal, and political advantage) are frequently behind bogus information. The latest advancements on the internet for propagating false information have considerably increased the hazards of spreading disinformation (incorrect information) to people and organizations. For instance, social networks regularly distribute fake news by altering real news or fabricating new information. Berners-Lee, the creator of the World Wide Web, recently stated that false information is among the most troubling online developments that need to be addressed [3].
Detecting fake news is challenging but not impossible, given its diversity and secrecy. False information has the potential to have a negative impact and cause harm. Modifying the data

1shiva.csesrkr@gmail.com, 2mahesh.cse.srkr@gmail.com, 3mahesh_vvr@yahoo.com, 4nrusimhadri.silpa@gmail.com, 5kvssrmurthy75@gmail.com

DOI: 10.1201/9781003529231-45

stream used for media consumption affects an individual's decision-making and alters one's impressions of actual events. The influence is even more damaging at the organizational level because it jeopardizes brand names and may influence how goods or services are consumed [4]. Because of increasing online media consumption and bots (e.g., Twitter bots) that automate data dissemination, news bulletins posted on social media have worsened the problem. A recent study of verified fake news in the three months leading up to the 2016 election found 38 million Facebook shares of presidential candidate support [4]. News verification systems have improved, addressing the need for automated methods to separate false news from authentic news in the vast amount of available information [5]. Recent fake news detection methods fall into two methodological categories: language-based and network-based tactics. Lexical methods (e.g., natural language processing, or NLP) have investigated false information trends by examining underlying semantics. Network techniques, on the other hand, use existing knowledge networks to verify truths. In numerous respects, our inquiry contributes to the existing body of knowledge. Firstly, a unique analytics-based approach for fake news detection is presented, which employs a topic-based categorization mechanism to partition genuine content into various subject categories; the news items in each group share a similar theme. An event-extraction method then extracts the actions from these media items. Secondly, by contrasting the events obtained from an article with those in real news, we construct and apply a trustworthiness criterion for establishing the legitimacy of any data.

2. Literature Review

The creation and dissemination of fake news pose substantial hazards from various viewpoints, particularly public protection. A clear instance of this would be purposeful misinformation that tries to affect a person's opinion of another person or of national polls. Politically divided consumers in the US and Europe desire information from like-minded sources. This may reflect confirmation bias, or "tunnel vision," which involves creating one-sided scenarios based on previous preconceptions or ideologies [6]. Contrary to the confirmation bias concept, other research shows that people are misled by false information because they fail to think logically about news rather than because they are motivated to believe it. [7] addresses different cognitive biases that operate as obstacles to analyzing and resolving disinformation whenever humans evaluate bogus information or disinformation.
The rise of misleading information risks misleading readers by preying on their need for agreeable information; readers also tend to lack logical scrutiny while reading national media. The "echo chamber" or "filter bubble" situation created through social networks magnifies the demand for agreeable news items. Consumers using social media platforms prefer to connect selectively with people, sharing their ideas and consuming material that appeals to their interests. The effect is amplified by social media's personalization options [8]. Thus, false information pushes consumers' points of view to become even more polarized, increasing the risk of data polarization. The data polarization impact is caused by differential incorrect information consumption, which in turn is caused by selective disinformation exposure. Fake stories have frequently been followed by fact-checks provided by various media outlets.
The distributed HDFS-Spark parallel computing architecture helps the MLSRM capture and store web surfing data effectively. MLSRM then created a reinforcement method to intelligently pick and integrate different Markov decision processes to obtain actionable information for comprehending online user browsing behaviors with decreased state complexity and enhanced forecasting performance [9]. On the other hand, internet access has promoted self-expression and socialization online. One of the most popular modern social networking platforms, Twitter, produces gigabytes of data daily; a recent study investigates whether internet profiles and activities indicate people's personalities [10]. SS Reddy et al. [11] proposed a sentiment analysis methodology for assessing student comments utilizing long short-term memory. The suggested model is compared to Naive Bayes, decision trees, and random forests. Based on a thorough categorization process, the IDCS has four steps. IDCS begins by methodically collecting data from diverse sources on rural women's nutritional knowledge, attitudes, and dietary patterns. After that, it pre-processes the data according to conventional methods and organizes it for categorization. Next, a learning algorithm develops an intelligent classifier to divide rural women into appropriate nutritional categories [12]. Attrition is the gradual loss of firm personnel without replacement; high attrition rates cause talent loss, inadequate research, and wasted training expenditures [13].
Social networking sites have become essential tools for people to interact. Twitter is the most popular microblogging service and a growing social network. Social media contains much data in tweets, forums, status updates, comments, etc., and applications may use sentiment analysis to process and evaluate this data automatically [14]. Stock market projection predicts the future value of equities exchanged within a financial system [15]. Graders find delivering comments with a consistent interface, mindset, and deadline difficult. For optimum grading accuracy, a bag of words, sentence and word counts, average length, structure, and organization of an essay are employed [16]. Organizations struggle with employee attrition when skilled, technical, and

essential people depart for better opportunities. Replacing a qualified staff member costs money; thus, the authors examine current and previous employee data to determine prevalent attrition factors [17]. Social networks like Twitter provide user-generated news data. Clustering the data and providing just the important information makes this resource valuable. For data filtering, density-based k-means and graph clustering are used. After filtering, the data are ranked by keyword frequency, relevant key terms, and dataset key-term similarity [18]. A student feedback mining system (SFMS) uses text analytics and sentiment analysis to give educators quantifiable and in-depth analysis of qualitative student input, enhancing learning experiences [19]. This takes time and effort, so the assessment method is automated. Keywords are used as features since machines cannot directly grasp human rating metrics. First, the student essay is matched against administrator keywords; articles with less than 20% similarity are not considered good enough [20].
According to the "echo chamber" tendency, partisan news users systematically judge and share fact-checking content, as shown in research [21]. On the topic of fact-checking, various studies on vote-based systems have yielded diverse outcomes. When fact-checks of disinformation are provided to individuals, a "backfire effect" can happen, in which they emotionally counter-argue and reinforce their earlier incorrect impressions. However, [22] presented no proof of a true backfire effect in a recent survey. Whereas fact-checks could help rectify the news for the record, they are fruitless as hazard mitigations: they are now almost entirely ineffectual in preventing the spread of incorrect information and data polarization in the first place. This highlights the urgent need for more reliable false information identification systems to stop the spread of misleading information.

3. Fake News Detection

Consequently, social media websites must be very cautious and start incorporating false news recognition approaches. Cross-platform procedures, on the other hand, have received little consideration, although identifying fake news that originates from many websites could be an efficient approach for authorities. Fake news is made by generating false information or altering essential information. Fake news achieves credibility by (1) copying well-known writers' styles of writing or (2) conveying viewpoints in a tone that is common in real news. A growing number of false information detection algorithms have lately been created. All present detection methods can be classified into linguistic-based and network-based approaches [23]. In network-based methods for fake media identification, network properties act as a supporting element for the various linguistic-based techniques. Website data, editors' data, timestamps, and other network properties are examples of widely used network attributes. For instance, one study involves customer evaluation to eliminate disinformation in a Parkinson's disease-related social networking community [24].
According to this study, disinformation in a discussion forum depends on the publisher's content and consumer characteristics. Other research provides a technique for evaluating the quality of replies in an internet crowd-sourced survey report, the readability of the thread topics, and the users' ability to contribute usefully [25]. Unfortunately, previous sentiment and syntax analysis algorithms have been tailored to specific data categories, making them ineffective for detecting fake news.
CNT orchestrates various ways of selecting weblog features to identify falsehoods [26]; it was demonstrated that the optimum combined features may recognize satirical information with about 90 percent accuracy and 84 percent recall while evaluating various selected features using only 360 news stories. However, those tactics might lead unscrupulous authors to generate false propaganda without displaying recognizable qualities. Evaluation of emotion and syntax: analysis of the data and lexicon techniques are used to identify aberrant data inside textual data with great accuracy [27]. Analytical statistics: Hancock, Woodworth, and colleagues [28] suggested a method for studying the characteristics of crime stories. Their findings demonstrate that psychopaths' speech contains the highest frequency of distortions and that they employ more past-tense words in stories than present-tense words. [29] developed the rhetorical structure formula, or RST, with the Word Vectors method to discern the difference between authentic and false text information.
Sentiment analysis is a frequently used method of identifying dishonesty, specifically misleading spam. [30] suggested PU-learning to identify fraudulent spam by analyzing real and fake reviews. Analysis of linguistic clues [31] shows that linguistic cues drawn from dishonesty concepts, combined with information clues based on textual information, could be beneficial in identifying dishonest and non-fraudulent projects under crowd-funding policies.
Deep syntax analysis: PCFG (probabilistic context-free grammars) is a viable way to separate phrases into rewrite trees that describe syntactic structures using deep syntax research. [32], for instance, examined syntactic stylometry for detecting fraud using hotel web data and characteristics obtained from context-free grammar (i.e., CFG) parse trees. Unfortunately, existing prediction strategies have been developed for specific data types or contextual factors, such as spam review detection [33] and spam mail recognition, and thus are insufficient for general-purpose spam filtering that might cover a variety of topics or difficulties.

4. Topic Extraction

Banko presented TextRunner as one of the earliest, yet highly flexible, OIE platforms (2007). However, only a few notable OIE methods have been designed since TextRunner. [34] presented an alternative OIE technique (ClausIE) in 2013, which decomposes sentences into a collection of "clauses" to preserve the data quality of unique text documents. Similar work by [35] incorporates comparable extraction stages but extends the possibilities of the technique by incorporating contextual phrase deconstruction to aid lexical searches [36]. [37] demonstrated a method for extracting text associations using non-verb phrases. OLLIE and ClausIE have been used to verify the extraction outcomes; their findings indicate that ClausIE discovers more additional associations than OLLIE, implying that ClausIE performs better than OLLIE.

5. Fake News Retrieval Forms

Past attempts to detect fake news have taken a variety of forms. For example, [38] alerts users to untrustworthy news sources by searching most links on a provided website for seeds collected in an untrustworthy news dataset. It also incorporates test results for false information, satire, extreme bias, conspiracy theories, rumor mills, state news, junk science, and the like. This strategy uses a knowledge base of unreliable connections, even though the repository maintains essential and complex information to aid fake news identification. But unlike the web plugin, our method uses the news stories themselves to undertake an in-depth evaluation and calculate reliability rankings. PolitiFact is a six-dimensional grading system for fact-checking that is routinely used to assess the truthfulness and reliability of assertions made by US authorities and many others [39]. However, the PolitiFact methodology is heavily reliant on human involvement, with reporters analyzing evidence by watching television, monitoring social networking sites, and analyzing reader comments. Unlike PolitiFact, our system depends on artificially intelligent methods that analyze text information sources instead of the intervention of many reporters.
Fake News Detector AI uses artificial intelligence methods, such as black-box models, to identify fake news sites by comparing their similarities to known fraudulent sites [40]. On recognized sites, that process applies a neural network-based feature analysis approach (e.g., headlines, coding structure, spot popularity) to determine the legitimacy of the evaluated internet sites. Regarding feature kinds, our system differs from this detection method: Fake News Detector AI uses network-based characteristics, while our approach uses semantic-based factors.

6. The Analytics Paradigm

Topics and events: The theoretical and statistical foundations of the suggested analytical solution for determining the authenticity of news items are described in this part, which begins by explaining how complete and incomplete phrases are constructed. Following that, we formalize the definitions of actions and themes derived from entire phrases. Boolean-valued functions are defined, which distinguish fraudulent occurrences and subjects from authentic ones. Lastly, we go over the mathematical technique for determining the reliability of news items [41].

Fig. 45.1 By sorting authentic news stories into newsgroups, the model-training methodology creates ground-truth sources of knowledge

Topics or incidents can be used

to identify false propaganda. A news story is composed of several sentences, α = {σ1, σ2, . . ., σn}; we represent an item as a collection of n sentences. Based on the presence of the object set Oi in the triples of the ith phrase, we divide the phrases in the text into whole and unfinished phrases. If the object set Oi exists in the phrase triple, we refer to phrase i as a complete sentence; otherwise (i.e., Oi = ∅), we refer to it as an incomplete sentence [42]. For two main reasons, the model suggested in this work identifies bogus information from the complete sentence set Scp instead of the incomplete phrase set Sic. Firstly, due to the absence of objects, incomplete phrases carry only fragments of information. Secondly, declarative phrases (those conveying facts) can be among the unfinished phrases in the four known groups of phrases (i.e., declarative, interrogative, imperative, and exclamatory) [43]. Three incomplete phrase samples follow. Incomplete Sentence 1: Ram is lying. 2: It's rainy out there. 3: Sea water disappears while it's heated. Some phrases might include false information, while others may contain accurate data. From the standpoint of words, we discuss the disparity between false incidents and false themes in the following sections. We begin such a comparison by formally presenting happenings and ideas (Fig. 45.1).
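To make the complete/incomplete distinction above concrete, the check below is a minimal Python sketch: it assumes OIE output in the form of (subject, predicate, object) triples and marks a sentence as complete only when its object slot Oi is non-empty. The triple format and sample sentences are illustrative assumptions, not the chapter's actual data structures.

# A sentence is complete iff its object set O_i is non-empty.
def is_complete(triple):
    subject, predicate, obj = triple
    return bool(obj)

triples = {
    "Ram is lying.": ("Ram", "is lying", ""),
    "Sea water disappears while it's heated.": ("sea water", "disappears", ""),
    "The senate passed the bill.": ("the senate", "passed", "the bill"),
}

S_cp = [s for s, t in triples.items() if is_complete(t)]      # complete set
S_ic = [s for s, t in triples.items() if not is_complete(t)]  # incomplete set
print("S_cp:", S_cp)
print("S_ic:", S_ic)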
Trust and evaluation processes: The author's trustworthiness is determined using a function g(α) derived from the logical function fE. A story's frequency of authentic occurrences is used to determine its trustworthiness. As a result,

g(α) = (Σ_{j=1}^{Wi} fE(eij)) / Wi    (1)

where eij denotes the jth event of the ith article, Wi is the number of events in the article, and fE is the Boolean event-authenticity function. We use Eq. (1) to evaluate an author's trustworthiness in news reporting. An article is classified as false if its trustworthiness is too low (e.g., below 0.6). During the trustworthiness assessment phase, we consider every occurrence in the article equally, for convenience. A story is likely to be phony if a significant incident did not happen. During the classification phase, the weight of every incident could be applied to signify its significance. During the falsified identification process, many incidents are unintentionally produced, and it is reasonable to suppose such occurrences are usually unrelated to each other. For instance, we retrieved nearly 200,000 themes from data of 14,221 items; the number of events is greater than the variety of topics. As a result, relying on incident ratings to discern essential from insignificant occurrences is unworkable [44]. Instead, consumers can specify the relevance of events by directly assigning a higher weight to a more profoundly relevant occurrence.
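Read concretely, Eq. (1) averages the Boolean event checks fE over an article's Wi events and compares the result with the cutoff. The sketch below is a minimal illustration under stated assumptions: fE is stood in by membership in a toy set of verified events, the optional weights mirror the user-assigned event loads just mentioned, and 0.6 is the threshold quoted in the abstract.

# Minimal sketch of the trustworthiness score g(alpha) from Eq. (1).
# fE(e) is assumed to return 1 for a verified event and 0 otherwise; here
# it is stood in by membership in a toy ground-truth event set.
VERIFIED = {("senate", "passed", "bill"), ("cnn", "reported", "storm")}

def f_E(event):
    return 1 if event in VERIFIED else 0

def g(events, weights=None):
    """Weighted average of fE over the article's Wi events (Eq. 1)."""
    if not events:
        return 0.0
    weights = weights or [1.0] * len(events)
    total = sum(w * f_E(e) for e, w in zip(events, weights))
    return total / sum(weights)

article = [("senate", "passed", "bill"), ("aliens", "landed", "ohio")]
score = g(article)
print(score, "-> fake" if score < 0.6 else "-> genuine")  # 0.5 -> fake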
7. Proposed Work

This part discusses the tool for identifying false propaganda and describes the recommended research methodology. Following that, we go over the internet crawler's design and the pseudo-code that accompanies it. We then describe the data processing and classification, review the analytic techniques used to group and classify fake news, and finally combine the complete architecture and its numerous elements to create FEND, a new fake news detection software [45].

7.1 Research Framework

The architecture that governs the creation of FEND is shown in Figure 2. FEND's model-training approach divides news items into groups based on subjects, so that news items in the same cluster share the same collection of topics; the subject sets of items categorized into different groups are diverse [46]. False information is identified in two stages (see Fig. 10): (1) falsified identification utilizing article groups, and (2) falsified recognition utilizing word similarities, using media groupings created from media themes. A news item is suspected of being false if (1) it cannot be categorized into any group or (2) its words achieve only a minimal degree of similarity to the relevant words in its newsgroup.
Dealing with analogs is a crucial issue to be solved. Our suggested models handle that problem in various ways, including techniques like tokenization, stemming, and component labeling, to guarantee that duplicates and ambiguity are removed during the pre-processing phase. We use several capabilities of the word-vectors library to discover synonyms of the predicate occurrence in a phrase list [47, 48].
The following is an example of a detection process: the first step takes most of the predicate's synonym inputs and every term in the verb listing. The second step is to evaluate the currently examined predicate against every verb in the contrasting word list in terms of synonym factors to find the argument pairings with greater resemblance. The final goal is to gather a certain quantity of synonym pairings (e.g., 100) to accurately measure a minimum resemblance threshold (e.g., 86.6 percent), which is then contrasted with the highest correlation acquired among the step-2 equivalents [49].
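The synonym-matching step above reduces to a maximum-similarity test over word vectors. The sketch below expresses it with gensim's pretrained embeddings; the model name, the verb list, and the 0.866 cutoff (the 86.6 percent threshold above) are illustrative assumptions rather than the chapter's exact configuration.

import gensim.downloader as api

kv = api.load("glove-wiki-gigaword-50")  # any word2vec-style vectors work

def max_similarity(predicate, verbs):
    """Highest embedding similarity between a predicate and a verb list."""
    if predicate not in kv:
        return 0.0
    sims = [kv.similarity(predicate, v) for v in verbs if v in kv]
    return max(sims) if sims else 0.0

genuine_verbs = ["report", "announce", "confirm", "state"]
THRESHOLD = 0.866  # the minimum resemblance threshold cited above

for predicate in ["declare", "fabricate"]:
    sim = max_similarity(predicate, genuine_verbs)
    verdict = "consistent with genuine usage" if sim >= THRESHOLD else "suspicious"
    print(f"{predicate}: {sim:.3f} -> {verdict}")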
The training data collection component of the system collects original data from trustworthy news websites and filters out noise such as adverts. This component is implemented by employing a customized crawler built exclusively for constructing the repositories and executing streaming data operations.

The data pre-processing component utilizes text-processing techniques to retrieve subjects and occurrences from freshly gathered news data. Finally, per the obtained activities, the clustering component divides the media stories into distinct groups.

8. Method: Data Processing and Clustering

The unprocessed information gathered by the web crawler feeds the building of the ground-truth and misinformation databases in the fake news detection architecture (see also Fig. 2). We construct a global search engine to extract news from multiple source websites, to be checked by the false information detection approach, in order to undertake broad trials. Combining the triple extraction process with OIEs, phrase text categorization, validation, stemming, asset labeling, occurrence gathering, deconstruction, and subject data compression, we create a phrase pipeline (see Fig. 45.2). The internet crawler's original data is then subjected to numerous pre-processing modifications for annotation, topic collection, and event extraction. The Natural Language Toolkit is utilized together with the Stanford CoreNLP package, which offers a framework for conducting a series of language annotation processes such as lemmatization, tagging, phrase checking, stemming, and part-of-speech tagging [50, 51].

Fig. 45.2 The procedure for producing activities and subjects from news stories

Every article in the ground-truth and fake media corpora is segmented into several phrases, subsequently transformed into "tokens" (a single word or a string of continuous characters). The stemming step takes the results of the tokenization procedure and conducts feature extraction on every token produced. By abbreviating words to their roots, this step eliminates the repetition of infrequent word statistics. The triple data repository links the OIE technologies and the phrase process. The incident and phrase data are used to create the subject and phrase databases. Tokens are converted into binary vectors during the segmentation procedure, which are then used to generate topics. We employed a word-weighting technique called Term Frequency-Inverse Document Frequency (TF-IDF) [52]. This methodology makes it possible to rate the term frequency of tokens according to their significance in the content. The TF-IDF step leverages the scikit-learn library, a Python-based machine learning library, linked to the preceding pipeline documents. In the TF-IDF technique, the term frequency (TF) metric of every topic occurrence inside a manuscript is graded by its significance (i.e., its IDF).
IDF evaluates the importance and produces a set of weights for every topic inside the database, whereas TF reflects the term frequency of every subject. The following formula is used to calculate the unprocessed TF-IDF values:

tf-idf(t, d, D) = tf(t, d) × idf(t, D)

The letters t, d, and D represent a theme, a document, and the set of documents in the repository, respectively. TF measures the count of each topic occurring in each post, tf(t, d), and IDF is calculated in the standard way as idf(t, D) = log(N / |{d ∈ D : t ∈ d}|), where N is the total number of documents. Lastly, the Euclidean (L2) norm is applied to the raw TF-IDF values for normalization. The method of producing occurrences and themes from news stories is depicted in Fig. 45.2. The triple data center serves as a link between the OIE instruments and the word-processing pipelines. The word-processing pipeline extracts incidents into an events database, and we create a module that divides the incident database into subject and verb databases [53].
This allows for the categorization of news items in later phases; the concept database is used to fuel topic-based article classification at a later stage. We employ two cluster analyses to train the three independent dataset methods: k-means and affinity propagation. This approach allows us to put our fake story identification hypothesis, which treats topics as attributes of news articles, to the test. Adopting these two different methods can be explained in two ways [54]. To begin with, unlike more complicated procedures, these two clustering algorithms are easy to implement. Secondly, we select one technique that requires the cluster count in advance (k-means) and one that does not (affinity propagation) as the extremes of the grouping spectrum. The k-means clustering approach works as follows:
1. Choose k centroids randomly.
2. Calculate the distance function between each point and the centroids, then record the current clusters.
3. Recompute the distances within each cluster's assigned data points and select new cluster centers.
4. Repeat Steps 2 and 3 for further iterations or until the grouping is stable.
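Because the chapter names NLTK, scikit-learn's TF-IDF, k-means, and affinity propagation explicitly, the processing stage can be sketched end to end as below. This is a minimal illustration, not FEND's actual code: the toy corpus is invented, k = 2 is arbitrary, and norm="l2" stands in for the Euclidean normalization described above.

import nltk
from nltk.stem import PorterStemmer
from sklearn.cluster import AffinityPropagation, KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

nltk.download("punkt", quiet=True)  # tokenizer model, one-time download

corpus = [
    "Senate passed the new budget bill.",
    "Senators debate the budget again.",
    "Actress wins the film award.",
    "A film tribute honors the actress.",
]

# Tokenize and stem each article, mirroring the pre-processing stage.
stemmer = PorterStemmer()
stemmed = [
    " ".join(stemmer.stem(tok) for tok in nltk.word_tokenize(doc.lower()))
    for doc in corpus
]

# tf-idf(t, d, D) = tf(t, d) x idf(t, D), L2-normalized per document.
X = TfidfVectorizer(norm="l2").fit_transform(stemmed)

# k-means needs the cluster count up front; affinity propagation infers it.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
ap = AffinityPropagation(random_state=0).fit(X.toarray())

print("k-means labels:", km.labels_)
print("affinity propagation labels:", ap.labels_)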

9. Explanation of the Findings

Using two real-world news datasets, we ran thorough experiments to quantitatively demonstrate FEND's improved performance in this section. We evaluated the suggested technique using four measures: consistency, reliability, recall rate, and F-score, to determine its usefulness. We use genuine news from CNN and the New York Times to objectively test the effectiveness of the new strategy during training and testing. The significance of identification is assessed using only a small number of testing datasets, also listed in Table 3. The news outlets CNN and the New York Times were taken as ground truth in this research based on qualities determined by a large research group. In contrast to their competition, they reach a much larger audience [55]. CNN, for instance, is seen in 96 million paying households, or 82.8 percent of all households with access to a TV.
Because CNN's material contradicts some users' beliefs, only a tiny percentage of viewers may see CNN as a reliable source. On the other hand, this system offers a robust procedure that allows users to include any reliable news source: consumers could establish an entire news archive by choosing from credible news sources. Regardless of the information sources, our engine appears able to detect false propaganda. To acquire misinformation as underlying data, we study a group of websites labeled as misleading information sites [56]. Among them are the websites www.greenvillegazette.com, www.politicot.com, www.advocate.com, and www.naturalnews.com. The very first three sites disseminated fake pro-Clinton disinformation, whereas the last two posted fake anti-Trump misinformation. Our unbiased and truthful evaluations are supported by data acquired from both pro-Clinton and pro-Trump websites. It must be remembered that Natural News is a website that has gained prominence due to scientific deception and conspiracy theories.

10. Evaluation

As previously stated, we use two distinct clustering strategies, k-means and affinity propagation (AP), to retrieve news items as characteristics. The number of clusters generated by both methodologies was equal; therefore, only minor differences in the makeup of the groups created using the two methods may exist. The Euclidean distance is used across the algorithms to classify items into separate groups. Both techniques provided similar findings on our data; as a result, we concentrate on the groups created via the AP approach [57].
Table 45.1 shows the themes recognized in ten of the groups and the overall number of subjects in every cluster. For example, cluster 1 (see Table 45.1) has the most items (about 29,900) and contains terms like "international," "residents," "working process," "leadership," "stocks," "Obama," "gov't," "group," "American," and so on. These topics are linked to different verbs, making it much easier to categorize the content. Cluster 20 is the smallest group, with around 5,000 items, and covers themes such as "actress," "theory," "nominations," "film," "tribute," "portrayal," "rewards," "fortune," "spirit," "characters," "activities," and so on [58].

Table 45.1 Article cluster collection for ground truth

Cluster No. | Subjects (selected) | Repeated Topics
1 | Trump, campaigning, democratic, Californian, political, judgment, senators, CNN, candidate, truths, suffering, reporters | 13,456
2 | Photographers, healthcare, trekkers, earthquakes, migratory, artists, townships, and vision | 11,765
3 | Gadgets, database systems, Amazon, enterprise, marketplace, web, engineering, worth, global, Google, and opponents | 11,000
4 | Way of life, story, police, directors, area, darkness, issue, attractiveness, perspectives, assumptions, deficiency, inspiration, and exhibitions | 8,984
5 | North Korea, embassy, presentation, isolation, monitoring, emergency, strategy, war, conference, information, and recon | 8,955
6 | Visitors, specialists, incubators, universities, recruitment, inquiries, officials, investigations, images, operations | 9,000
7 | Migrants, nationality, westerners, terrorism, police, sanctuary, radicals, extremism, governmental, arrests, Pakistani, psychiatrist, pursue | 790
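Summaries like Table 45.1 are typically obtained by ranking each cluster's heaviest centroid terms. The sketch below, a minimal continuation of the earlier scikit-learn example with an invented corpus, prints the top-weighted terms per k-means cluster; it illustrates the idea only and is not the chapter's reporting code.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "trump campaigning senators candidate reporters",
    "political judgment candidate democratic senators",
    "actress film tribute nominations awards",
    "film portrayal actress characters awards",
]
vec = TfidfVectorizer(norm="l2")
X = vec.fit_transform(corpus)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

terms = vec.get_feature_names_out()
for c, centroid in enumerate(km.cluster_centers_):
    top = np.argsort(centroid)[::-1][:5]  # five heaviest terms per cluster
    print(f"cluster {c}:", ", ".join(terms[i] for i in top))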


The findings show that the first-level filter's detection accuracy varied greatly among the databases and had high detectability [59]. For instance, the advocate.com database has the highest detection accuracy (66.9%), while the politicot.com database has the poorest (i.e., 4.4 percent). The first-level filtering accurately detects approximately 65 percent of the information on advocate.com, indicating that a

considerable part of the media on advocate.com has bogus subjects (i.e., type-1 fake news).

Table 45.2 The initial filtering found a large percentage of bogus news items

Category | Advocate | Naturalnews | Politicot | Greenville gazette
1 Uncertain News | 6543 | 2567 | 2955 | 1600
2 Fake Topics | 4534 | 498 | 145 | 500
3 Remaining Data | 2245 | 1976 | 3000 | 1123

naturalnews.com and greenvillegazette.com have relatively low type-1 detection levels (i.e., 21.1 percent and 31.4 percent, respectively), suggesting that many articles reported on these sites are phony yet have legitimate topics. Only 4.4 percent of the news on politicot.com is classified as type-1 false information, implying that practically every report on politicot.com appears reliable to the first filter. The findings show that the first filtering layer is a helpful method for identifying a news source's credibility [60]. The first layer's output is fed into the second-layer filter. The second-level filtering identifies the believability of fake news and makes it easier to compare specific false-information ratings to the threshold.

11. Conclusion

As fake news becomes prevalent and hard to spot, better detection methods are needed. Fake news misinforms users and targets, who may be persons or organizations. Misleading information may cost organizations their competitive advantage or reputation, while misleading statements can confuse people and affect their attitudes and decisions. Fake news identification using novel analytical technologies is presented in this research, and FEND, which builds and verifies the false information identification framework, is then discussed. The system employs two layers to categorize: false topic detection by the first layer and tracking by the second produce 92.49 percent accuracy. Our study is distinctive in that each news story is translated into actions instead of identifying bogus articles using syntactic rules or attitudes.

References

1. Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election. Technical report. National Bureau of Economic Research.
2. Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python: Analysing text with the natural language toolkit. O'Reilly Media, Inc.
3. Chen, C., Wang, Y., Zhang, J., Xiang, Y., Zhou, W., & Min, G. (2017). Statistical features-based real-time detection of drifted Twitter spam. IEEE Transactions on Information Forensics and Security, 12(4), 914–925.
4. Cinque, G. (2014). The semantic classification of adjectives: A view from syntax. Studies in Chinese Linguistics, 35(1), 1–30.
5. Conroy, N. J., Rubin, V. L., & Chen, Y. (2015). Automatic deception detection: Methods for finding fake news. Proceedings of the Association for Information Science and Technology, 52(1), 1–4. Del Corro, L., & Gemulla, R. (2013).
6. Fusilier, D. H., Montes-y Gómez, M., Rosso, P., & Cabrera, R. G. (2015). Detecting positive and negative deceptive opinions using learning. Information Processing & Management, 51(4), 433–443.
7. Golbeck, J., Mauriello, M., Auxier, B., Bhanushali, K. H., Bonk, C., Bouzaghrane, M. A., Everett, J. B., et al. (2018). Fake news vs satire: A dataset and analysis. In Proceedings of the Tenth ACM Conference on Web Science (pp. 17–21). ACM.
8. Gross, M. (2017). The dangers of a post-truth world. Guess, A., Nyhan, B., & Reifler, J. (2018). Selective exposure to misinformation: Evidence from the consumption of fake news during the 2016 US presidential campaign. Technical Report. Dartmouth College. https://www.dartmouth.edu/~nyhan/fake-news-2016.pdf
9. Rao VV, Silpa N, Gadiraju M, Shankar RS, Vijaya K. An Optimal Machine Learning Model Based on Selective Reinforced Markov Decision to Predict Web Browsing Patterns. Journal of Theoretical and Applied Information Technology. 2023 Jan 31; 101(2): 859–73.
10. VVR MR, Silpa N, Gadiraju M, Reddy SS, Bonthu S, Kurada RR. A Plausible RNN-LSTM based Profession Recommendation System by Predicting Human Personality Types on Social Media Forums. In 2023 7th International Conference on Computing Methodologies and Communication (ICCMC), 2023 Feb 23 (pp. 850–855). IEEE.
11. Reddy SS, Gadiraju M, Maheswara Rao VV. Analyzing Student Reviews on Teacher Performance Using Long Short-Term Memory. In Innovative Data Communication Technologies and Application: Proceedings of ICIDCA 2021, 2022 Feb 24 (pp. 539–553). Singapore: Springer Nature Singapore.
12. Maheswara Rao VV, Silpa N, Mahesh G, Reddy SS. An Enhanced Machine Learning Classification System to Investigate the Status of Micronutrients in Rural Women. In Proceedings of International Conference on Recent Trends in Computing: ICRTC 2021, 2022 (pp. 51–60). Springer Singapore.
13. Shankar RS, Priyadarshini V, Neelima P, Raminaidu CH. Analyzing Attrition and Performance of an Employee using Machine Learning Techniques. In 2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA), 2021 Dec 2 (pp. 1601–1608). IEEE.
14. Kameswari KK, Raghaveni J, Shankar RS, Rao CS. Predicting Election Results using NLTK. International Journal of Innovative Technology and Exploring Engineering. 2019; 9: 4519–29.

15. Jyothirmayee S, Kumar VD, Rao CS, Shankar RS. Predicting stock exchange using supervised learning algorithms. International Journal of Innovative Technology and Exploring Engineering. 2019; 9(1): 4081–90.
16. Shiva Shankar R, Ravibabu D. Digital Report Grading Using NLP Feature Selection. In Soft Computing in Data Analytics: Proceedings of International Conference on SCDA 2018, 2019 (pp. 615–623). Springer Singapore.
17. Shankar RS, Rajanikanth J, Sivaramaraju VV, Murthy KV. Prediction of employee attrition using data mining. In 2018 International Conference on System, Computation, Automation and Networking (ICSCAN), 2018 Jul 6 (pp. 1–8). IEEE.
18. Shankar RS, Murthy KV, Rao CS, Gupta VM. An approach for extracting tweets from social media factors. In 2018 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), 2018 Jul 6 (pp. 1–7). IEEE.
19. Shankar RS, Srinivas LV, Ravibabu D, Raminaidu C. Novice Retroaction Report. ARPN Journal of Engineering and Applied Sciences. 2006; 13.
20. Shankar RS, Babu DR, Murthy KV, Gupta V. An approach for essay evaluation using system tools. In 2017 International Conference on Innovative Research in Electrical Sciences (IICIRES), 2017 Jun 16 (pp. 1–9). IEEE.
21. Hua, W., Wang, Z., Wang, H., Zheng, K., & Zhou, X. (2017). Understand short texts by harvesting and analyzing semantic knowledge. IEEE Transactions on Knowledge and Data Engineering, 29(3), 499–512.
22. Iyengar, A., Kalpana, G., Kalyankumar, S., & GunaNandhini, S. (2017). Integrated spam detection for multilingual emails. In Proceedings of the 2017 International Conference on Information Communication and Embedded Systems (ICICES) (pp. 1–4). IEEE.
23. Jang, S. M., Geng, T., Li, J.-Y. Q., Xia, R., Huang, C.-T., Kim, H., & Tang, J. (2018). A computational approach for examining the roots and spreading patterns of fake news: Evolution tree analysis. Computers in Human Behavior, 84, 103–113.
24. Jin, Z., Cao, J., Jiang, Y.-G., & Zhang, Y. (2014). News credibility evaluation on microblog with a hierarchical propagation model. In Proceedings of the 2014 IEEE International Conference on Data Mining (ICDM) (pp. 230–239). IEEE.
25. Klein, D. O., & Wueller, J. R. (2017). Fake news: A legal perspective. Journal of Internet Law, 20(10), 1, 6–13.
26. Lau, R. Y. K., Zhang, W., & Xu, W. (2018). Parallel aspect-oriented sentiment analysis for sales forecasting with big data. Production and Operations Management, 27(10), 1775–1794. doi:10.1111/poms.12737.
27. Li, H., Gupta, A., Zhang, J., & Flor, N. (2018). Who will use augmented reality? An integrated approach based on text analytics and field survey. European Journal of Operational Research. doi:10.1016/j.ejor.2018.10.019.
28. Lin, Y.-S., Jiang, J.-Y., & Lee, S.-J. (2014). A similarity measure for text classification and clustering. IEEE Transactions on Knowledge and Data Engineering, 26(7), 1575–1590.
29. Michalon, O., Ribeyre, C., Candito, M., & Nasr, A. (2016). Deeper syntax for better semantic parsing. In COLING 2016. Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the ACM, 38(11), 39–41.
30. Nickerson, R. S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2(2), 175–220. doi:10.1037/1089-2680.2.2.175.
31. Nyhan, B., & Reifler, J. (2010). When corrections fail: The persistence of political misperceptions. Political Behavior, 32(2), 303–330. Open Sources (2017).
32. Pennycook, G., & Rand, D. G. (2017). Who falls for fake news? The roles of analytic thinking, motivated reasoning, political ideology, and bullshit receptivity. SSRN Electronic Journal, September, 1–63. doi:10.2139/ssrn.3023545.
33. Qazvinian, V., Rosengren, E., Radev, D. R., & Mei, Q. (2011). Rumor has it: Identifying misinformation in microblogs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 1589–1599). Association for Computational Linguistics.
34. Rashkin, H., Choi, E., Jang, J. Y., Volkova, S., & Choi, Y. (2017). Truth of varying shades: Analyzing language in fake news and political fact-checking. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 2931–2937). Association for Computational Linguistics. doi:10.18653/v1/D17-1317.
35. Rubin, V. L., Chen, Y., & Conroy, N. J. (2015a). Deception detection for news: Three types of fakes. Proceedings of the Association for Information Science and Technology, 52(1), 1–4.
36. Rubin, V. L., Chen, Y., & Conroy, N. J. (2015b). Deception detection for news: Three types of fakes. In Proceedings of the Seventy-Eighth ASIS&T Annual Meeting: Information Science with Impact: Research in and for the Community, ASIST '15 (pp. 83:1–83:4). Silver Springs, MD, USA: American Society for Information Science.
37. Rubin, V. L., Conroy, N. J., Chen, Y., & Cornwell, S. (2016). Fake news or truth? Using satirical cues to detect potentially misleading information. In Proceedings of NAACL-HLT (pp. 7–17).
38. Rubin, V. L., & Lukoianova, T. (2015). Truth and deception at the rhetorical structure level. Journal of the Association for Information Science and Technology, 66(5), 905–917.
39. Sahu, I., & Majumdar, D. (2017). Detecting factual and non-factual content in news articles. In Proceedings of the Fourth ACM IKDD Conference on Data Sciences, CODS '17 (pp. 17:1–17:12). New York, NY, USA: ACM. doi:10.1145/3041823.3041837.
40. Shin, J., & Thorson, K. (2017). Partisan selective sharing: The biased diffusion of fact-checking messages on social media. Journal of Communication, 67(2), 233–255. doi:10.1111/jcom.12284.
41. Siering, M., Koch, J.-A., & Deokar, A. V. (2016). Detecting fraudulent behavior on crowdfunding platforms: The role of linguistic and content-based cues in static and dynamic contexts. Journal of Management Information Systems, 33(2), 421–455.

42. Silverman, C. (2015). Lies, damn lies, and viral content: How news websites spread (and debunk) online rumors, unverified claims, and misinformation. Technical Report. New York, NY: Tow Center for Digital Journalism, Columbia Journalism School, Columbia University.
43. Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A., & Potts, C. (2013). Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (pp. 1631–1642).
44. Stepinski, A., & Mittal, V. (2007). A fact/opinion classifier for news articles. In Proceedings of the Thirtieth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '07 (pp. 807–808). New York, NY, USA: ACM. doi:10.1145/1277741.1277919.
45. Swartz, J. (2017). The World Wide Web's inventor warns it's in peril on 28th anniversary. USA Today. www.usatoday.com/story/tech/news/2017/03/11/world-wide-webs-inventor-warns-s-peril/99005906/
46. Tang, R., Ouyang, L., Li, C., He, Y., Griffin, M., Taghian, A., . . . Hughes, K. (2018). Machine learning to parse breast pathology reports in Chinese. Breast Cancer Research and Treatment, 1–8.
47. Venkatesan, S., Han, W., Kisekka, V., Sharman, R., Kudumula, V., & Jaswal, H. S. (2013). Misinformation in online health communities. In WISP 2012 Proceedings (p. 28).
48. Venkatesan, S., Han, W., & Sharman, R. (2014). A response quality model for online health communities. In Proceedings of the Thirty-Fifth International Conference on Information Systems (p. 28).
49. Tsai, M.-F., & Wang, C.-J. (2017). On the risk prediction and analysis of soft information in finance reports. European Journal of Operational Research, 257(1), 243–250.
50. Wang, P., Xu, B., Xu, J., Tian, G., Liu, C.-L., & Hao, H. (2016). Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification. Neurocomputing, 174, 806–814.
51. Wei, T., Lu, Y., Chang, H., Zhou, Q., & Bao, X. (2015). A semantic approach for text clustering using WordNet and lexical chains. Expert Systems with Applications, 42(4), 2264–2275. Wood, T., & Porter, E. (2018). The elusive backfire effect: Mass attitudes' steadfast factual adherence. Political Behavior, 1–29.
52. Wu, H. C., Luk, R. W. P., Wong, K. F., & Kwok, K. L. (2008). Interpreting TF-IDF term weights as making relevance decisions. ACM Transactions on Information Systems (TOIS), 26(3), 13.
53. Xavier, C. C., & de Lima, V. L. S. (2014). Boosting open information extraction with noun-based relations. In Proceedings of the LREC (pp. 96–100).
54. Xu, J., & Taft, M. (2015). The effects of semantic transparency and base frequency on recognizing English complex words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 41(3), 904.
55. Yang, X.-F., & Siu, W.-C. (2017). Vehicle detection under tough conditions using prioritised feature extraction with shadow recognition. In Proceedings of the 2017 Twenty-Second International Conference on Digital Signal Processing (DSP) (pp. 1–5). IEEE.
56. Deshai N, Sekhar BVD, VenkataRamana S, Srinivas K & Varma GPS. Big data Hadoop MapReduce job scheduling: A short survey. Advances in Intelligent Systems and Computing, vol. 862. Springer, Singapore, 2019, 349–365. Available from: 10.1007/978-981-13-3329-3_34.
57. Deshai N, Sekhar BVDS, Venkataramana S, Chakravarthy VVSSS & Chowdary PSR. Study with comparing big data handling techniques using Apache Hadoop MapReduce vs Apache Spark. Int J Eng Technol, 7(4) (2018) 4839–4843. Available from: 10.14419/ijet.v7i4.1.15997.
58. Mahesh G, Shankar Reddy S, Maheswara Rao VV, Silpa N. Preeminent Sign Language System by Employing Mining Techniques. In International Conference on IoT Based Control Networks and Intelligent Systems, 2023 Jun 21 (pp. 571–588). Singapore: Springer Nature Singapore.
59. Deshai N, Venkataramana S & PardhaSaradhiVarma G. Performance and cost evolution of dynamic increase Hadoop workloads of various datacenters. Smart Innovation, Systems and Technologies, 105 (2019) 505–516. Available from: 10.1007/978-981-13-1927-3_54.
60. Deshai N, SaradhiVarma GP & Venkataramana S. A study on analytical framework to breakdown conditions among data quality measurement. International Conference on Innovative Research in Science and Technology, 7, 2018. Available from: 10.14419/ijet.v7i1.1.9276.

Note: All the figures and tables in this chapter were designed by the author.

Summarization of Legal Texts by Using Deep Learning Approaches 46

Nilambar Sethi1
Department of Computer Science and Engineering,
GIET University, Gunupur, Odisha, India
V. Sivarama Raju Vetukuri2, R. Shiva Shankar3
Department of Computer Science and Engineering, SRKR Engineering College,
Bhimavaram, Andhra Pradesh, India
R. Rajender4
Department of Computer Science and Engineering,
LENDI Institute of Engineering and Technology, Vizianagaram, Andhra Pradesh, India

Abstract: The exponential rise of internet textual data has necessitated sophisticated tools that automatically summarise material while keeping key information. Text summarization is essential in today's age of massive data sets to extract relevant material and display accurate, intelligible information. Over time, several approaches have been devised to summarise material. Conventional approaches, which extract terms from text, build redundant summaries and ignore document-summary relationships. Text summarization is an integral part of Natural Language Processing that helps people comprehend text. AI uses natural language processing to find important information quickly while preserving context, and deep learning is used to extract key phrases and summarise them. In this article, we discuss the techniques of extractive and abstractive text summarization using TextRank and an Encoder-Decoder LSTM. These techniques are beneficial because they do not require pre-defined features or domain-specific knowledge and can be applied across different domains. Moreover, to overcome the lack of labeled data, we evaluate and score the training set phrases by comparing them to human reference descriptions. Our experimental assessments demonstrate the efficacy of the suggested methodologies compared to other baselines.
Keywords: Extractive text summarisation (ETS), Abstractive summarization (AS), Natural language processing (NLP), Deep
learning (DL), Long short-term memory (LSTM)

1. Introduction

Over the last several years, articles and links have multiplied, making it harder to search for helpful information and display it. As data grows, semantic density increases; therefore, the need arises to quickly distinguish the most significant items. A summary helps determine whether an article's condensed text is relevant. Text summarization involves selecting a portion of a material to represent it: the summary reduces textual material to a shorter version that retains all relevant and vital information about the content. Some information loss occurs due to this compression. Medical records, meteorological data, news summaries, etc., are successfully summarised using text summarization [1]. Here are the broad types of text summarisation:
1. Extractive text summarisation: Document items are extracted without alteration.

1nilambar@giet.edu, 2sivaramaraju.vetukuri@gmail.com, 3shiva.csesrkr@gmail.com, 4rajender.renuguntta@gmail.com

DOI: 10.1201/9781003529231-46

2. Abstractive summarisation: Unlike extractive summarisation, abstractive summarisation modifies, rephrases, or utilizes outside words to construct a more sophisticated summary.

Textual data, such as online publications, articles, news, and reviews, contains extensive content that requires summarization [2]. The importance of text summarisation lies in its ability to retrieve critical information quickly, load it quickly, and address problems related to summary evaluation criteria [3]. As automatic text summarisation technologies have advanced and yielded considerable results in several languages, they require evaluation and review. This study examines contemporary methodologies, focusing on their methods, datasets, assessment metrics, problems, and approaches to addressing those problems [4]. Text summarisation may be categorized by function, genre, context, summariser type, and document count [5]. One classification organizes the process into ETS and AS categories [6].

Fig. 46.1 Flow of abstractive summarization

AS modifies the original text, creating new concepts, which makes it difficult for computers. For abstractive text summarisation (ATS), sophisticated machine learning and NLP algorithms are necessary to interpret the material and provide a summary. Abstractive summarization is more challenging than extractive summarization since it involves real-world knowledge and semantic analysis [7]. AS is considered superior to extractive summarization since it approximates human-generated summaries, which increases its significance [8]. For both types, an effective outline requires maintaining the order of main ideas and concepts, minimizing repetition, ensuring consistency and coherence, and retaining meaning even in lengthy sentences. The resulting summary should be concise and communicate the key information from the original text [9].

Fig. 46.2 Overview of semantic-based approach

Structured and semantic approaches are among the ATS methods available. While the former focuses on encoding essential document elements, the latter emphasizes text semantics and uses information representation to summarise the text. The flow of ATS is shown in Fig. 46.1, and Fig. 46.2 shows the flow of the semantic-based multimodal, information item, and semantic graph methodologies [10].
To differentiate between summaries of single documents, summaries of multiple documents, and summaries of interconnected documents, the "span" parameter is utilized. There is also the possibility of multilingual support in summarization systems. In terms of output parameters, an indicative summary keeps only the primary concept of the text, while an informative overview contains all significant subjects or information while keeping the number of words used to a minimum. The audience parameter decides whether an outline is generic or query-based, with the latter using user queries to summarise relevant material [11-12].
To demonstrate, ETS integrates key lines from the text using extracted characteristics (statistical or linguistic) without altering the content. Extraction techniques are more straightforward to build, but their summaries are less intelligible, lack coverage and coherence, and have a higher likelihood of redundancy. Abstractive summarization uses linguistic characteristics to construct cohesive and grammatically accurate phrases from the retrieved content. Linguistic approaches provide more human-like and concise summaries but are more challenging to implement. As a result, researchers have given priority to techniques that use extractive summarization [13-14].
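As a concrete illustration of the extractive route, the sketch below implements a TextRank-style ranker: sentences become TF-IDF vectors, cosine similarities form a graph, and PageRank scores pick the top sentences. It is a generic sketch of the technique named in the abstract, not the chapter's exact model, and the sample text is invented.

import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def textrank_summary(sentences, n=2):
    """Return the n highest-ranked sentences in original order."""
    tfidf = TfidfVectorizer().fit_transform(sentences)
    sim = cosine_similarity(tfidf)          # sentence-similarity graph
    graph = nx.from_numpy_array(sim)
    scores = nx.pagerank(graph)             # TextRank = PageRank on it
    top = sorted(scores, key=scores.get, reverse=True)[:n]
    return [sentences[i] for i in sorted(top)]

doc = [
    "The court examined the appeal filed by the petitioner.",
    "Lengthy legal judgments are hard to read in full.",
    "Automatic summarization extracts the key sentences.",
    "The weather that day was unremarkable.",
]
print(textrank_summary(doc, n=2))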
2. Related Work

Sobh et al. [15] analyzed sentence and paragraph length, cosine similarity values, and POS-based characteristics such as infinitives, verbs, recognized words, and digit presence. Phrase length, word weight, and similarity totals were crucial. Schlesinger et al. [16] processed Arabic texts using a rule-based sentence splitter, which applies specific rules to identify sentences within a text block. Six-gram tokenization is a method of breaking down a sentence or phrase into individual components called tokens, where each token consists of six consecutive words; this approach is helpful for various NLP tasks, such as language modeling or sentiment analysis. The writers noted the insufficient resources for completing these tasks. Building on Douzidia and Lapalme's [17] successful assessment, the writers grade sentences using actual Arabic texts. After extracting top-ranked phrases, the algorithm substituted Arabic sentences with machine-translated (MT) ones. School evaluation is the primary approach to measuring student learning.
Summarization of Legal Texts by Using Deep Learning Approaches 301

One major exam is essay writing. Manual assessment is used information in digital documents. They addressed automated
now. An automatic check is used for this project. The essay text summarising research approaches. A graph-based
is poor if the similarity is under 20%. Other methods include report system by Kavita Ganesan et al. [31] creates concise
segmentation, stop words, word frequency computation, theoretic summaries of excessively repetitious viewpoints.
numerical attributes, sentence validation, and spell check. The network organization is distinct, with nodes representing
Assessment tools exist, but our process is faster and more word units and directed edges indicating sentence structure.
accurate [18]. Students will provide comments online using Nodes have positional data. N.Moratanch and S.Chitrakala
a standard form. The suggested approach prioritizes security [32] suggested text summarising extracts of short
by allowing only legitimate users to see and understand the information from the web, Twitter, etc. Text summarisation
cumulative input and opinions of a batch of pupils [19]. might be extractive or abstractive. Different methods have
They organized the data into clusters and provided just the been used to evaluate extractive summarisation issues.
helpful information for this resource to be valuable. For this ETS uses supervised and unsupervised learning to shorten
purpose, they filtered the data using a density-based k-means essential phrases and paragraphs based on word and sentence
technique and a graph clustering algorithm. Following the properties. Abstractive text summarisation [33] is coherent,
application of the filter, the data is based on the occurrence of less hesitant, and information-rich. The two main abstractive
keywords, the relevance of key phrases, and, ultimately, the strategies are organized and segmented. The structured
degree to which key terms appear similarly across the dataset method leverages previous knowledge, while the semantic
[20]. approach uses NLP. The advanced abstractive model
Maaloul et al. [21] found 19 rhetorical linkages in a corpus- summarises by comprehending the source text. Abstractive
based analysis, with some similarities to Mathkour et al. summarization is clear and linguistically accurate.
[22]. Secondly, they began with the nine best summary Tooba Siddiqui and Jawwad Ahmed Shamsi [34] said that
connections. Al-Thanyyan and Azmi [23] suggested hybrid abstractive approaches are better than extractive methods
two-pass summarisation. A primary summary is created since they mimic human summarization. AS often generates
from the RS tree’s early levels, while a shorter summary is new sentences using neural networks. The repeating words
produced in the second pass. According to the researchers, issue was solved via temporal attention. The headline-
the two-pass summariser improves RST. This method extracts generating algorithm was trained and evaluated. They used
relevant sentences by utilizing Arabic NLP techniques. The global and temporal attention models. According to A, the
discriminant analysis commences with mRMR [24], and attention technique gives the decoder immediate access to
based on mRMR scores, it ranks the sentences according to the input sequence instead of a fixed-length context vector.
the strength of their discriminant words. P Patil et al., automatic Summary condenses material into
Data feature selection and analysis can reduce staff turnover a concise form [35]. Finding a practical approach is the
[25]. The grading process considers the essay’s average central issue. This study presents our sentimental analysis
length, structure, organization, bag of words, sentence, and methodology for analyzing student comments utilizing
word count. Sequential forward feature selection is used to LSTM [36]. Single or several papers may be summarised.
select the best candidate subset [26]. In addition, this study Redundancies and unnecessary data may be deleted, saving
employs Machine Learning to predict the stock performance users time. Extractive and abstractive methods were utilized
of equities traded on a financial transaction involving another to summarise and enhance the material. Text Rank algorithm
economic system [27]. Lastly, the grading process considers ranks sentences. The text’s most crucial sentences are
the essay’s average length, structure, organization, bag of selected—lexical database abstracted Summary. Word ranks
words, sentence count, and word count and then selects are awarded to words, not text. Preprocessors break text
into sentences and remove stop words and stems. Important
Chandhana Surabhi. M [29] said NLP makes machines act phrases are picked in the extraction summary, and a few words
like humans. It facilitates human-machine communication. are substituted with incorrect synonyms during abstraction.
NLP has many daily uses. NLP processes vast texts. The algorithmic nature of this strategy is a benefit [37].
Category classification, indexing and searching huge texts, Shuai Wang et al. [38] explained that extended text requires
machine translation, and information extraction are required. two stages. Phases are extraction and abstraction. Sentence
A language comprehension software must grasp language extraction employs a graph model to extract essential
structure, including words and how they form phrases and sentences. Summaries are generated using recurrent neural
sentences. It should understand sentence meaning and network LSTM encoder-decoder at the abstraction phase.
context. The software must understand human thought Top-ranked sentences are combined to produce a summary.
and the world. User-friendly NLP systems are the future. Abstractive text summarisation uses Seq2Seq.
Vipul Dalal and Latesh Malik [30] suggested discovering
302 Algorithms in Advanced Artificial Intelligence

Jaradat [39] combines HS with summarising to deliver the For this project, we are using the Encoder-Decoder model.
nearest approximation to the optimal document summary The encoder-decoder approach helps with sequence-to­
using the E ASC corpus and the ROUGE toolbox to analyze sequence challenges like text summarisation. It incorporates
the method. The findings demonstrated that the proposed a total of 2 RNNs. The first performs the role of encoder,
method performed much better than other contemporary while the second decodes information. We use Encoder-
alternatives. Jaradat and Al-Taani [40] developed a hybrid Decoder LSTM to predict variable-length output for variable-
single-document extractive Arabic text summary using length inputs. The encoder is stacked LSTM. Stacked LSTMs
evolutionary algorithms. Al-Zahrani et al. [41] found that use output sequences as input to the next LSTM. Due to poor
PSO outperformed several Arabic summarization methods. performance with extended input or output sequences, this
The particle selection operator (PSO) selects the particle with Encoder-Decoder model uses the Attention Layer. Using
the best combination of eight structural attributes required the Attention layer, we focus on essential data and ignore
by Arab summarizers. This is achieved by training on data unnecessary data. Our model uses early stopping to halt
from the Essex Arabic summaries corpus. Each PSO iteration training when validation loss increases.
scores and ranks input text sentences based on selected
characteristics and weights, producing an output summary 3.1 Long Short Term Memory (LSTM)
of the top-ranked phrases. The experiments show that Arabs LSTMs [24] are a type of RNN that can learn how things
synthesize texts by concentrating on the opening phrase of depend on each other over time. Only current information
each paragraph. Mahjoub [42] suggested using the PSO is necessary when predicting the final word in a sentence.
algorithm to automatically derive summaries from Arabic In circumstances where there is not much of a time lag
single texts. The proposed method for obtaining Arabic single between the relevant information and the location at which
document summaries worked in experiments. it is required, RNN can learn how to utilize the historical
data without encountering the issues outlined before. Extra
3. Methodology context may be necessary when essential information is
distant. In cases like this, LSTM networks are adequate.
Hierarchical learning, also known as deep learning, involves
using a neural network with multiple layers for processing
data, including linear and non-linear transformations that
identify high-level data abstractions. An observation could
be described as a collection of edges, regions of a specific
geometry, or a vector of intensity values for each pixel.
Learning may be made more straightforward using certain
representations. Deep learning can supplant traditionally
hand-crafted features with more time- and cost-effective
unsupervised, semi-supervised, or fully-supervised feature
learning and hierarchical feature extraction techniques. This
research aims to enhance representations and develop models
that can effectively learn from vast amounts of unlabeled Fig. 46.3 System flow for LSTM
data. Neural networks receive data as input to recognize
The structure of the LSTM is analogous to that of a chain,
patterns and predict output.
and each iteration of the repeating module comprises four
Three layers make up the network. Text, pictures, and audio interacting modules. The four components of this system are
are submitted to the submitted Layer. Numerical values are the cell state, the output gate, the update gate, and the forget
connected to the Neurons that make up the Hidden Layer. gate. Within the LSTM repeating module, the horizontal
These are the premises upon which the calculations are running layer is used to indicate the current state of the
carried out. The output is then sent to the Output Layer as cell. This Layer acts as a conveyor belt and interacts very
the last step. A Recurrent Neural Network (RNN) gets its little with the other levels; it may be seen in Fig. 46.3. The
current state input from its previous state output. This is following is an explanation of the fundamental structure of
applied in sequence-based state prediction issues. It features an LSTM module:
a Hidden Layer that can remember the order of things. Forget Gate: It decides which information from the cell state
RNN has a drawback: Vanishing gradient descent. LSTM is should be deleted or forgotten. It does this by analyzing all
implemented so that this problem may be solved. LSTM is
of the data. It makes use of a sigmoid activation function
Long Short-Term Memory. RNNs like LSTM overcome the
and accepts as input both the currently provided input (x_t)
vanishing gradient issue. It contains gates, which allow it to
and the previously used hidden state (h_(t-1)). The output of
circumvent this issue.
Summarization of Legal Texts by Using Deep Learning Approaches 303

the forget gate determines which aspects of the cell state are Symptoms of Malaria usually appear about 10 to 15 days
forgotten and which are remembered. It also specifies which after an infected mosquito bites a person.
elements of the cell state should be kept. The tokenization of sentence: The sentences must be
Input Gate: It is in charge of determining which pieces of tokenized following the provision of text as input. The process
information from the currently active time step should be of separating (or dividing) a text into a list of tokens is known
added to the state of the cell. A sigmoid activation function as tokenization. When referring to a paragraph, sentences
is used. It accepts as input the presently available input data are regarded as the tokens, but words are considered the
(x_t) as well as the prior concealed state (h_(t-1)), and the tokens when referring to a phrase. We are using sentence
information that should be saved is determined based on the tokenization in this process.
outcome of this process. After tokenization: [“mosquito-borne” illness malaria affects
Output Gate: It is responsible for determining which aspects people and animals. ‘Fever, fatigue, vomiting, and headaches
of the current cell state should be used to compute the hidden are classic malaria symptoms. Yellow skin, convulsions,
state for the next time step. It uses a sigmoid activation coma, and death may result. After a mosquito bite, symptoms
function along the same lines as the input and forget gates. It appear 10–15 days later.’]
considers the most recent hidden state and information (x_t Removal of stop words: Here, the removal of stop words
and h_(t-1), respectively). A value is generated by the output is removing commonly used words. NLTK includes a list of
gate, which feeds into the hidden state’s computation. stop words in its corpus (collection of different language data
sets). We remove stopwords from our text.
3.2 System Model for Extractive Summerization
Before removing stop words: [‘Malaria is an infectious disease
Extractive summarisation is a valuable tool for quickly
that spreads through mosquitoes and can infect humans and
digesting large volumes of text. It can be helpful when
other animals. The common symptoms of Malaria include
maintaining the original wording and context, which is
fever, fatigue, vomiting, and headaches. In severe cases, it
essential, such as in legal documents or scientific papers.
can cause yellow skin, seizures, coma, and even death. The
However, it may not provide summaries as fluent or concise
symptoms typically appear a few days after being bitten by
as those generated by abstractive summarisation, which can
an infected mosquito.’]
rewrite and rephrase content. The entire flow is shown in
Fig. 46.4. After removing stop words: [‘Malaria infectious disease
spreads mosquitoes infect humans animals, common
Preprocess the symptoms malaria include fever, fatigue, vomiting,
Input Text
Text headaches, severe cases, cause yellow skin, seizures, coma,
death symptoms typically appear days bitten infected
mosquito’]
Lemmatization Lemmatization of sentences: Reducing a word to its
simplest possible form is known as lemmatization. It
analyses the situation and then reduces the term to its most
fundamental and relevant form. Lemmatization with a pos
Cosine Similarity tag is used. Pos means parts of speech. We are considering
four parts of speech: noun, Verb, Adjective, and Adverb. POS
tags are given to words, and we lemmatize the word using it.
Print Calculate Text Finally, the lemmatized sentences are obtained.
Summary Rank ‘caring’----->lemmatization----->’care’

Fig. 46.4 System model for extractive summarisation


Before lemmatization: [‘malaria infectious illness transmitted
mosquitoes. Symptoms typically appear a few days after
Input text: We give text (or) paragraphs of sentences as being bitten by an infected mosquito. Cause skin, seizures,
input—import required modules such as pandas, numpy, nltk. coma, even death]
Here, we consider the following text as input. After lemmatization: [‘Malaria mosquito-borne infectious
Text = Malaria is a disease transmitted by “mosquitoes” and disease affects humans animal,’ ‘malaria cause symptom
affects both humans and animals. Symptoms include fever, typically include fever tiredness vomit headache,’ ‘severe
fatigue, vomiting, and headaches, while severe cases may case cause yellow skin seizure coma death,’ ‘symptom
cause yellow skin, convulsions, coma, and even death. usually begin day bitten infect mosquito’]
304 Algorithms in Advanced Artificial Intelligence

[Malaria is an infectious illness transmitted by mosquitoes


Data
that harm humans and animals. Symptoms typically include Load the Dataset
Preprocessing
fever, exhaustion, vomiting, and headaches. In severe cases, it
can cause yellow skin, seizures, coma, and death. Symptoms
usually begin within a day of being bitten by an infected Splitting the
mosquito] Data

Calculate cosine similarity: We calculated the similarity


between sentences using Cosine similarity. We use term Prepare
frequency to calculate cosine similarity. Tokenizer

array([ [1. , 0.11785113, 0. , 0.13363062],


[0.11785113, 1. , 0.11785113, 0.12598816], Print Summary Build Model
[0. , 0.11785113 , 1. , 0. ]
[0.13363062, 0.12598816, 0. , 1. ] ]) Fig. 46.5 System model for abstractive summarization
Calculate TextRank for sentences: TextRank is similar to
Loading dataset: We have to load the dataset of the CSV
PageRank. PageRank calculates the rank for webpages, and
file by read_csv function—import modules like numpy and
TextRank calculates the rank for text, i.e., sentences. We use
Pandas. We take a dataset that is in CSV format. It consists
the damping factor to calculate the rank for the sentence. We
of two columns: text and Summary. The text column is the
obtain scores for every sentence using TextRank.
original review text, and the summary column is the original
{0: 1.705070850611419, Summary of the review.
1: 1.9872043089314073, Data exploration: After loading the dataset, we must explore
2: 1.2668524673255812, the data by commands like head(), tail(), shape, and describe.
These commands are used to analyze the data. Head() presents
3: 1.7308688729793822}
the first n rows shown in the dataset. The default number of
Sort the sentences: They are sorted in descending order rows is 5. The tail() function will only show the most recent n
based on their scores. rows in a dataset. The default setting displays the most recent
[(1.9872043089314073, The symptoms of Malaria 5 rows. The shape represents the dimensions of the dataset.
include fever, fatigue, vomiting, and headaches.’), It provides the number of rows that are in the dataset as well
(1.7308688729793822, ‘Symptoms usually begin 10 - as the number of columns. Describe the statistical details of
15 days after being bitten by an infected mosquito.’), the dataset.
(1.705070850611419, ‘Malaria is a disease transmitted by Data preprocessing: Before we go on to the modeling
mosquitoes that poses a threat to both humans and animals.’), portion, it is critical that we first complete the fundamental
(1.2668524673255812, ‘In severe cases, it can cause yellow preprocessing stages. Here, we do preprocessing steps like
skin, seizures, coma & death.’)] removing unnecessary characters, removing stopwords, and
Output summary: Print the top most important sentences mapping contraction. In contraction mapping, we map the
in a paragraph. Here, we are printing the two most important words into their complete form. For example,
sentences in our text. Wasn’t ⇒ was not
Output: Malaria causes symptoms that typically include Isn’t ⇒ is not
fever, tiredness, vomiting, and headaches. Infected mosquito
In such a way, we do contraction mapping for the words in our
bites cause symptoms 10–15 days later.
sentence. We removed stopwords and unnecessary characters
3.3 System Model for Abstractive like punctuation marks from our text.
Summarization Splitting the data: They split the data into training and
validation data using the train_test_split function from the
Abstractive summarization uses natural language production
sklearn package. We specify the test set size as 0.2, i.e., 20%
to understand and paraphrase a larger text to express its
of the dataset is considered as test set. The text and summary
primary concepts. Abstractive summarising may rewrite and
arrays are given as input.
reword material, making it more human-like and coherent
than extractive summarisation, which chooses and extracts X _ t r, x _ t e s t , y _ t r, y _ t e s t = t r a i n _ t e s t _ s p l i t ( n p .
phrases. The entire flow is shown in Fig. 46.5. array(data[‘Text’]),np.array(data[‘Summary’]),test_
size=0.2,random_state=0,shuffle=True)
Summarization of Legal Texts by Using Deep Learning Approaches 305

Importing required modules for the model: After splitting is relatively low, both in terms of assessing and producing
data, we have to import the modules needed for building them. Compared to the outline form, summaries might be
the model, such as Embedding, TimeDistributed, Model, challenging to discern the substance of the system.
and EarlyStopping. They are low-dimensional and help to (i) Human Evaluation
represent data in a more meaningful way.
Because of the inherent subjectivity of human judgment when
Tokeniser: We use Keras Tokenizer for our model. This determining what constitutes a “good” outline, developing an
Tokenizer turns each text into an integer sequence or a vector. automated system for conducting analysis is an exceptionally
Building the model: We use Encoder-Decoder LSTM, which challenging endeavor. This means that manual analysis
supports variable length input sequences and predicts variable involves a lot of work. Coherence and coverage are two
length output sequences. It is mainly used for sequence-to­ further concerns that need to be addressed.
sequence models. Natural Language Processing utilizes a (ii) Recall-Oriented Understanding for Gisting Evaluation
specific architecture, which involves encoding data using (ROUGE)
Stacked LSTM as the first step. The output of one LSTM is
then fed into the next LSTM in the stack, with three LSTMs The value of Count (N-gram) represents the total number of
used to encode the data. We define the model and compile N-grams included in the reference summary.
it. Then, we apply EarlyStopping to our model, which stops ROUGE ­ N
the training of a dataset at a point to overcome the problem
of overfitting. Next, we fit the model and evaluate it. At last,
ÂS Œ reference_summaries N ­gramsCount match (N ­ gram)
=
we decode the model using embeddings and hidden states. ÂS Œ reference_summaries N ­gramsCount(N ­ gram)
The decoder reads through the vector and translates it to an
output sequence. Where n is the total length of the N-gram sequence used in
Output summary: Print the outline for the test data. We the algorithm. The maximum number of n-grams that appear
can see the original Summary and predicted Summary in the simultaneously in both a candidate summary and a set of
output. reference summaries is known as the co-occurrence limit.

(iii) Recall R =
|Sref « Scand |
4. Results |Sref |
Our qualitative analysis reveals that key phrases are Where Sref ∩ Scand Represents the number of sentences that
picked during extraction, and certain words are substituted are included in the candidate and reference summaries.
with suitable synonyms during abstraction. However, the
abstraction phase is now replaced sparsely. The present (iv) Precision ( P ) P =
|Sref « Scand |
system effectively analyses news items, technical papers, |Scand |
and encyclopedia entries. The extraction may not provide 2(Precision)(Recall)
comprehensible summaries in essays, novels, and publications (v) F ­ measure F =
Precision + Recall
with much direct speech. We analyzed the performance of our
system against specific reference summaries. News stories (vi) Compression Ratio Cr = Slen ⋅ Tlen
were evaluated, with several given below and their findings. Where, Slen and Tlen are the length of summaries
The average accuracy, recall, and fscores from the ROUGE Finally, the text rank score was evaluated for the given
assessment measure are shown. When an algorithm has high sentences; the sentence repeated more times was estimated
accuracy, it means that it returns more relevant results than it as a word cloud of top-ranked sentences, a scatter plot of
did irrelevant ones. When it has a high recall, it means that it sentence length vs. Text Rank Score was also obtained.
recovered the majority of the relevant results. Sentence Similarity Heatmap visualization was drawn and
Various benchmarking datasets [1] are used for experimental finalized. The top 2 sentences by Text Rank Score were
assessment of extractive summarisation. The most popular obtained for ETS, and the graphs are shown in Fig. 46.6. A
kind of benchmarking is Document Understanding graph was drawn for Loss Vs Val_Loss for AS. The metrics
Conferences (DUC) datasets used for text summarising. It obtained were also shown in Fig. 46.7 as Graphs.
includes both the original materials and summaries of those
documents produced electronically, by hand, and by user 5. Discussion
submissions are the summaries [20]. Based on the literature
reviewed in the articles, it has been discovered that human In this part, the outputs of the suggested model are discussed,
beings tend to agree on things. The value of summarisers and an investigation into the possible explanations for why
306 Algorithms in Advanced Artificial Intelligence

Fig. 46.6 Graphs obtained for extractive summarization

Fig. 46.7 Graphs for abstractive summarization


Summarization of Legal Texts by Using Deep Learning Approaches 307

Fig. 46.8 Screen short for test case: original data with extractive summary

Fig. 46.9 Screen short for test case: original data with abstractive summary
308 Algorithms in Advanced Artificial Intelligence

some findings are presented as they are is carried out. The 6. Conclusion
data are shown in the order obtained from the model. In
terms of the overall score, the findings of the examination In this analysis, many mechanisms of the process of
were quite encouraging. The performance of our system extractive text summarisation have been shown. The
was examined in terms of its ability to summarise single extractive summarising method is very coherent, has fewer
documents as well as several documents at once. Regarding unnecessary steps, and helps create a cohesive whole rich
recollection, our strategy is superior to the strong baseline in Summary and information. The objective is to provide
of lead sentences for single-document summary tasks. As an in-depth analysis and compare the various strategies and
LSTMs are effective memory models, their usage in this procedures used in the extractive text summarising process.
strategy might be beneficial in classifying sentences into The study has not fully addressed the issues of extractive
summaries by incorporating contributions from prior phrases. text summarisation, including time and space constraints.
The evaluation process for created summaries may be In extractive text summarisation, we generated a summary
broadened and include human grading. Contributions are that contains the most important sentences in a text. The
appreciated and accepted in data set gathering and creation. summarization is done by using the TextRank algorithm. In
Note that we used the ROUGE methodology to compare this work, we designed an Encoder-Decoder LSTM model to
our summaries with those humans who wrote them after convert text into a summary. We took a dataset and generated
completing an automatic review procedure. ROUGE (Recall- a summary of the dataset’s test data. We got good results for
Oriented Understudy for Gisting) is a metric that mechanically our model, which we can see in the graph presented in the
assesses text summarization by comparing it to human- Result Analysis. We conclude that our model can be used for
created gold standards. This metric counts overlapping units generating a summary. This model has future scope, and we
between the produced and ideal summaries to evaluate the can improve it further. The model can be improved by adding
summarization quality. more LSTM layers in the stacked LSTM and can implement
packages and concepts that can be built in the future in deep
The application uses ROUGE-N (containing ROUGE1 learning.
and ROUGE2) and ROUGE-L text summarizing methods
from several research. ROUGE-L determines the longest
common substring, whereas ROUGE-N measures n-gram References
recall. However, ROUGE1 and ROUGE2 assess unigram 1. Y. Zhang, J. Liao, J. Tang, W. Xiao, Y. Wang, Extractive
and bigram recall. In the program’s manual, ‘S’ indicates the document summarisation based on hierarchical gru, in: 2018
reference summary and ‘n’ the n-gram length and refers to International Conference on Robots & Intelligent System
the maximum number of word matches between reference (ICRIS), IEEE, pp. 341–346
and produced summaries using n-gram analysis. refers to the 2. M. Allahyari, S. Pouriyeh, M. Assefi et al., “Text summarisation
total number of n-gram words in the reference summary. Four techniques: a brief survey,” International Journal of Advanced
Rouge measurements are listed: Computer Science and Applications, vol. 8, no. 10, 2017.
3. A. B. Al-Saleh and M. E. B. Menai, “Automatic Arabic text
1. ROUGE-N: Statistics on N-gram co-occurrence, summarisation: a survey,” Artificial Intelligence Review, vol.
produced candidate summary, and referenced 45, no. 2, pp. 203–234, 2016.
summaries are compared using n-gram recall. 4. A. Turpin, Y. Tsegay, D. Hawking, and H. E. Williams, “Fast
2. ROUGE-L: The maximum length of the longest generation of result snippets in web search,” in Proceedings
common subsequence between sequences P and Q. of the 30th Annual international ACM SIGIR Conference
on Research and Development in information Retrieval­
3. ROUGE-W: Weighted Longest Common Subsequence
SIGIR’07, p. 127, Amsterdam, *e Netherlands, 2007.
takes match lengths into account to improve the 5. Q. A. Al-Radaideh and D. Q. Bataineh, “A hybrid approach
accuracy of ROUGE-L. for Arabic text summarisation using domain knowledge and
4. ROUGE-S: Skipping Bigram Co-Occurrence Statistics genetic algorithms,” Cognitive Computation, vol. 10, no. 4,
Measures produced-referenced summaries overlap. pp. 651–669, 2018.
6. C. Sunitha, A. Jaya, and A. Ganesh, “A study on abstractive
We have done the process for ETS and AS by using the
summarisation techniques in Indian languages,” Procedia
paragraphs. We have shown a test case by taking a paragraph Computer Science, vol. 87, pp. 25–31, 2016.
containing words 138, sentence 11, and several characters 7. D. R. Radev, E. Hovy, and K. McKeown, “Introduction to the
627. This entire process is shown on Figs 46.6–46.9. special issue on summarisation,” Computational Linguistics,
vol. 28, no. 4, pp. 399–408, 2002.
Summarization of Legal Texts by Using Deep Learning Approaches 309

8. A. Khan and N. Salim, “A review on abstractive summarisation 23. Azmi, A.M., Al-Thanyyan, S., 2012. A text summariser for
methods,” Journal of @eoretical and Applied Information Arabic.Comput. Speech Lang. 26 (4), 260–273.
Technology, vol. 59, no. 1, pp. 64–72, 2014. 24. Peng, H., Long, F., Ding, C., 2005. Feature selection based
9. N. Moratanch and S. Chitrakala, “A survey on abstractive on mutual information criteria of max-dependency, max-
text summarisation,” in Proceedings of the 2016 International relevance, and minredundancy. IEEE Trans. Pattern Anal.
Conference on Circuit, Power and Computing Technologies Mach. Intell. 27 (8), 1226–1238.
(ICCPCT), pp. 1–7, Nagercoil, India, 2016. 25. Shankar RS, Rajanikanth J, Sivaramaraju VV, Murthy KV.
10. S. Shimpikar and S. Govilkar, “A survey of text summarisation Prediction of employee attrition using datamining. In2018 ieee
techniques for Indian regional languages,” International international conference on system, computation, automation
Journal of Computer Applications, vol. 165, no. 11, pp. 29–33, and networking (icscan) 2018 Jul 6 (pp. 1–8). IEEE.
2017. 26. Shiva Shankar R, Ravibabu D. Digital Report Grading Using
11. N. R. Kasture, N. Yargal, N. N. Singh, N. Kulkarni, and NLP Feature Selection. InSoft Computing in Data Analytics:
V. Mathur, “A survey on methods of abstractive text Proceedings of International Conference on SCDA 2018 2019
summarisation,” International Journal for Research in (pp. 615-623). Springer Singapore.
Emerging Science andTechnology, vol. 1, no. 6, p. 5, 2014. 27. Jyothirmayee S, Kumar VD, Rao CS, Shankar RS. Predicting
12. P. Kartheek Rachabathuni, “A survey on abstractive stock exchange using supervised learning algorithms.
summarisation techniques,” in Proceedings of the 2017 International Journal of Innovative Technology and Exploring
International Conference on Inventive Computing and Engineering. 2019; 9(1): 4081–90.
Informatics (ICICI), pp. 762–765, Coimbatore, 2017. 28. Kameswari KK, Raghaveni J, Shankar RS, Rao CS. Predicting
13. S. Yeasmin, P. B. Tumpa, A. M. Nitu, E. Ali, and M. I. Election Results using NLTK. International Journal of
Afjal, “Study of abstractive text summarisation techniques,” Innovative Technology and Exploring Engineering. 2019; 9:
American Journal of Engineering Research, vol. 8, 2017. 4519–29.
14. A. Khan, N. Salim, H. Farman et al., “Abstractive text 29. Surabhi MC. Natural language processing future. In2013
summarisation based on improved semantic graph approach,” International conference on optical imaging sensor and
International Journal of Parallel Programming, vol. 46, no. 5, security (ICOSS) 2013 Jul 2 (pp. 1–3). IEEE.
pp. 992–1016, 2018. 30. Dalal V, Malik L. A survey of extractive and abstractive
15. Sobh, I., Darwish, N., Fayek, M. (2006). An optimized text summarisation techniques. In2013 6th international
dual classification system for Arabic extractive generic text conference on emerging trends in engineering and technology
summarisation. Proceedings of the 7th Conf. on Language 2013 Dec 16 (pp. 109–110). IEEE.
English, ESLEC, 149–154. 31. Ganesan K, Zhai C, Han J. Opinosis: A graph based approach
16. Schlesinger, J.D., O’Leary, D.P., Conroy, J.M., 2008. Arabic/ to abstractive summarisation of highly redundant opinions.
English multi-document summarisation with CLASSY—the InProceedings of the 23rd international conference on
past and the future. In: Gelbukh, A. (Ed.), Computational computational linguistics (Coling 2010) 2010 Aug (pp. 340–
Linguistics and Intelligent Text Processing. Springer, Berlin 348).
Heidelberg, pp. 568–581. 32. Moratanch N, Chitrakala S. A survey on extractive text
17. Douzidia, F.S., Lapalme, G., 2004. Lakhas, an Arabic summarisation. In2017 international conference on computer,
summarisation system. Proceedings of the Document communication and signal processing (ICCCSP) 2017 Jan 10
Understanding conference (DUC2004). (pp. 1–6). IEEE.
18. Shankar RS, Babu DR, Murthy KV, Gupta V. An approach 33. Moratanch N, Chitrakala S. A survey on abstractive text
for essay evaluation using system tools. In2017 International summarisation. In2016 International Conference on Circuit,
Conference on Innovative Research In Electrical Sciences power and computing technologies (ICCPCT) 2016 Mar 18
(IICIRES) 2017 Jun 16 (pp. 1–9). IEEE. (pp. 1–7). IEEE.
19. Shankar RS, Srinivas LV, Ravibabu D, Raminaidu C. Novice 34. Siddiqui T, Shamsi JA. Generating abstractive summaries using
Retroaction Report. ARPN Journal of Engineering and sequence to sequence attention model. In2018 International
Applied Sciences. 2018;13 (24): PP 9746–9753. Conference on Frontiers of Information Technology (FIT)
20. Shankar RS, Murthy KV, Rao CS, Gupta VM. An approach 2018 Dec 17 (pp. 212–217). IEEE.
for extracting tweets from social media factors. In2018 ieee 35. Patil AP, Dalmia S, Ansari SA, Aul T, Bhatnagar V. Automatic
international conference on system, computation, automation text summariser. In2014 international conference on advances
and networking (icscan) 2018 Jul 6 (pp. 1–7). IEEE. in computing, communications and informatics (ICACCI)
21. AMaˆ aloul, M. H., Keskes, I., Hadrich Belguith, L., Blache, P. 2014 Sep 24 (pp. 1530–1534). IEEE.
(2010). Automatic summarisation of Arabic texts based on RST 36. Reddy SS, Gadiraju M, Maheswara Rao VV. Analyzing Student
Technique. In Proceedings of 12th International Conference Reviews on Teacher Performance Using Long Short-Term
on Enterprise Information Systems (ICEIS’2010)12th Memory. InInnovative Data Communication Technologies
International Conference on Enterprise Information Systems and Application: Proceedings of ICIDCA 2021 2022 Feb 24
(ICEIS’2010) vol. 2, Portugal. pp. 434–437). (pp. 539–553). Singapore: Springer Nature Singapore.
22. Mathkour, H.I., Touir, A.A., Al-Sanea, W.A., 2008. Parsing 37. Mahesh G, Shankar Reddy S, Maheswara Rao VV, Silpa N.
Arabic texts using rhetorical structure theory. J. Comput. Sci. Preeminent Sign Language System by Employing Mining
4 (9), 713– 720. Techniques. InInternational Conference on IoT Based Control
310 Algorithms in Advanced Artificial Intelligence

Networks and Intelligent Systems 2023 Jun 21 (pp. 571–588). algorithm. 7th International Conference on in Information
Singapore: Springer Nature Singapore. and Communication Systems (ICICS2016), 5–7 April, Irbid,
38. Wang S, Zhao X, Li B, Ge B, Tang D. Integrating extractive Jordan.
and abstractive models for long text summarisation. In2017 41. Al-Zahrani, A., Mathkour, H., Abdalla, H. (2015). PSO-Based
IEEE international congress on big data (BigData congress) Feature Selection for Arabic Text Summarisation, Journal of
2017 Jun 25 (pp. 305–312). IEEE. Universal Computer Science, 21(11): 1454–1469.
39. Jaradat, Y. A. (2015). Arabic Single-Document Text 42. 42. Mahjoub, A. Y. (2015). Text Summarisation Using Particle
Summarization Based on Harmony Search. Master Thesis, Swarm Optimization Algorithm. Master Thesis, College of
Yarmouk Uneversity, Irbid, Jordan. Graduate Studies, Sudan University of Science & Technology,
40. Jaradat, Y. A., & Al-Taani, A. T. (2016). Hybrid-based Arabic Sudan.
single-document text summarisation approach using genatic
Note: All the figures in this chapter were designed by the author.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4

Optimizing Diabetes Prediction


through Intelligent Feature Selection:
A Comparative Analysis of Grey Wolf
Optimization with AdaBoost and Ant
47
Colony Optimization with XGBoost

Chigurupati Ravi Swaroop1


Department of Computer Science and Engineering,
SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, India
Vemuri Jayamanasa2
Department of Computer Science and Engineering,
Sir C R Reddy College of Engineering, Eluru, Andhra Pradesh, India
R. Shiva Shankar3
Department of Computer Science and Engineering,
SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, India
M. Ganesh Babu4, Vahiduddin Shariff5, N S Koti Mani Kumar6
Department of Computer Science and Engineering,
Sir C R Reddy College of Engineering, Eluru, Andhra Pradesh, India

Abstract: Diabetes, a common metabolic disease with serious health effects, is the focus of this investigation. A unique method
to increase predicted accuracy is presented in the study. We use ensemble learning methods like Grey Wolf Optimization
(GWO) with Adaboost and Ant Colony Optimization (ACO) with XGBoost. After data preparation, GWO and ACO algorithms
pick features, and model training is performed. An analysis of a dataset from the National Institute of Diabetes and Digestive
and Kidney Diseases found that Grey Wolf Optimizer (GWO) with AdaBoost outperforms Ant Colony Optimization (ACO)
with XGBoost in accuracy, precision, and AUC. Ant Colony Optimization (ACO) using XGBoost improves recall, detecting
actual positives more accurately. The models’ slight performance differences emphasize the need to select them depending on
healthcare goals. This study shows how ensemble learning and feature selection improve diagnostic accuracy and healthcare
decision-making, advancing diabetes prediction models.
Keywords: Grey wolf optimization, Ant colony optimization, Adaboost, XGBoost

1. Introduction pancreatic beta cells, and it is often diagnosed in infants.


Type 2 diabetes, on the other hand, is characterized by insulin
Diabetes is a complex metabolic condition that results in resistance and insulin shortage, and it results from lifestyle
high blood glucose levels. It can occur due to insufficient factors [2].
insulin production or inadequate insulin use. Chronic illness Diabetes can lead to various complications that affect different
is a major global health concern, and diabetes requires organs and systems in the body, such as the cardiovascular
considerable attention [1]. There are two primary types of system, kidneys, eyes, and nervous system. Additionally,
diabetes: Type 1 and Type 2. Type 1 diabetes is caused by the diabetes is a major risk factor for heart disease, stroke, and
immune system attacking and destroying insulin-producing
1
raviswaroop.chigurupati@gmail.com, 2vemuri.jayamanasa@gmail.com, 3shiva.csesrkr@gmail.com, 4mganeshbabu84@gmail.com, 5shariff.v@gmail.com,
6
koti1248@gmail.com

DOI: 10.1201/9781003529231-47
312 Algorithms in Advanced Artificial Intelligence

several other illnesses [3, 4]. A variety of complications risk. Random Forest outperforms conventional classifiers,
that can impact vital organs as well as systems, such as the increasing T2D medication predictions and showing its
cardiovascular system [5], kidneys [6], eyes [7], and nervous medicinal potential.
system [8], may arise when diabetes is not well managed [9]. In 2020, Xue et al. [16] covered the rising incidence of type 1
Additionally, this condition increases an individual’s chances diabetes in youth and the need for early prediction to minimize
of having heart disease, stroke, and so on. Uncontrolled delayed treatment and chronic consequences. The study
diabetes has severe consequences because it may lead to involved 520 participants who had either been diagnosed with
various complications that can affect critical body organs and diabetes or were at risk of developing it. The age range of
systems like the cardiovascular system, kidneys, eyes, and the participants was 16 to 90 years old. The results indicated
nervous system. Furthermore, this condition is a significant that SVM had the highest classification accuracy compared
risk factor for heart disease development, along with stroke, to the other algorithms and was the best predictor of diabetes.
among other types of health conditions [10]. In conclusion, younger diabetes cases are rising, and early
Diabetes, if left uncontrolled, has serious outcomes in identification is crucial. Machine learning, especially SVM,
that it may lead to a range of complications affecting vital transforms diabetes risk prediction, benefiting medicine.
organs such as the kidney, cardiovascular system, eye, or The article emphasizes the necessity for continuous updates
nervous system. This condition additionally increases one’s with larger case datasets to improve prediction accuracy and
susceptibility to heart disease, among others, like stroke. It is, suggests that sophisticated technology may help doctors
therefore, important for people to understand the factors that make educated illness status decisions.
lead to an increase in diabetes cases all over the continent [11]. In 2021, Ramesh et al. [17] presented a comprehensive
All this shows how serious this problem has become; hence, theoretical framework for diabetic remote patient monitoring
more concentration should be given when handling it [12]. (RPM) employing personal health devices, wearables,
Henceforth, there is a need for people across the continent and cell phones. The proposed end-to-end system uses an
to know why there have been rising cases of diabetes. This SVM to predict diabetes risk. The platform lets patients
underscores its seriousness, so priority should be given to use smartphones and wearables to track key indicators,
addressing it [13]. encouraging proactive diabetes management. The technology
promptly alerts doctors, improving diagnostic decision-
2. Literature Review making. The seamless integration of multiple cloud-based
devices delivers unobtrusiveness, cost savings, and vendor
In 2019, Kavakiotis et al. [14] investigated how machine compatibility, making this method unique. SVM-RBF is
learning and data mining have advanced diabetes research the best alternative because of its modularity and vendor
utilizing high-throughput genetic data and Electronic Health independence. The paper suggests longitudinal investigations
Records. The systematic review examines prediction, and adding gadgets and patient data to the examination.
diagnosis, complications, genetic factors, and healthcare These suggestions support the paper’s claim that continuous,
treatment. SVMs have become popular tools in many automated, and tailored diabetes therapy improves results.
fields. About 85% of SVM research employs supervised
learning, whereas 15% use unsupervised methods such as In 2022, Laila et al.[18] investigates that diabetes is a chronic
association rules. Clinical databases are abundant, allowing disease that may have significant effects if not detected
for valuable insights. The rising global incidence of diabetes early. Thus, this study examines the importance of early
has highlighted the need for sophisticated analytics in detection. The study aims to predict early diabetes incidence.
understanding, diagnosing, and managing this complex Random Forest outperforms the other two techniques in 10­
metabolic disorder. fold cross-validation accuracy (97%), precision, recall, and
F1-score. The Chi-Square attribute selection method shows
In 2019, Kowsher et al. [15] aimed to improve Type 2 diabetes that Polyuria predicts diabetes risk statistically. This study
therapy and drug detection by utilizing seven classifier emphasizes the clinical importance of age in diabetes risk
algorithms. Using genetic and clinical characteristics like assessment. This research helps control health by identifying
Fasting, BMI, Duration, Age, and blood pressure, the decision diabetes early. The study suggests algorithmic improvements,
tree-based study will justify patient-appropriate medications. innovative approaches, and the use of additional data to
The technique proposed in this research helps healthcare improve predictive models and overcome diabetes prediction
practitioners make decisions since medicinal intervention challenges.
may reduce problems but may not restore normal blood
glucose levels. A sample of 666 people with type 2 diabetes The 2019–2022 studies show how dynamic diabetes research
is evaluated. The strategy helps prescribe suitable medicines is, emphasizing machine learning and data mining. The
and encourages lifestyle changes to reduce Type 2 diabetes researchers use SVM, decision trees, and ensemble learning
Optimizing Diabetes Prediction through Intelligent Feature Selection 313

to study prediction, diagnosis, treatment, and remote patient selection, Adaboost [25] and XGBoost models are created
monitoring [19]. The research stresses early diagnosis using their specified features. These measurements assess the
and the potential of advanced analytics and technology to models’ accuracy, efficiency, and precision-recall balance.
improve risk prediction, medication selection, and healthcare Optimizing ensemble learning models for diabetes prediction
outcomes for people with diabetes. The research emphasizes through feature selection is the workflow’s goal.
advancements, big datasets, and innovative methods to
enhance diabetes prediction algorithms’ accuracy and 3.1 Data Collection
usefulness [20]. S.S. Reddy et al. worked on various diabetes The National Institute of Diabetes and Digestive and Kidney
side effects and detected whether they had had diabetes or Diseases dataset used to predict diabetes was carefully
not. Among them, they have predicted whether the patients selected to meet specific requirements. All dataset participants
have gestational diabetes or not [21] and indicated whether are Pima Indian women over 21. The dataset focuses on
they have type II diabetes or not [22]. diabetes diagnostic indicators, making it useful for healthcare
research and predictive medicine using machine learning.
2.1 Motivation The collection contains several diagnostic signs that evaluate
The urgent need to address diabetes’s global impact many factors. These include the number of pregnancies,
prompted this investigation. Given the rising prevalence of plasma glucose after a 2-hour oral glucose tolerance test,
the condition and its severe effects, innovative treatments to diastolic blood pressure, triceps skin fold thickness, serum
improve prompt detection, tailored treatment, and thorough insulin levels after 2 hours, BMI, diabetes pedigree function,
administration are needed. Machine learning and data age, and a binary outcome variable indicating diabetes. The
mining may use various information for accurate prediction Kaggle dataset is helpful for researchers, data scientists,
and effective action. This research investigates advanced and healthcare professionals developing and testing
algorithms and methods to improve diabetes risk assessment. diabetes-predicting models. The dataset’s concentration on
The goal is to enhance diabetes risk assessment tools to demographic characteristics and diagnostic measures makes
improve healthcare outcomes and reduce the burden of this it a valuable resource for addressing diabetes diagnosis and
common metabolic illness. prediction in a small population. This dataset lets academics
and practitioners test machine learning and statistics methods.
2.2 Research Gap By doing so, they may better understand diabetes risk factors
Current diabetes risk prediction studies generally neglect and help develop more accurate and tailored healthcare
ensemble learning approaches. Although individual prediction models.
algorithms have been extensively studied, ensemble
techniques, including their comparative effectiveness and 3.2 Feature Selection Using Grey Wolf
subtle properties, have not. Few studies have examined the Optimization
pros and cons of ensemble learning, which combines many Grey Wolves’ cooperative hunting behavior inspired feature
algorithms, for diabetes prediction. Closing this gap is crucial selection in the diabetes dataset using Grey Wolf Optimization
to understanding the pros and cons of ensemble techniques (GWO). This strategy solves the problem creatively. A
for diabetes risk prediction. These insights may improve population of possible feature subsets is initialized for
model accuracy and influence healthcare decision-making. Optimization. Each bit in these binary vectors indicates a
feature’s existence or absence. Adaboost is used to evaluate
3. Proposed Methodology each subset’s fitness in the objective function. The subgroup
is trained and tested for diabetes prediction accuracy. Wolf
The technique suggested begins with raw data preparation to posture is guided by GWO’s three-phase strategy of encircling,
ensure consistency and reliability. This procedure includes attacking, and seeking food. This method permits dynamic
data cleansing, normalization, and missing value correction. modifications in each iteration, allowing wolves to converge
After preprocessing, two feature selection approaches are on an optimal collection of traits. A specified amount of
used. The original technique uses Grey Wolf Optimization iterations stops the process. Next, the selected characteristics
(GWO) [23], inspired by grey wolf hunting habits, to identify train a diabetic dataset-based machine learning model. This
the most critical characteristics for the Adaboost ensemble thorough strategy identifies a particular collection of features
learning algorithm. that considerably improve diabetes prediction.
GWO simulates the grey wolf pack social structure to
improve feature selection[24]. Ant colony optimization 3.3 AdaBoost
(ACO), inspired by ant foraging, is used to choose features Adaboost is an ensemble learning methodology that combines
for the XGBoost ensemble learning approach. After feature multiple weak classifiers to create a robust classifier. It is
314 Algorithms in Advanced Artificial Intelligence

also known as Adaptive Boosting. In order to improve feature 3.5 XGBoost


selection, the Grey Wolf Optimization (GWO) method can
XGBoost, or eXtreme Gradient Boosting [31], is a robust
be used to produce better features. By utilizing Adaboost, a
and effective machine learning technique that falls inside
strong prediction model can be developed for the detection
the Gradient boosting framework. By including enhanced
of diabetes. To learn more about this topic, please visit our
characteristics derived from Ant Colony Optimization
website. After the application of GWO to choose the most
(ACO), XGBoost [32] may be utilized to construct a resilient
pertinent features, Adaboost can be employed in the following
prediction model for the identification of diabetes.
manner:
The workflow involves the following steps:
1. Feature Selection by GWO: Feature selection on
the diabetes dataset will use Grey Wolf Optimization 1. Feature Selection by ACO: Ant Colony Optimization
(GWO). The optimized characteristics will be used in (ACO) is suggested for diabetic dataset feature
this method. selection. This method narrows features by assessing
2. Adaboost Training: The Adaboost approach starts their importance for optimal prediction performance.
by choosing a weak classifier like a decision tree and 2. XGBoost Training: Ant Colony Optimization (ACO)­
weighting each dataset instance. identified enhanced features are used to train the
3. Iterative Learning: Each weak classifier is trained XGBoost model. The ensemble learning approach
sequentially, emphasizing situations misclassified XGBoost builds a chain of decision trees, each trying
by the previous classifiers. Increase the weights of to fix its predecessors’ mistakes.
misclassified cases to emphasize their importance in 3. Gradient Boosting: Gradient boosting improves
subsequent rounds. model performance. The Gradient of the loss function
4. Classifier Weighting: The accuracy of weak classifiers helps the XGBoost algorithm repeatedly develop trees,
determines their weight. Classifiers with higher enhancing accuracy and lowering errors.
precision are weighted more in the final combination. 4. Model Evaluation: To evaluate the performance of
5. Prediction: Predict new diabetes outcomes using the the XGBoost model, we can employ commonly used
Adaboost model. evaluation measures such as accuracy, precision, recall,
and F1-score.
Adaboost works because it adapts to the dataset. Adding
5. Prediction: Employ the taught XGBoost model to
GWO’s enhanced characteristics makes the ensemble model
predict the diabetes outcomes for novel situations.
more focused and efficient. AdaBoost ensemble learning and
Grey Wolf Optimizer (GWO) feature selection increase the Using ACO for feature selection and XGBoost for model
diabetes prediction model’s accuracy and generality. The training maximizes their benefits. The Ant Colony
Adaboost algorithm’s iterative structure prioritizes hard-to­ Optimization (ACO) algorithm selects an improved feature
classify instances, improving model resilience and precision. set of diabetes-prediction-relevant characteristics. However,
3.4 Feature Selection Using Ant Colony Optimization

Ant Colony Optimization (ACO) is a metaheuristic algorithm mimicking ant foraging. ACO [33] is a powerful feature selection approach for the diabetes dataset. The application simulates ants' collective decision-making when finding food. The approach assigns pheromone values to characteristics in the diabetes dataset, treating them as prospective routes that help achieve the optimization target, which improves prediction performance. Ants use pheromones to navigate paths. While iterating, the algorithm converges on a selection of factors that help forecast diabetes accurately [34]. ACO is ideal for complex healthcare datasets due to its flexibility and its ability to investigate different feature combinations. Researchers can enhance diabetes prediction machine learning models using Ant Colony Optimization (ACO) for feature selection [30]. This method can simplify and focus feature selection, improving healthcare decision-making.
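The following is a minimal sketch of pheromone-driven feature selection in this spirit, not the authors' exact ACO variant: each ant samples a feature subset with probability proportional to pheromone levels, subsets are scored by cross-validated accuracy, and the best subset is reinforced while trails evaporate. The data arrays and all parameter values are illustrative placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((768, 8))              # placeholder diabetes features
y = rng.integers(0, 2, 768)           # placeholder labels

n_feat, n_ants, n_iter = X.shape[1], 10, 20
pheromone = np.ones(n_feat)           # one trail value per feature

def fitness(mask):
    """Cross-validated accuracy of a classifier on the chosen subset."""
    if not mask.any():
        return 0.0
    model = LogisticRegression(max_iter=500)
    return cross_val_score(model, X[:, mask], y, cv=3).mean()

best_mask, best_score = None, -1.0
for _ in range(n_iter):
    prob = pheromone / pheromone.sum()                 # selection probabilities
    for _ in range(n_ants):
        mask = rng.random(n_feat) < prob * n_feat * 0.5  # ant builds a subset
        score = fitness(mask)
        if score > best_score:
            best_mask, best_score = mask, score
    pheromone *= 0.9                                   # evaporation
    pheromone[best_mask] += best_score                 # reinforce best subset

print("Selected features:", np.flatnonzero(best_mask),
      "CV accuracy:", best_score)
```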
The combined ACO + XGBoost prediction workflow proceeds as follows:
1. Feature Selection: An enhanced Ant Colony Optimization (ACO) is suggested for diabetic dataset feature selection. This method narrows features by assessing their importance for optimal prediction performance.
2. XGBoost Training: Ant Colony Optimization (ACO)-identified enhanced features are used to train the XGBoost model. The ensemble learning approach XGBoost builds a chain of decision trees, each trying to fix its predecessors' mistakes.
3. Gradient Boosting: Gradient boosting improves model performance. The gradient of the loss function helps the XGBoost algorithm repeatedly develop trees, enhancing accuracy and lowering errors.
4. Model Evaluation: To evaluate the performance of the XGBoost model, we can employ commonly used evaluation measures such as accuracy, precision, recall, and F1-score.
5. Prediction: Employ the trained XGBoost model to predict diabetes outcomes for novel cases.

Using ACO for feature selection and XGBoost for model training maximizes their benefits. The Ant Colony Optimization (ACO) algorithm selects an improved feature set of diabetes-prediction-relevant characteristics, while the XGBoost algorithm optimizes learning, creating a durable and accurate prediction model. This workflow integration increases the interpretability and performance of diabetes management decision-making, increasing its efficacy. A sketch of steps 2-5 follows.
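Under the same caveats as before (the index list aco_selected_idx and the data arrays are hypothetical placeholders), a minimal XGBoost training sketch could look like this:

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.random((768, 8))               # placeholder diabetes features
y = rng.integers(0, 2, 768)            # placeholder labels
aco_selected_idx = [1, 2, 5, 7]        # hypothetical ACO output

X_sel = X[:, aco_selected_idx]         # keep only ACO-selected features
X_train, X_test, y_train, y_test = train_test_split(
    X_sel, y, test_size=0.2, random_state=1)

# Gradient boosting: each new tree fits the gradient of the loss
# left by the current ensemble (steps 2 and 3 above).
model = XGBClassifier(n_estimators=200, max_depth=4,
                      learning_rate=0.1, eval_metric="logloss")
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```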
4. Results and Discussions

4.1 Performance Metrics
Performance measurements are crucial to evaluating machine learning models. These metrics assess the model's performance in numerous areas. Frequently used performance metrics are listed below.

4.2 Accuracy
Accuracy measures the proportion of correctly classified instances over the total number of instances. It is a good measure when the classes are balanced and the costs of false positives and false negatives are equal.
Formula: (True Positives + True Negatives)/(Total Predictions)

4.3 Precision
Precision measures the proportion of true positives over the total number of positive predictions. It represents the model's ability to not label negative instances as positive.
Formula: True Positives/(True Positives + False Positives)

4.4 Recall (Sensitivity or True Positive Rate)
Recall measures the proportion of true positives over the total number of actual positive instances. It represents the model's ability to identify all positive instances.
Formula: True Positives/(True Positives + False Negatives)

4.5 F1-Score
F1-score is the harmonic mean of precision and recall, and is a good measure when the classes are imbalanced. It balances the trade-off between precision and recall.
Formula: 2 * (Precision * Recall)/(Precision + Recall)

4.6 AUC
AUC measures the area below the Receiver Operating Characteristic (ROC) curve. The ROC curve shows the true and false positive rates at different threshold settings. AUC is calculated by integrating the area under the ROC curve:
AUC = ∫₀¹ (True Positive Rate) d(False Positive Rate)
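All of the metrics in Sections 4.2-4.6 can be computed directly with scikit-learn; the label vectors below are purely illustrative:

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

y_true  = [0, 1, 1, 0, 1, 0, 1, 0]                   # illustrative labels
y_pred  = [0, 1, 0, 0, 1, 1, 1, 0]                   # hard predictions
y_proba = [0.2, 0.9, 0.4, 0.1, 0.8, 0.6, 0.7, 0.3]   # positive-class scores

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_proba))  # area under ROC curve
```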
5. Performance Assessments

Grey Wolf Optimization with AdaBoost (GWO + AdaBoost):

Table 47.1 GWO + AdaBoost performance metrics
Metrics      Values
Accuracy     74.67
Precision    64.81
Recall       63.63
f1_score     64.22
AUC Score    78.37

Diabetes prediction using Grey Wolf Optimization (GWO) for feature selection and AdaBoost for classification has shown promising results. The algorithm correctly classified people as diabetic or not with 74.67% accuracy. The model has 64.81% precision in detecting positive cases; this score shows the model's favorable predictions are reliable. A recall rate of 63.63% shows that the model can effectively identify positive instances, indicating its effectiveness in determining diabetes. Combining precision and recall, the F1 score is 64.22%, which is excellent.

Fig. 47.1 GWO + AdaBoost performance metrics

The ROC AUC score is 78.37%, which suggests that the model can distinguish between positive and negative outcomes. To enhance the accuracy of diabetes prediction, we could combine GWO for feature selection and the AdaBoost classification framework.
Ant Colony Optimization with XGBoost (ACO + XGBoost):
Diabetes prediction using Ant Colony Optimization (ACO) and the XGBoost classifier is also effective. The model's classification accuracy is 69.00%, indicating its ability to classify people as diabetic. Precision, which measures positive prediction accuracy, is 55.39%. This number suggests reliable diabetes identification.

Table 47.2 ACO + XGBoost performance metrics
Metrics      Values
Accuracy     69.00
Precision    55.39
Recall       65.46
f1_score     60.01
AUC Score    68.08

The model's recall rate is 65.46%, indicating its accuracy in identifying true positives. This shows the model's diabetes detection capability. The F1 score, which considers precision and recall, is 60.01%, indicating good performance. The model's AUC score is 68.08% on the ROC curve, showing its ability to distinguish positive and negative events. A collaborative approach using Ant Colony Optimization (ACO) for feature selection and XGBoost for classification improves diabetes prediction.

Fig. 47.2 ACO + XGBoost performance metrics

Performance comparison of GWO with AdaBoost and ACO with XGBoost
The GWO with AdaBoost and ACO with XGBoost models can both predict diabetes, although with some differences. The GWO with AdaBoost model outperformed the ACO with XGBoost model with 74.67% accuracy versus 69.00%. This shows that the GWO with AdaBoost model is more accurate at classifying people as diabetic or not. The ACO with XGBoost model has 55.39% precision, whereas the GWO with AdaBoost model has 64.81% precision. Precision measures positive prediction accuracy; the GWO with AdaBoost model detects diabetes more reliably due to its higher precision. ACO with XGBoost has a recall rate (sensitivity) of 65.46%, compared to GWO with AdaBoost at 63.63%. Recall measures the model's ability to detect positive events. This shows that the ACO with XGBoost model better identifies people with diabetes as true positives.

Table 47.3 Performance comparison of GWO + AdaBoost and ACO + XGBoost
Algorithm        Accuracy (%)  Precision (%)  Recall (%)  f1_score (%)  AUC Score (%)
GWO + AdaBoost   74.67         64.81          63.63       64.22         78.37
ACO + XGBoost    69.00         55.39          65.46       60.01         68.08

The F1 score, which considers precision and recall, was higher for the GWO with AdaBoost model at 64.22% than for the ACO with XGBoost model at 60.01%. This suggests that the GWO with AdaBoost model better balances precision and recall. Regarding the ROC curve Area Under the Curve (AUC) score, the GWO with AdaBoost model surpasses the ACO + XGBoost model: GWO + AdaBoost has a higher AUC value of 78.37%, showing better discrimination between positive and negative cases, whereas ACO with XGBoost has a 68.08% AUC.

In conclusion, both models perform well. The GWO with AdaBoost model has better accuracy, precision, and AUC score, while the ACO with XGBoost model has better recall. These models may be chosen based on diabetes detection priority or the desired precision-recall balance.

Fig. 47.3 Performance comparison of AdaBoost, HistGradientBoosting and CatBoost

6. Conclusion

This work uses advanced ensemble learning models and feature selection methodologies to predict diabetes. The study examined diabetes and the need for an accurate prognosis for effective healthcare management. Combining Grey Wolf Optimization with Adaboost and Ant Colony Optimization with XGBoost yielded minor benefits. The Grey Wolf Optimization (GWO) method with AdaBoost produces the accuracy and precision necessary for dependable classification results. In contrast, Ant Colony Optimization (ACO) with XGBoost performed well in recall, indicating its ability to identify actual positives. This study's models and methods contribute to the dynamic field of diabetes prediction by delivering personalized solutions that meet individual needs. The work supports adaptive model selection strategies, considering the precision-recall balance and healthcare decision-making implications. This study provides a platform for predictive modeling developments due to diabetes's global prevalence. Thus, diabetes-prone patients will receive better personalized healthcare.
References
1. Abhari, Shahabeddin, Sharareh R. Niakan Kalhori, Mehdi Ebrahimi, Hajar Hasannejadasl, and Ali Garavand. "Artificial Intelligence Applications in Type 2 Diabetes Mellitus Care: Focus on Machine Learning Methods." Healthcare Informatics Research 25, no. 4 (2019): 248. https://doi.org/10.4258/hir.2019.25.4.248.
2. Reddy SS, Rajender R, Sethi N. A data mining scheme for detection and classification of diabetes mellitus using voting expert strategy. International Journal of Knowledge-based and Intelligent Engineering Systems. 2019 Jan 1;23(2):103-8.
3. Reddy SS, Sethi N, Rajender R. A comprehensive analysis of machine learning techniques for incessant prediction of diabetes mellitus. International Journal of Grid and Distributed Computing. 2020;13(1):1-22.
4. Reddy SS, Sethi N, Rajender R. Evaluation of deep belief network to predict hospital readmission of diabetic patients. In 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA) 2020 Jul 15 (pp. 5-9). IEEE.
5. Reddy SS, Sethi N, Rajender R. Risk assessment of myocardial infarction for diabetics through multi-aspects computing. EAI Endorsed Transactions on Pervasive Health and Technology. 2020 Dec 16;6(24):e3.
6. Reddy SS, Sethi N, Rajender R. Diabetes correlated renal fault prediction through deep learning. EAI Endorsed Transactions on Pervasive Health and Technology. 2020 Nov 11;6(24):e4.
7. Reddy S, Sethi N, Rajender R. Discovering optimal algorithm to predict diabetic retinopathy using novel assessment methods. EAI Endorsed Transactions on Scalable Information Systems. 2020 Jul 1;8(29).
8. Reddy S, Mahesh G, Preethi N. Evolving a neural network to predict diabetic neuropathy. EAI Endorsed Transactions on Scalable Information Systems. 2020 Oct 26;8(31).
9. Shifrin, Mark, and Hava Siegelmann. "Near-Optimal Insulin Treatment for Diabetes Patients: A Machine Learning Approach." Artificial Intelligence in Medicine 107 (July 2020): 101917. https://doi.org/10.1016/j.artmed.2020.101917.
10. Reddy SS, Sethi N, Rajender R. Safe prediction of diabetes mellitus using weighted conglomeration of mining schemes. In 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) 2020 Nov 5 (pp. 1213-1220). IEEE.
11. Reddy SS, Sethi N, Rajender R, Vetukuri VS. Non-invasive diagnosis of diabetes using chaotic features and genetic learning. In International Conference on Image Processing and Capsule Networks 2022 May 20 (pp. 161-170). Cham: Springer International Publishing.
12. Sowah, Robert A., Adelaide A. Bampoe-Addo, Stephen K. Armoo, Firibu K. Saalia, Francis Gatsi, and Baffour Sarkodie-Mensah. "Design and Development of Diabetes Management System Using Machine Learning." International Journal of Telemedicine and Applications 2020 (July 16, 2020): 1-17. https://doi.org/10.1155/2020/8870141.
13. Reddy SS, Sethi N, Rajender R. Rigorous assessment of data mining algorithms in gestational diabetes mellitus prediction. International Journal of Knowledge-based and Intelligent Engineering Systems. 2021 Jan 1;25(4):369-83.
14. Kavakiotis, Ioannis, Olga Tsave, Athanasios Salifoglou, Nicos Maglaveras, Ioannis Vlahavas, and Ioanna Chouvarda. "Machine Learning and Data Mining Methods in Diabetes Research." Computational and Structural Biotechnology Journal 15 (2017): 104-16. https://doi.org/10.1016/j.csbj.2016.12.005.
15. Kowsher, Md., Farhana Sharmin Tithi, Tapasy Rabeya, Fahmida Afrin, and Mohammad Nurul Huda. "Type 2 Diabetics Treatment and Medication Detection with Machine Learning Classifier Algorithm." Proceedings of International Joint Conference on Computational Intelligence, July 4, 2019, 519-31. https://doi.org/10.1007/978-981-13-7564-4_44.
16. Xue, Jingyu, Fanchao Min, and Fengying Ma. "Research on Diabetes Prediction Method Based on Machine Learning." Journal of Physics: Conference Series 1684 (November 2020): 012062. https://doi.org/10.1088/1742-6596/1684/1/012062.
17. Ramesh, Jayroop, Raafat Aburukba, and Assim Sagahyroon. "A Remote Healthcare Monitoring Framework for Diabetes Prediction Using Machine Learning." Healthcare Technology Letters 8, no. 3 (May 2, 2021): 45-57. https://doi.org/10.1049/htl2.12010.
18. Laila, Umm e, Khalid Mahboob, Abdul Wahid Khan, Faheem Khan, and Whangbo Taekeun. "An Ensemble Approach to Predict Early-Stage Diabetes Risk Using Machine Learning: An Empirical Study." Sensors 22, no. 14 (July 13, 2022): 5247. https://doi.org/10.3390/s22145247.
19. Reddy SS, Mahesh G, Preethi NM. Exploiting machine learning algorithms to diagnose foot ulcers in diabetic patients. EAI Endorsed Transactions on Pervasive Health and Technology. 2021 Aug 24;7(29):e2.
20. Reddy SS, Mahesh G, Rao VM, Preethi NM. Developing preeminent model based on empirical approach to prognose liver metastasis. In Ubiquitous Intelligent Systems: Proceedings of ICUIS 2021, 2022 (pp. 665-683). Springer Singapore.
21. Reddy SS, Gadiraju M, Preethi NM, Rao VM. A novel approach for prediction of gestational diabetes based on clinical signs and risk factors. EAI Endorsed Transactions on Scalable Information Systems. 2023 Jan 11;10(3).
22. Reddy S, Mahesh G. Risk assessment of type 2 diabetes mellitus prediction using an improved combination of NELM-PSO. EAI Endorsed Transactions on Scalable Information Systems. 2021 May 3;8(32).
23. Mallika, C., and S. Selvamuthukumaran. "A Hybrid Crow Search and Grey Wolf Optimization Technique for Enhanced Medical Data Classification in Diabetes Diagnosis System."
International Journal of Computational Intelligence Systems 14, no. 1 (September 1, 2021). https://doi.org/10.1007/s44196-021-00013-0.
24. Bilal, Anas, Guangmin Sun, Sarah Mazhar, and Azhar Imran. "Improved Grey Wolf Optimization-Based Feature Selection and Classification Using CNN for Diabetic Retinopathy Detection." Evolutionary Computing and Mobile Sustainable Networks, 2022, 1-14. https://doi.org/10.1007/978-981-16-9605-3_1.
25. Dhilsath Fathima, M., and S. Justin Samuel. "Improved Adaboost Algorithm with Regression Imputation for Prediction of Chronic Type 2 Diabetes Mellitus." Communication and Intelligent Systems, 2021, 691-708. https://doi.org/10.1007/978-981-16-1089-9_54.
26. Chen, Peihua, and Chuandi Pan. "Diabetes Classification Model Based on Boosting Algorithms." BMC Bioinformatics 19, no. 1 (March 27, 2018). https://doi.org/10.1186/s12859-018-2090-9.
27. Kalagotla, Satish Kumar, Suryakanth V. Gangashetty, and Kanuri Giridhar. "A Novel Stacking Technique for Prediction of Diabetes." Computers in Biology and Medicine 135 (August 2021): 104554. https://doi.org/10.1016/j.compbiomed.2021.104554.
28. Ganji, Mostafa Fathi, and Mohammad Saniee Abadeh. "A Fuzzy Classification System Based on Ant Colony Optimization for Diabetes Disease Diagnosis." Expert Systems with Applications 38, no. 12 (November 2011): 14650-59. https://doi.org/10.1016/j.eswa.2011.05.018.
29. Anwar, Nur Hadirah Khairul, Rizauddin Saian, and Sumarni Abu Bakar. "An Enhanced Ant Colony Optimization with Gini Index for Predicting Type 2 Diabetes." International Uzbekistan-Malaysia Conference on "Computational Models and Technologies (CMT2020)": CMT2020, 2021. https://doi.org/10.1063/5.0057315.
30. Christmas, Jacqueline, Edward Keedwell, Timothy M. Frayling, and John R.B. Perry. "Ant Colony Optimisation to Identify Genetic Variant Association with Type 2 Diabetes." Information Sciences 181, no. 9 (May 2011): 1609-22. https://doi.org/10.1016/j.ins.2010.12.005.
31. Wang, Liyang, Xiaoya Wang, Angxuan Chen, Xian Jin, and Huilian Che. "Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model." Healthcare 8, no. 3 (July 31, 2020): 247. https://doi.org/10.3390/healthcare8030247.
32. Prabha, Anju, Jyoti Yadav, Asha Rani, and Vijander Singh. "Design of Intelligent Diabetes Mellitus Detection System Using Hybrid Feature Selection Based XGBoost Classifier." Computers in Biology and Medicine 136 (September 2021): 104664. https://doi.org/10.1016/j.compbiomed.2021.104664.
33. Manjula G, Gopi R, Rani SS, Reddy SS, Chelvi ED. Firefly-binary cuckoo search technique based heart disease prediction in big data analytics. In Applications of Big Data in Healthcare 2021 Jan 1 (pp. 241-260). Academic Press.
34. Reddy SS, Sethi N, Rajender R, Mahesh G. Forecasting diabetes correlated non-alcoholic fatty liver disease by exploiting Naïve Bayes tree. EAI Endorsed Transactions on Scalable Information Systems. 2023;10(1):e2.

Note: All the figures and tables in this chapter were designed by the author.

48. Real-Time Sign Language Translation through Deep Learning

Sujatha B.1, Leelavathy N.2, K. Navya Sri3, G. Jagan Mohan4, K. Bosu Babu5
Dept. of CSE, Godavari Institute of Engineering & Technology (Autonomous), Rajamahendravaram
1birudusujatha@gmail.com, 2drnleelavathy@gmail.com, 3karakanavyasri6@gmail.com, 4genjijaganmohan@gmail.com, 5bosubabu10@gmail.com

Abstract: Sharing thoughts, feelings, and information via a shared language is the foundation of communication. Because they
can’t hear or talk, people who are deaf or mute face additional obstacles and must depend on sign language. The general public
frequently fails to recognise the importance of sign language, which hinders communication between those who use it and those
who do not. We offer a fresh deep learning strategy for real-time sign language recognition (RT-SLR) as a means of overcoming this obstacle. Although there are
systems that can identify gestures, they’re not always able to do it in real-time. Using a camera or video feed, our method starts
by recording sign language gestures. It uses MediaPipe to identify and follow the user’s hand movements, allowing it to extract
important landmarks. A TensorFlow Convolutional Neural Network (CNN) that has already been trained is used to feed these
landmarks, ensuring accurate detection in a wide variety of sign languages. Combining MediaPipe and TensorFlow creates a
flexible and dynamic platform for real-time sign language identification, enabling people with hearing loss to communicate
more effectively and fully participate in society.
Keywords: Sign language, Convolutional neural network, Media pipe, TensorFlow, Deep learning

1. Introduction

Humans rely on communication, which allows them to express themselves verbally on a daily basis. Hearing loss, speech disability, or both make communication difficult for a large percentage of the world's population. More and more babies are being born with hearing loss, which is a major problem that makes it hard for them to communicate [1]. The World Health Organisation reports that the number of people with hearing loss has increased dramatically over the past few decades. In 2019, the number of people impacted reached a staggering 466 million, which accounted for 5% of the global population, compared to 278 million in 2005. The majority of instances, 83%, involve adults, whereas a small percentage, 17%, involve children [2]. By 2050, the World Health Organisation expects this figure to have doubled, reaching 900 million [3]. It is becoming more and more important to improve the lives and social relationships of deaf-mute individuals by tackling the communication barriers they encounter as this community grows. Deaf and hard-of-hearing people all over the globe rely on sign languages to communicate; these languages draw on a wide range of visual cues, including hand gestures, signals, body language, facial expressions, and lip movements [4]. Taken as a whole, these elements constitute de facto sign language, which effectively eliminates barriers to communication between the hearing and deaf communities. The intricacy of deciphering these visual components is what makes sign language recognition (SLR) so challenging, but it also provides a fertile field for AI study. Language translation, interpretation, HCI, hand tracking, multi-person recognition, gaming, VR, controlling robots, and NLP are just a few of the many fields that have benefited from SLR's many applications [5]. Figure 48.1 shows the taxonomy of SLR, which includes



important parts such as computation resources, features, classification algorithms, input modalities (vision-based and sensor-based), dataset categorization, and practical applications. Recognising facial expressions, body language, and hand gestures are critical characteristics of SLR. There are many different kinds of classification methods used in SLR research. These include hybrid methods, deep learning techniques like Convolutional Neural Networks (CNN), and more traditional models like Hidden Markov Models (HMM).

Fig. 48.1 SLR taxonomy and performance factors

Sign languages vary greatly across the globe, influenced by factors such as location, ethnicity, and vocabulary. ASL, BSL, ISL, and CSL are examples of sign languages that developed within deaf populations in different regions [6]. Factors such as signing speed, picture problems, ambient variations, and the variety of communication characteristics make the development of SLR systems difficult. Many researchers are interested in developing sign language recognition systems because sign languages use non-manual signs that require complicated facial and hand gestures [7]. Researchers have recently used deep learning to improve SLR systems. These systems have run into a variety of strategies, datasets, and challenges. Databases are impacted by variations in picture type (RGB versus depth) and geography. Because sign languages, such as American Sign Language (ASL) and International Sign Language (ISL), differ from place to place, localized databases are necessary. Recognition systems are also impacted by the use of RGB or depth pictures. A variety of methods are employed by researchers, all of which aim to improve accuracy. However, there isn't a single system that works best for everyone. This work uses an American Sign Language (ASL) dataset and a deep learning-based methodology to overcome common recognition issues, such as variations in lighting and distance. The ASL alphabet dataset will prove to be a priceless tool for researchers, as it will make applying deep learning and machine learning techniques easier and enable easier result comparison. This research offers a method for static hand gestures using convolutional neural network (CNN) deep learning, which has demonstrated amazing effectiveness in photo classification and pattern recognition applications.

2. Related Work

Sign language recognition (SLR) is one method of automating the process of converting sign language into text or speech [8]. Both signer-independent and signer-dependent SLRs can distinguish individual signs and continuous phrases, with signer-dependent systems using the same signer for training and testing. Both sensor-
based and vision/image-based systems are possible, as demonstrated in Fig. 48.2 [9]. When it comes to performing signs, sensor-based systems demand the use of sensors, while vision-based solutions use images taken by cameras, doing away with the necessity for either sensors or gloves [10]. Aly et al. (2019) [11] emphasise how inexpensive cameras have recently become, allowing for extensive application in research. These domains make use of image processing, ML, and deep learning to increase efficiency, broaden their use, and lower their overall costs.

Fig. 48.2 SLR modalities

Vision-based approaches have gained priority in recent SLR research. The vision-based approach improves usability by decreasing the dependency on sensory devices for sign language interpretation; it does this by using data obtained by cameras and algorithms for processing images. Acquiring images, preprocessing them, segmenting them, extracting features, and finally classifying them are the five main steps in vision-based SLR [12]. Picture acquisition gathers information from both private and public databases. Noise is reduced, and image quality is improved through pre-processing. Segmentation isolates the area of interest. Feature extraction converts this area into recognizable feature vectors. Lastly, in order to recognise signs, categorization compares these qualities with those already stored in the database. Aiming to improve accuracy, deep learning has been applied to SLR systems in recent years. To account for regional differences in picture types (RGB or depth) in sign languages, researchers compile a number of datasets [13]. Some cameras use depth images [15] and others use RGB images [14]; hence, the choice of image format is camera-specific. There has been a lot of study in depth-camera sensing and video processing, as well as static picture recognition. Such systems pursue optimal performance by incorporating various processes and utilizing multiple programming languages to apply the required procedures.

In 2020, Sharma and Singh described sign language as a visual language that uses structured hand gestures to express ideas and concepts [16]. By using clustering to group signals into predetermined categories, transforming video sources into grayscale frames, and extracting features using directional histograms, Nandy et al. [17] were able to attain a 100% identification rate for ISL gestures. Mekala et al. [18] introduced a neural network system for real-time SLR and text creation from video streams. This system focused on hand position and movement and used 55 hand attributes as CNN-based neural network points of interest (POIs). It claimed to be able to identify all letters of the English alphabet (A-Z) 100% of the time and to be 48% immune to noise. With the use of deep learning models, Rastgoo et al. [19] created a system for real-time isolated hand stereo vision that uses three-dimensional hand coordinates to extract information. Their 99% accuracy rate is quite remarkable. However, the model's inability to handle strong inter-class similarities and significant occlusion between hands in specific sign cases hinders accurate sign prediction and may cause misclassification. Using computer vision techniques and the HSV colour scheme, Hurroo et al. [20] presented a CNN-based system that could accurately recognise ten ASL gesture alphabets with a 90% success rate. They also brought attention to the usage of 3D convolutional neural network (CNN) models, such as I3D40, for sign language recognition (SLR), pointing out that, although computationally demanding, these models are less stable and accurate than other CNN models. For American Sign Language finger-spelled word categorization, Rathi et al. [21] presented a ResNet50-based deep neural network that achieved 99.03% accuracy. Another study by Daroya et al. [22] used DenseNet to achieve an accuracy of 90.3% in real-time sign language recognition. In a number of reports implementing CNN models for ASL recognition, Rahman et al. [23], Bastwesy et al. [24], and Abdulhussein et al. [25] achieved high accuracy rates ranging from 93% to 99.92%. Based on deep learning techniques often employed for sign language recognition, this study adds to the existing body of knowledge. Through the use of a convolutional neural network (CNN) model—albeit one with a unique architecture—it aims to identify hand-sign language alphabets for communicating with the deaf. Prior research showing CNN's effectiveness in picture recognition [26-28] justifies its selection. This research contributes by proposing a CNN model trained on the ASL dataset, incorporating scaling and backdrop correction approaches for improved alphabet sign recognition, and presenting a real-time hand sign language image acquisition model that captures frames by webcam.
3. MediaPipe

MediaPipe provides a flexible platform for processing perceptual data, such as photos, videos, and audio. It is an open-source framework. It is designed for real-time use and works well with machine learning; it is especially good at tasks like gesture detection and hand tracking [29]. Among the many factors that go into the reliable identification of sign signals, MediaPipe's accuracy in monitoring fingers and hands stands out. The focus of this research is hand tracking in particular.

Important three-dimensional coordinates on a human hand are represented by hand landmarks in MediaPipe Holistic. Ensuring accurate hand tracking is an essential part of the larger framework that aims to estimate human poses holistically. Two models are used in this process: BlazePalm, which finds the hand in the input image efficiently, and the hand keypoint localization model, which uses 21 points in 2D or 3D space, including knuckles and other features, to refine the localization. Detailed hand landmarks, a likelihood flag for the presence of the hand, and a left/right binary classification are all part of the result (Fig. 48.3). These landmarks are crucial for applications that rely on them, such as hand pose estimation and gesture recognition.

Fig. 48.3 Mediapipe's hand landmarks: A visual overview [30]
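A minimal sketch of extracting these 21 landmarks with the MediaPipe Hands solution is shown below. The image filename is a hypothetical placeholder; a webcam pipeline would set static_image_mode=False and loop over captured frames.

```python
import cv2
import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands

# Static-image mode for a single picture; use static_image_mode=False
# for a continuous webcam stream.
with mp_hands.Hands(static_image_mode=True, max_num_hands=1,
                    min_detection_confidence=0.5) as hands:
    image = cv2.imread("sign.jpg")            # hypothetical input frame
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    if results.multi_hand_landmarks:
        hand = results.multi_hand_landmarks[0]
        # 21 landmarks, each with normalized x, y and relative depth z.
        landmarks = np.array([[lm.x, lm.y, lm.z] for lm in hand.landmark])
        print(landmarks.shape)                # (21, 3)
```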
4. Proposed Method

Data preprocessing, feature extraction using the MediaPipe framework, and gesture recognition are the three processes that make up our proposed SLR technique. The initial step is to use the input frames and the built-in data augmentation algorithms to extract keypoints and landmarks from the body, hands, and face. In order to identify and remove null entries, the system stores the extracted keypoints in a file prior to data labeling in stage 2. A convolutional neural network (CNN) model, after training and classification to identify American Sign Language (ASL), displays the translated sign gestures as text on the screen. Figure 48.4 provides a summary of the planned architecture.

Fig. 48.4 Proposed model of real-time sign language translation through CNN

The initial step of this strategy was to employ a webcam to record the users' real-time movements on the web. A Flask API was used to recognize and preprocess hand landmarks in the video stream. The Python script extracted the user's hand from the original image using the MediaPipe framework, which is mainly built for component landmark detection. This module performed further processing to store the hand landmarks as floating-point numbers in NumPy arrays. The programme computed the distance between each landmark point and a default reference landmark by comparing the hand's landmark coordinates to the palm's base landmark coordinates. A convolutional neural network (CNN) model was built with TensorFlow and Keras using the preprocessed landmarks for training. Computer vision projects and image data processing benefit greatly from CNNs because of their exceptional suitability for spatial data. For the purpose of predicting sign language motions, this model makes use of the CNN model and the MediaPipe library to acquire hand and palm landmarks. Since the CNN model relied on landmarks instead of raw visual data, it used less storage space. We need to build an all-inclusive system for "Sign Language Conversion to Text and Speech using MediaPipe and TensorFlow" by combining algorithms from various fields, such as computer vision, deep learning, and natural language processing. Here is a carefully curated list of algorithms for the critical implementation phases (a small usage sketch follows the list):
(a) Real-time Video Input (Webcam): Real-time video input from the webcam is achieved through OpenCV. This powerful library provides indispensable functions for accessing and capturing video frames, constituting a fundamental component of the system's functionality.
(b) Hand Gesture Detection (MediaPipe): MediaPipe, a sophisticated deep learning model, is employed for efficient hand detection and accurate landmark estimation. This model adeptly identifies and tracks key hand landmarks, including fingertips, knuckles, and the palm.
(c) Sign Language Recognition (TensorFlow): Employing deep learning methodologies such as CNNs, the system undertakes the recognition of sign language gestures. This involves the option to design and train a customized
model or utilize pre-trained models tailored for image or sequence recognition tasks.
(d) Gesture-to-Text Conversion: The recognized sign language gestures undergo conversion to text through algorithms based on dictionaries or sophisticated sequence-to-sequence (S2S) models. The selection of the approach hinges on the intricacy of sign language interpretation.
(e) Text-to-Speech (TTS) Synthesis: The system integrates Text-to-Speech (TTS) synthesis utilizing renowned libraries such as gTTS or pyttsx3. These libraries facilitate the transformation of recognized text into natural-sounding speech, ensuring accessibility for auditory communication.
(f) User Interface and Interaction: GUI design with Python or other UI frameworks. This encompasses the incorporation of intuitive buttons for customization, options for selecting sign languages, and the display of recognized text or synthesized speech.

This systematic approach realizes a robust sign language conversion system. The strategic integration of MediaPipe for hand gesture detection, TensorFlow for sign language recognition, and complementary components ensures a comprehensive solution for users with diverse communication needs.
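As a small illustration of steps (d) and (e), the snippet below maps a hypothetical predicted class index to a text label and speaks it with pyttsx3. The label table is a placeholder, not the system's actual vocabulary.

```python
import pyttsx3

# Hypothetical mapping from predicted class index to text label (step (d)).
LABELS = {0: "A", 1: "B", 2: "C"}

def speak_prediction(class_id: int) -> None:
    """Convert a recognized gesture class to audible speech (step (e))."""
    text = LABELS.get(class_id, "unknown sign")
    engine = pyttsx3.init()        # offline text-to-speech engine
    engine.say(text)
    engine.runAndWait()

speak_prediction(1)                # says "B"
```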
5. Convolutional Neural Network (CNN)

Convolutional neural networks (CNNs) are a powerful class of deep learning algorithms with numerous applications, including image classification, audio recognition, scene labeling, and natural language processing. Because neural networks scan the images, assign values to features, and then utilize these values to identify specific objects, they are able to distinguish between different regions of the input photos. A convolutional neural network (CNN) employs input, convolution, pooling, and fully connected layers to produce an output, as shown in Fig. 48.5. In order to minimize feature dimensions and prevent overfitting, the pooling layer chooses the most pertinent features after the convolutional layer has extracted them. Then, the fully connected layer with an activation function passes on the collected characteristics in the last stage. CNN's automated feature extraction yields far superior outcomes to more traditional image processing techniques [31].

Fig. 48.5 Sign language detection model training [31]

As Fig. 48.5 illustrates, our research entailed creating a CNN model with several layers. The 64 × 64 × 3 dimensions of the input images that the proposed convolutional neural network (CNN) architecture can handle correspond to the size of the sign language frames that our system uses. Each of the three convolutional layers (Conv1, Conv2, and Conv3) that comprise the CNN's feature extraction section uses 3 × 3 convolution filters. Each of these layers uses a different number of filters: ConvNet1 uses 32 filters, ConvNet2 uses 64 filters, and ConvNet3 uses 128 filters. Rectified Linear Units (ReLU) are applied after each convolution. After that, to preserve the representation of the critical information, MaxPooling with a 2 × 2 grid is used. To prepare the data for the classification stage, we flatten it after the convolutional step. For prediction, we employ fully connected layers in the first stage, a ReLU activation function in the second stage, and a SoftMax output layer in the final stage. With the provided input frames, this CNN architecture is able to recognize sign language gestures successfully.

5.1 CNN Model Training for Classification
The suggested CNN model begins with scaled input frames from videos for ease of processing. Next, we employ a succession of convolutional layers to train the model using these frames. These layers apply filters to find certain patterns or features in the input images, with each filter represented by a smaller matrix than the original image. The model is constructed in a sequential fashion using layers, where each layer's output serves as the input for the subsequent layer. Initially, the stride value is larger to save processing, and it is eventually decreased to capture finer details. Batch normalization keeps the range of values consistent across layers, making training more efficient and preventing difficulties like internal covariate shift. As an activation function, rectified linear units (ReLU) introduce non-linearity by passing only positive values. By taking the maximum value from a pool, max-pooling layers downsample the network, reducing the number of features and calculations. The full model architecture, displayed in Fig. 48.6, includes alternating convolutional and max-pooling layers to efficiently reduce the size for computing. After the convolutional layers have produced their output, the last step in the classification process is to flatten it. To avoid overfitting, dropout randomly disables some nodes, which decreases the amount of interdependent learning that occurs among neurons. Repeated application of fully connected layers and dropout produces the output class values. The softmax function assigns a probability to each gesture class, and the model's output is the class with the highest probability. This method guarantees the model's ability to identify and categorize sign language gestures.

Fig. 48.6 CNN model for gesture recognition
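A minimal Keras sketch of the architecture described above (three 3 × 3 convolution blocks with 32/64/128 filters, 2 × 2 max pooling, flatten, dense, dropout, and a softmax output) might look like the following. The dense width, dropout rate, and optimizer are illustrative choices, and the batch-normalization and stride details mentioned in the text are omitted for brevity.

```python
from tensorflow import keras
from tensorflow.keras import layers

num_classes = 26  # A-Z in the ASL alphabet dataset

model = keras.Sequential([
    layers.Input(shape=(64, 64, 3)),                 # 64 x 64 x 3 frames
    layers.Conv2D(32, (3, 3), activation="relu"),    # Conv1
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),    # Conv2
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),   # Conv3
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),            # fully connected stage
    layers.Dropout(0.5),                             # curb co-adapted neurons
    layers.Dense(num_classes, activation="softmax"), # per-class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```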
6. Dataset

The majority of the deaf and hard-of-hearing population in North America uses American Sign Language (ASL), a visual and gestural language, to communicate. American Sign Language uses hand shapes, gestures, and facial expressions as a substitute for spoken words. The focus of this research is the recognition of ASL alphabetic characters using the ASL dataset sourced from Kaggle [32]. Figure 48.7 shows example images from the dataset, which intricately represents the 26 letters of the American Sign Language (ASL) alphabet, mostly utilising one-handed signs to represent the letters A through Z. This dataset focuses on single-handed signs, while American Sign Language covers various variants of signs. Although the complex photos in the Kaggle dataset are a challenge, the study's goal is to demonstrate that the suggested methodology works effectively in this setting. There are 26 classes in this dataset, with each class representing an ASL letter, and it contains a total of 78,000 photos. Character recognition research using machine learning or computer vision methods will greatly benefit from this dataset, which has 3,000 samples per class.

7. Experimental Setup

We begin by introducing the datasets used and outlining the necessary pre-processing steps. After that, we test every part of our system thoroughly using both quantitative and qualitative methods. To make data suitable for an SLR system, it is necessary to transform pixel data into representations that are compatible with the algorithms. We use the MediaPipe Hands solution to collect 63 landmark values from each photo in the dataset, the result of 21 points with x, y, and depth information, in order to recognise the ASL alphabet. These landmarks provide important spatial information about hand motions, including x and y coordinates as well as depth data. In a new coordinate system, the wrist's coordinates are selected as the origin (0, 0, 0), shifting the values of each coordinate to make these landmarks suitable for classification (a small sketch of this step appears at the end of this section). The objective of this classification job is to predict one of the 26 alphabet labels in American Sign Language using these processed 63 data points as input. When classifying ASL signals, we do not consider handedness, because the signs for the alphabet are similar regardless of the hand used and can be performed with either hand. Also, keep in mind that MediaPipe might miss some photographs that contain a hand. To make sure the dataset is relevant and of high quality for our classification task, we exclude these photographs from the test, validation, and training sets. A data generator introduces noise and transformations to improve the training data. After twenty epochs of training with a batch size of thirty-two, the CNN model achieved a validation accuracy of 98.69% by the tenth epoch and a peak accuracy of 98.91% by the twentieth epoch. Consistent improvements in accuracy and loss, as shown in Fig. 48.8(a) and 48.8(b), indicate that overfitting is not an issue.
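The wrist-origin shift described above can be written as a small helper; a minimal sketch, assuming a (21, 3) landmark array with the wrist at index 0 as MediaPipe provides:

```python
import numpy as np

def normalize_landmarks(landmarks: np.ndarray) -> np.ndarray:
    """Shift 21 hand landmarks so the wrist becomes the origin (0, 0, 0).

    `landmarks` is a (21, 3) array of x, y, z values; index 0 is the wrist.
    The flattened 63-value vector is what the classifier consumes.
    """
    shifted = landmarks - landmarks[0]   # wrist moves to (0, 0, 0)
    return shifted.flatten()             # 21 x 3 = 63 input values

# Example with random stand-in coordinates:
vec = normalize_landmarks(np.random.rand(21, 3))
print(vec.shape)  # (63,)
```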
Fig. 48.7 Exploring the 26 characters of the American sign language alphabet [32]

Fig. 48.8 (a) Training and validation loss, (b) Training and validation accuracy

7.1 Evaluation Metrics
To assess the effectiveness of the proposed model, we utilized key evaluation metrics: mean squared error (MSE), mean
absolute error (MAE), and R-squared, as outlined in Table 48.1. The Mean Absolute Error (MAE) represents the average absolute difference between predicted and actual values in the dataset. Its formula is given by:

MAE = (1/N) Σi |yi − ŷi|  (1)

The Mean Squared Error (MSE) is the average of the squared differences between predicted and actual values in the dataset. Its formula is:

MSE = (1/N) Σi (yi − ŷi)²  (2)

The R² score indicates the goodness of fit of the model to the dataset. It ranges between 0.0 and 1.0, where 0.0 denotes the worst fit and 1.0 denotes a perfect fit. The formula is:

RMSE = √MSE = √((1/N) Σi (yi − ŷi)²)  (3)

Here, yi represents actual values, ŷi represents predicted values, ȳ is the mean of actual values, and N is the number of data points, with the sums running over i = 1, ..., N. Table 48.1 presents MSE, MAE, and R-squared values for different models, including simple RNN, LSTM, standard GRU, BiGRU, and BiLSTM. These values indicate higher average residuals and variance in the residuals for these models. Despite BiLSTM's success in other applications, it performed poorly on our dataset due to limited data for sequence prediction. As a result, we introduced the MediaPipe CNN model in an attempt to reduce prediction errors.

Table 48.1 Comparing MAE, MSE, and R2 for various models
Network Model   MAE   MSE   R2
Simple RNN      4.1   28.9  -1.38
LSTM            0.75  4.95  0.59
Standard GRU    0.44  1.38  0.83
BIGRU           0.4   2.5   0.79
BILSTM          0.85  5.35  0.56
Proposed CNN    0.23  1.28  0.7

7.2 Qualitative Analysis
The study employed classification measures to assess the precision of predictions for specific sign motions. The study computed these metrics using ASL data to evaluate the quality of the predictions, including accuracy (A), precision (P), recall (R), and F1-score. We also used a confusion matrix to assess the model's performance, which provides data on true positives (TP) or accurate predictions, true negatives (TN) or correct non-predictions, false positives (FP) or incorrect predictions, and false negatives (FN) or incorrect non-predictions. This aids in our comprehension of how accurately the model identified various objects.

Accuracy (A) = (TP + TN)/(TP + FP + TN + FN)  (4)
Precision (P) = TP/(TP + FP)  (5)
Recall (R) = TP/(TP + FN)  (6)
F1-Score (F1) = 2 × (P × R)/(P + R)  (7)

Accuracy refers to the proportion of accurately predicted data points, and it should ideally be near 1. When false positives are expensive, precision—also known as the positive predicted value—becomes essential. It computes the percentage of positive predictions among all anticipated positive class values. Recall measures the percentage of positive outcomes accurately predicted. The harmonic mean, or F1-score, strikes a balance between recall and precision. It is maximized at 1, which occurs when recall and precision are flawless. The results were calculated using Equations (4-7), and the classification report is shown in Fig. 48.9. The suggested model's accuracy, recall, and F1-score are all near 1, with only a few values marginally below, suggesting that the model successfully mastered the training set.

8. Conclusions

Finding a method to recognize American Sign Language (ASL) using deep learning was the aim of this study, in order to facilitate real-time communication between hearing and deaf individuals. We utilize a webcam in combination with the MediaPipe framework to identify hand landmarks in order to implement the model. After preprocessing the recorded hand landmarks, we trained a convolutional neural network (CNN) model to predict sign language motions with a 98% success rate. Even if the model accomplishes its short-term goals—like capturing webcam feeds in real-time and making correct predictions—there remains room for improvement in terms of testing and evaluation to determine the system's scalability and sustainability. The development of this prototype has the potential to greatly benefit sign language users by facilitating better communication and laying the groundwork for further advancements and wider applications in computer vision and gesture detection.
Fig. 48.9 Evaluation of Model Performance: Precision, Recall, and F1-Score (per-class values for the letters A-P are at or near 1.0, with precision at 0.8 for J and 0.9 for K, and recall at 0.8 for L)

References
1. Krishnaveni, M., Subashini, P., & Dhivyaprabha, T. T. (2019). An assertive framework for automatic Tamil sign language recognition system using computational intelligence. Intelligent Systems Reference Library: 150. Springer International Publishing.
2. Savur, C., & Sahin, F. (2016). Real-time American Sign Language recognition system using surface EMG signal. In Proceedings of the IEEE 14th International Conference on Machine Learning and Applications, ICMLA 2015 (pp. 497-502).
3. El-Din, S. A. E., & El-Ghany, M. A. A. (2020). Sign language interpreter system: An alternative system for machine learning. In Proceedings of the 2nd Novel Intelligent and Leading Emerging Sciences Conference, NILES 2020 (pp. 332-337).
4. Cheok, M. J., Omar, Z. & Jaward, M. H. A review of hand gesture and sign language recognition techniques. Int. J. Mach. Learn. & Cyber. 10, 131-153 (2019).
5. Wadhawan, A., Kumar, P. Deep learning-based sign language recognition system for static signs. Neural Comput & Applic 32, 7957-7968 (2020).
6. Agrawal, S. C., Jalal, A. S., & Tripathi, R. K. (2016). A survey on manual and non-manual sign language recognition for isolated and continuous sign. International Journal of Applied Pattern Recognition, 3(2), 99.
7. Ahmed M., Idrees M., Abideen Z., Mumtaz R., Khalique S., Deaf talk using 3D animated sign language. In 2016 SAI Computing Conference (SAI) (2016), pp. 330-335.
8. Mittal, A., Kumar, P., Roy, P. P., Balasubramanian, R. & Chaudhuri, B. B. A modified LSTM model for continuous sign language recognition using leap motion. IEEE Sens. J. 19, 7056-7063 (2019). https://doi.org/10.1109/JSEN.2019.2909837.
9. Mahmood M. R. and Abdulazeez A. M., "A Comparative Study of a New Hand Recognition Model Based on Line of Features and Other Techniques," in International Conference of Reliable Information and Communication Technology, 2017, pp. 420-432.
10. Rautaray S. S., A. Agrawal, Vision based hand gesture recognition for human computer interaction: a survey. Artif. Intell. Rev., 43(1) (2015), pp. 1-54.
11. Aly, W., Aly, S., & Almotairi, S. (2019). User-independent American Sign Language alphabet recognition based on depth image and PCANet features. IEEE Access, 7, 123138-123150.
12. Bantupalli, K., & Xie, Y. (2019), American Sign Language recognition using deep learning and computer vision, 2018 IEEE International Conference on Big Data (Big Data), pp. 4896-4899.
13. Jain V., Jain A., Chauhan A., Kotla S. S., Gautam A., American Sign Language recognition using support vector machine and convolutional neural network, Int. J. Inf. Technol. 13 (2021) 1193-1200.
14. Daroya R., Peralta D., Naval P., Alphabet sign language image classification using deep learning, in IEEE Region 10 Annual Int. Conference, Proceedings/TENCON (2019), vol. 2018-Octob, no. October, pp. 646-650.
15. Ameen S., Vadera S., A convolutional neural network to classify American sign language fingerspelling from depth and colour images, Expert Syst. 34(3) (2017).
16. Ashish Sharma, Anmol Mittal, Savitoj Singh, Vasudev Awatramani, Hand gesture recognition using image processing and feature extraction techniques, Procedia Computer Science 173 (2020) 181-190.
17. Nandy, A.; Prasad, J.; Mondal, S.; Chakraborty, P.; Nandi, G. Recognition of isolated Indian sign language gesture in real time. Commun. Comput. Inf. Sci. 2010, 70, 102-107.
18. Mekala, P.; Gao, Y.; Fan, J.; Davari, A. Real-time sign language recognition based on neural network architecture. In Proceedings of the IEEE 43rd Southeastern Symposium on System Theory, Auburn, AL, USA, 14-16 March 2011.
19. Rastgoo, R., Kiani, K. & Escalera, S. Video-based isolated hand sign language recognition using a deep cascaded model. Multimedia Tools Appl. 79(31-32), 22965-22987. https://doi.org/10.1007/s11042-020-09048-5 (2020).
20. Hurroo, M. & Elham, M. Sign language recognition system using convolutional neural network and computer vision. Int. J. Eng. Res. Technol. (IJERT) 9(12), 59-64 (2020).
21. P. Rathi, R. K. Gupta, S. Agarwal, A. Shukla, and R. Tiwari, "Sign Language Recognition Using ResNet50 Deep Neural Network Architecture," Next Gener. Comput. Technol. 2019, pp. 1-7, 2019.
22. R. Daroya, D. Peralta, and P. Naval, "Alphabet Sign Language Image Classification Using Deep Learning," IEEE Reg. 10 Annu. Int. Conf. Proceedings/TENCON, pp. 646-650, 2019.
23. M. M. Rahman, M. S. Islam, M. H. Rahman, R. Sassi, M. W. Rivolta, and M. Aktaruzzaman, "A new benchmark on American sign language recognition using convolutional neural network," in International Conference on Sustainable Technologies for Industry 4.0, STI 2019, pp. 1-6, 2019.
24. M. R. M. Bastwesy, N. M. ElShennawy, "Deep Learning Sign Language Recognition System Based on Wi-Fi CSI," Int. J. Intell. Syst. Appl., vol. 12, no. 6, pp. 33-45, 2020.
25. A. Abdulhussein and F. Raheem, "Hand Gesture Recognition of Static Letters American Sign Language (ASL) Using Deep Learning," Eng. Technol. J., vol. 38, no. 6, pp. 926-937, 2020.
26. M. Al-Hammadi et al., "Deep learning-based approach for sign language gesture recognition with efficient hand gesture representation," IEEE Access, vol. 8, pp. 192527-192542, 2020.
27. Y. Dong, Q. Liu, B. Du, and L. Zhang, "Weighted Feature Fusion of Convolutional Neural Network and Graph Attention Network for Hyperspectral Image Classification," IEEE Trans. Image Process., vol. 31, pp. 1559-1572, 2022.
28. Y. L. Chang et al., "Consolidated Convolutional Neural Network for Hyperspectral Image Classification," Remote Sens., vol. 14, no. 7, pp. 1-16, 2022.
29. Subramanian, B., Olimov, B., Naik, S. M. et al. An integrated mediapipe-optimized GRU model for Indian sign language recognition. Sci Rep 12, 11964 (2022).
30. F. Zhang, V. Bazarevsky, A. Vakunov, A. Tkachenka, G. Sung, C. Chang, and M. Grundmann, "Mediapipe hands: On-device real-time hand tracking," CoRR, vol. abs/2006.10214, 2020.
31. Van Hiep Phung and Eun Joo Rhee, "A High-Accuracy Model Average Ensemble of Convolutional Neural Networks for Classification of Cloud Image Patches on Small Datasets," 23 October 2019.
32. Kaggle. ASL Alphabet. Available online: https://www.kaggle.com/grassknoted/asl-alphabet (accessed on 19 July 2021).

Note: All the figures and table in this chapter were designed by the author.

49. Ensuring Data Privacy in the Cloud: AuthPrivacyChain's Blockchain Access Control

R. Tamilkodi1
Professor, Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
K. Surya Kala2
Assistant Professor, Department of CSE (AIML & CS), Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
T. Durga Sukanthika3, B. Aanantha Sai Datta Kiran4, V. Hemanth Reddy5, K. Srimani Neha6
Department of Computer Science & Engineering (AIML & CS), Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
1tamil@giet.ac.in, 2surya.k0314@gmail.com, 3sukithota3998@gmail.com, 4kiranbatchu02@gmail.com, 520551A4655.hemanth@gmail.com, 6nehakaramcheti2003@gmail.com

Abstract: The issue at hand is that as cloud computing grows, there is growing worry about cloud security. Sensitive data
stored in the cloud is susceptible to illegal access or alteration by hackers or internal cloud administrators when centralized
access control methods are in place. The security and integrity of corporate and personal data are seriously threatened by this.
One proposed answer to this issue is AuthPrivacyChain, a blockchain-based access control architecture. AuthPrivacyChain uses blockchain node addresses as identities to renegotiate permissions for cloud data access control. These permissions are
kept on the blockchain and are encrypted. Processes for authorization, revocation, and access control are also included in the
framework. When AuthPrivacyChain is used on an enterprise operating system such as EOS, it significantly improves cloud
security by guarding against unwanted access and guaranteeing the privacy of authorized users.
Keywords: Authorization, Revocation, Access control framework, Enterprise operating system (EOS), Unwanted access, Authorized users, Cloud security improvement

1. Introduction

The expression "cloud computing" describes on-demand internet access to a range of computing resources, such as development tools, networking, data storage, servers (virtual and physical), applications, and more. These resources are managed by a cloud services provider (CSP) and are located in a remote data centre. The CSP charges a monthly subscription fee or a usage-based payment in return for making these resources available. Meanwhile, access control has become a hot research area. Its aim is to keep unauthorized users from accessing or stealing information housed on cloud servers. Since the three primary cloud computing service frameworks — software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS) — all rely upon access control to safeguard critical resources, it is imperative. Nonetheless, both academia and industry still store and manage identity, key, authority, and authentication data in centralized ways. Therefore, there are still two security and privacy issues with access control technologies. First, an external attacker compromises the


trusted centre's security, tampers with the permission database on the central servers, and gains unauthorized access to or steals the assets that users have stored there. Second, a malevolent system administrator could abuse this capacity to gain unauthorized access to resources or to change the permission database to obtain unlawful access, since the cloud system administrator is responsible for the authorization database and has access to resources.

The computation and storage modes of cloud computing have changed significantly from the previous computing paradigm. These changes are mostly evident in the following aspects, and there are several reasons why users may find it difficult to control cloud resources:
1. lack of trust between users and the cloud;
2. data may change the security domain due to migration technologies;
3. access subjects may be redefined due to multitenant technologies; and
4. virtualization technologies may introduce new difficulties.
Given these difficulties, scholarly research on cloud access control has proliferated, and industry attempts to use access control systems already in place have also been made. They do, however, rely on centralized methods for managing and storing identity, key, authority, and authentication data. Therefore, there are still two issues with access control technologies related to security and privacy:
1. An external attacker compromises the central server, assaults the trusted centre, and gains unauthorized access to or steals user resources kept in the cloud.
2. A malevolent system administrator could abuse this capacity to gain unauthorized access to resources or to change the authorization database to obtain unlawful access, as the cloud system administrator is responsible for the authorization database and has access to resources.

2. Literature Review

Mianxiong Dong, Jun Wu, et al. [4] propose FCSS (Fog-Computing-based Content-Aware Filtering Method for Security Services), designed to meet the changing demands of information-centric social networks (IC-SN). By bringing fog computing to IC-SN, it moves resources and computational intelligence to the edge of the network. End-to-end connectivity and low-latency security service filtering are guaranteed by this method. By using content-label technology, FCSS integrates an effective content-aware filtering system that allows precise security service filtering at the network edge. The benefits of FCSS in terms of hit ratio, filtering latency, and filtering accuracy are shown by simulations and assessments, indicating that it is a useful addition to IC-SNs.

Jianhua Li, Xi Lin, et al. [6] present a framework intended to make knowledge trading easier in Internet of Things (IoT) contexts that are enabled by edge artificial intelligence. It provides an architecture for the knowledge market's implementation, complete with a blockchain for knowledge consortiums for safe and effective knowledge exchange and administration. This blockchain includes smart contracts, a new cryptographic cryptocurrency called "knowledge coin," and a special consensus process called "proof of trading." To get more individuals engaged with the market, the framework likewise offers a knowledge pricing technique based on noncooperative games with rewards. Security tests and performance models show that the framework functions well; it stands as the main illustration of an incentive-driven P2P knowledge market in Edge-AI-powered IoT.

Chen Jianing, Wu Jun, et al. [24] suggest an Unbiased Collaborative Trust-Based Control Transfer Mechanism (CTM) that aims to improve industrial automation security and trust. By offering a trust-based delegated proof-of-stake consensus, it overcomes the lack of trust in industrial control systems. Control authorities are assigned in a fair and dynamic manner by this consensus approach. Furthermore, a CTM is put into place for disaster backup, allowing blockchain nodes to hand over control authority. The viability and efficacy of CTM in enhancing industrial automation security are validated by simulations.

Suyong Eum, Keping Yu, et al. [26] provide a summary of the current state of research and standards related to information-centric networking (ICN). The history of international ICN operations starting in 2010 is traced, with references to several initiatives. The study then explores the latest developments in ICN component technology standardization, namely in ITU-T and in ICNRG's documentation. Lastly, it considers potential future paths for ICN's development as a cutting-edge network architecture.

Yuwei Su, Xin Qi, et al. [27] present the idea of named-node networking (3N) to overcome the current shortcomings in information-centric networking (ICN). ICN's emphasis on content-centric communication has drawn attention; however, it lacks host-centric features and seamless mobility support. The 3N system provides a complete solution that includes mobility assistance, data security, data transmission, and naming. A 3N-based real-time video streaming system is developed and tested; it performs better than TCP video streaming,
especially when there are several clients, and it exhibits useful Wi-Fi-based handoff features.

3. Proposed Methodology

Weighting was first introduced into multi-authority attribute-based encryption schemes in the literature. A weighted attribute encryption procedure with multiple authorities, built on cloud computing, is recommended: different weights are allotted to attributes by the attribute authority according to their relative pertinence. The review shows the security of the proposed plan. Compared with existing plans, this plan is more appropriate for the cloud computing climate, since it can capture the relative pertinence of attributes.

We desire a framework with privacy protection. First, we adopt the node's blockchain account address as its identity. Simultaneously, we redefine the permissions for cloud data access control, since the information is encrypted and kept in the blockchain. Then, we design the AuthPrivacyChain authorization, revocation, and access control methods. Finally, we put AuthPrivacyChain into practice using the enterprise operating system (EOS) blockchain. As well as guaranteeing the assets' confidentiality, integrity, availability, legitimacy, and accountability, our design, AuthPrivacyChain, is equipped to withstand various internal and external threats.

Modules:
The proposed system comprises the following modules.
1. Initialization: the information owner, the information client, and the cloud server are the three parties who make up this module.
2. Registration: Each client will finish an application, and the smart contract feature will keep their data on the blockchain. The blockchain may be utilized to store control or permission for access after registration, and it produces identity keys for every client.
3. Cloud to Blockchain: the cloud will send a request to register on the blockchain.
4. User to Blockchain: The owner of the information will give clients access to the blockchain, empowering them to upload, distribute, and revoke information. Here, the AES strategy is utilized to encrypt each record.

We have traded information between two clients who are entrusted with completing this undertaking: "DOCTOR" and "RESEARCHER." The information owner grants access to the information client who is entrusted with doing this project. Users of information are additionally allowed to grant access to each other.
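As a concrete illustration of the AES step described in module 4, the sketch below encrypts a record before it leaves the owner's machine and derives a hash code of the kind displayed after upload in the experiments. It is a minimal example, not the authors' implementation: it assumes Python with the cryptography package, whose Fernet recipe is AES-based (AES-128 in CBC mode with HMAC authentication), and a hypothetical file name.

```python
import hashlib
from cryptography.fernet import Fernet  # AES-based authenticated encryption

# Key generation would happen once per user at registration;
# AuthPrivacyChain would keep keys and permissions on the blockchain.
key = Fernet.generate_key()
cipher = Fernet(key)

with open("record.txt", "rb") as f:       # hypothetical record to protect
    plaintext = f.read()

token = cipher.encrypt(plaintext)         # ciphertext stored in the cloud
storage_hash = hashlib.sha256(token).hexdigest()  # hash code shown to the owner
print("storage hash:", storage_hash)

# A client that has been granted access (e.g., "DOCTOR") decrypts with the key.
assert cipher.decrypt(token) == plaintext
```

The hash of the ciphertext, rather than the plaintext, is what a ledger entry would record, so integrity can be checked without revealing the file's content.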

Fig. 49.1 System architecture


Fig. 49.2 System architecture

4. Experimental Results

Double-clicking the "Start_IPFS.bat" file will launch the cloud server and show the screen below.

The cloud server has started in the screen above; next, double-click the corresponding batch file to start the Python server and see the screen below.

The Python web server has started in the screen above. To see the screen below, open a browser, type in the URL "http://127.0.0.1:8000/index.html," and hit the Enter key.

Click the "Data User Signup Here" link in the top page to add other users, such as researchers, doctors, and data owners.

The data owner is registering on the screen above; click the button to finish the registration process, and add physicians and researchers in a similar manner.

After completing the enrolment procedure in the above page, select the "Data Owner" link to log in as the data owner.
The data owner is logged in on the top screen; upon login, the screen below appears.

The data owner may upload a file to the cloud in encrypted format by clicking the "Upload Data" option in the above page.

The data owner uploads the file on the above page, chooses an access user (a doctor, a researcher, or both), and presses the upload button. On the same screen, we grant permission to the user named "doctor." Pressing the button yields the result below.

The file is uploaded on the screen above, and the storage hash code is shown. The data owner may revoke access to the file by choosing it and clicking the "Revoke user" link.

Choose any file on the screen above, then click the button to remove access from it. Then, log out and log back in as "doctor" to check whether access is still allowed.

The doctor is logged in, and after that, they will see the screen below.

The doctor may examine all files shared by the data owner by clicking the "Access Share Data" option in the aforementioned page.

The doctor may see all files shared by the data owner on the interface, and by clicking the "Click Here" link, they can download the file.
The file is now downloading, as shown in the browser status bar above. By selecting the "Indirect Access Control" option, a doctor may now give a researcher access to this file.

The doctor may choose a file on the above interface, click a button to grant the researcher access, and get the result below.

In the screen above, "angular.txt" is granted indirect access. To view that shared file, log out and then log back in as a researcher.

The researcher user is logged in, and they will see the screen below once they log in.

When a researcher clicks the "Access Share File" link on the above page, the result shown below is what they can access.

Now the researcher has access to the file. To see the graph below, click the "Smart Contract Computation Graph" link.

The above graph shows the filename on the x-axis and, on the y-axis, the computation time needed to encrypt each file and save it in the blockchain. Similarly, you may add users and allow them to exchange and remove data from each other (see Fig. on next page).

5. Conclusion and Future Scope

To prevent attackers from reaching assets without permission, we built an access control framework called AuthPrivacyChain that protects privacy in cloud settings. The client adds to the blockchain every action that has to do with authorizations. We implemented the framework model on the EOS blockchain, treating features like access privileges and other data as extensions to blockchain events. According to the experiment's findings, resources can only be accessed by those who hold access permissions.

References

1. P. Mell and T. Grance, "The NIST definition of cloud computing," Nat. Inst. Standards Technol., Gaithersburg, MD, USA, Tech. Rep. Special Publication 800-145, 2011.
2. F. Liu, J. Tong, J. Mao, R. Bohn, J. Messina, L. Badger, and D. Leaf, "NIST cloud computing reference architecture," NIST Special Publication, vol. 500, no. 211, pp. 1–28, 2011.
3. M. Armbrust, I. Stoica, M. Zaharia, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, and A. Rabkin, "A view of cloud computing," Commun. ACM, vol. 53, no. 4, 2010.
4. J. Wu, M. Dong, K. Ota, J. Li, and Z. Guan, "FCSS: Fog-computing-based content-aware filtering for security services in information-centric social networks," IEEE Trans. Emerg. Topics Comput., vol. 7, no. 4, pp. 553–564, Oct. 2019.
5. K. Yu, M. Arifuzzaman, Z. Wen, D. Zhang, and T. Sato, "A key management scheme for secure communications of information centric advanced metering infrastructure in smart grid," IEEE Trans. Instrum. Meas., vol. 64, no. 8, pp. 2072–2085, Aug. 2015, doi: 10.1109/TIM.2015.2444238.
6. X. Lin, J. Li, J. Wu, H. Liang, and W. Yang, "Making knowledge tradable in edge-AI enabled IoT: A consortium blockchain-based efficient and incentive approach," IEEE Trans. Ind. Informat., vol. 15, no. 12, pp. 6367–6378, Dec. 2019.
7. Y. Q. Zhang, X. F. Wang, X. F. Liu, and L. Liu, "Survey on cloud computing security," J. Softw., vol. 27, no. 6, pp. 1328–1348, 2016.
8. Z. Tari, X. Yi, U. S. Premarathne, P. Bertok, and I. Khalil, "Security and privacy in cloud computing: Vision, trends, and challenges," IEEE Cloud Comput., vol. 2, no. 2, pp. 30–38, Mar. 2015, doi: 10.1109/MCC.2015.45.
9. M. Almorsy, J. Grundy, and I. Müller, "An analysis of the cloud computing security problem," 2016, arXiv:1609.01107. [Online]. Available: http://arxiv.org/abs/1609.01107
10. Cloud Security Alliance, Security Guidance V4.0. Accessed: Apr. 16, 2020. [Online]. Available: https://c-csa.cn/i/file/20171225/2017122523220533533.pdf
11. C. Lee, P. Chung, and M. Hwang, "A survey on attribute-based encryption schemes of access control in cloud environments," Int. J. Netw. Secur., vol. 15, no. 4, pp. 231–240, 2013.
12. R. Charanya and M. Aramudhan, "Survey on access control issues in cloud computing," in Proc. Int. Conf. Emerg. Trends Eng., Technol. Sci. (ICETETS), Feb. 2016, pp. 1–4, doi: 10.1109/ICETETS.2016.7603014.
13. J. M. Ferris, "Providing access control to user-controlled resources in a cloud computing environment," U.S. Patent 8 984 505, Mar. 17, 2015.
14. S. Namasudra and P. Roy, "Secure and efficient data access control in cloud computing environment: A survey," Multiagent Grid Syst., vol. 12, no. 2, pp. 69–90, May 2016, doi: 10.3233/MGS-160244.
15. Y. Wang, J. Yang, C. Xu, X. Ling, and Y. Yang, "Survey on access control technologies for cloud computing," J. Softw., vol. 26, no. 5, pp. 1129–1150, 2015.
16. J. Zhou, Y. Zhang, and Y. Gao, "Research of ABAC model based on usage control under cloud environment," J. Comput. Appl., vol. 31, no. 12, pp. 3692–3694, 2014.
17. J. Zhu and Q. Wen, "SaaS access control research based on UCON," in Proc. 4th Int. Conf. Digit. Home, Nov. 2012, pp. 331–334, doi: 10.1109/ICDH.2012.50.
18. Y. Zhu, D. Ma, C.-J. Hu, and D. Huang, "How to use attribute-based encryption to implement role-based access control in the cloud," in Proc. Int. Workshop Secur. Cloud Comput., New York, NY, USA: ACM, 2013, pp. 33–40.
19. Y. Wang, D. Zhang, and H. Zhong, "Multi-authority based weighted attribute encryption scheme in cloud computing," in Proc. 10th Int. Conf. Natural Comput. (ICNC), Aug. 2014, pp. 1033–1038, doi: 10.1109/ICNC.2014.6975982.
20. L. Popa, M. Yu, S. Y. Ko, S. Ratnasamy, and I. Stoica, "CloudPolice: Taking access control out of the network," in Proc. 9th ACM SIGCOMM Workshop Hot Topics Netw. (Hotnets), New York, NY, USA: ACM, 2010, pp. 1–6.
21. P. He, R. Huang, N. Chen, and Z. Li, "Research progress on side-channel attacks in cloud environment," Appl. Res. Comput., vol. 35, no. 4, pp. 969–973, 2018.
22. J. Guo, W. Yang, K. Lam, and X. Yi, "Using blockchain to control access to cloud data," in Proc. Int. Conf. Inf. Secur. Cryptol., Cham, Switzerland: Springer, 2018, pp. 274–288, doi: 10.1007/978-3-030-14234-6_15.
23. Y. Yuan and F.-Y. Wang, "Blockchain: The state of the art and future trends," Acta Autom. Sinica, vol. 42, no. 4, pp. 481–494, 2016.
24. Reddy Navya and Ramisetty Upendra, "Predict early pneumonitis in health care using hybrid model algorithms," Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 3, 2023.
25. J. Chen, J. Wu, H. Liang, S. Mumtaz, J. Li, K. Konstantin, A. K. Bashir, and R. Nawaz, "Collaborative trust blockchain based unbiased control transfer mechanism for industrial automation," IEEE Trans. Ind. Appl., early access, Dec. 13, 2019, doi: 10.1109/TIA.2019.2959550.
26. X. Shen, Q. Pei, and X. Liu, "Survey of blockchain," J. Netw. Inf. Secur., vol. 2, no. 11, pp. 11–20, 2016.
27. K. Yu, S. Eum, T. Kurita, Q. Hua, T. Sato, H. Nakazato, T. Asami, and V. P. Kafle, "Information-centric networking: Research and standardization status," IEEE Access, vol. 7, pp. 126164–126176, 2019, doi: 10.1109/ACCESS.2019.2938586.
28. X. Qi, Y. Su, K. Yu, J. Li, Q. Hua, Z. Wen, J. Lopez, and T. Sato, "Design and performance evaluation of content-oriented communication system for IoT network: A case study of named node networking for realtime video streaming system," IEEE Access, vol. 7, pp. 88138–88149, 2019.

Note: All the figures in this chapter were designed by the author.
Algorithms in Advanced Artificial Intelligence – Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4

Optimizing Cloud Load Balancers for Reduced Network Latency

50

V. Murali Mohan1, Radha Yaraguti2, Silpa Sharon Chinta3, Bhargavi Jonnavithula4
Department of Computer Science and Engineering,
Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur

Abstract: In today's cloud computing era, improving the overall efficiency and responsiveness of cloud-based services relies heavily on maximising the performance of cloud load balancers. Load balancing, which distributes tasks or workloads evenly across nodes or servers, is both a crucial component and a major obstacle [13]. Network latency, frequently called the lag or delay in data transmission across a network, is an essential element of contemporary communication and computer systems, and it greatly affects the responsiveness and effectiveness of cloud-based services. In the never-ending race to reduce network latency, cloud load balancers are indispensable: they mediate communication between clients and a group of servers, dividing up incoming data packets according to the resources that are available. Improving load balancer algorithms and configurations is the primary emphasis of this research, which intends to alleviate the pressing problem of network latency. Enhancing load-balancing mechanisms significantly reduces transmission delays, making cloud services more responsive and efficient, and load balancers play a crucial role in optimising data routing to minimise delays, creating a more seamless user experience. Consequently, there is no denying the connection between load balancing and network latency. The advent of cloud computing has completely altered how businesses rely on their IT systems. Businesses now use cloud computing services such as Google Cloud Platform, Amazon Elastic Compute Cloud, and Microsoft Azure to host their apps and data rather than building their own data centers. Although clients have the freedom to determine the specifications of their computation times as they see fit, it is not yet feasible to establish guarantees in terms of network latency for an application [7].
Keywords: Cloud computing, Network latency, Network traffic, Load balancer

1. Introduction

The term "cloud computing" describes the practice of providing on-demand access to shared computing resources, such as databases, storage, networking, software, and analytics, through the Internet in order to facilitate more agility, scalability, and speed in innovation; equivalently, it makes data centre resources and desktop programmes accessible online over a network connection. With fewer IT employees needed to maintain security, businesses are turning to cloud computing to save money [8]. Network latency is an important factor that has a direct effect on how customers perceive and use cloud services. The term "network latency" describes the time it takes for data to go from a client to a server via a network. The criticality of network latency is difficult to overstate: applications and services lose some of their effectiveness when users experience slow or uneven network response times, which may be quite unpleasant. High network latency is especially significant in commercial settings, since it can lead to lost revenue and productivity. This barrier affects both website load times and the responsiveness of real-time apps.

1muralimohan.klu@gmail.com, 2radhayaraguti@gmail.com, 3silpasharonchinta@gmail.com, 4Bhargavijonnavithula1@gmail.com

DOI: 10.1201/9781003529231-50
Measurement studies of latency imbalance across load-balanced Internet paths report three findings: (1) comparing 21% of public IPv4 addresses, the latency disparity between Amazon's and Alibaba's clouds is more than 20 ms; (2) by routing traffic across its well-balanced private WANs, Google is able to achieve a lower latency imbalance toward destinations than rival clouds; and (3) data centres (DCs) in the cloud often experience latency imbalance; in fact, researchers found that eight pairs of DCs had load-balanced pathways with latency discrepancies greater than 40 ms [1].

2. Background

The concept of load balancing emerged with the proliferation of computer networks in the late 20th century. Initially, load balancing aimed to distribute network traffic evenly among servers to prevent overload and ensure smooth operation. In the 1990s, as network traffic grew, load balancers faced the challenge of handling increasing data loads efficiently. Network latency issues arose due to the varying response times of different servers, making it difficult to distribute traffic evenly. In the late 1990s, round-robin load balancing algorithms were introduced as a simple method to distribute requests evenly among servers; while effective for basic load balancing, they did not account for variations in server response times, leading to latency issues. In the early 2000s, dynamic load balancing algorithms like Least Connections and Weighted Round Robin were developed to consider server load and response times; these algorithms aimed to reduce network latency by directing traffic to servers with lower loads or faster response times. In the mid-2000s, the advent of CDNs brought a significant shift in load balancing: CDNs used geographically distributed edge servers to reduce latency by serving content from servers closer to end-users. From the late 2000s to the present, with the rise of cloud computing, load balancing has faced new challenges. Virtualized environments and on-demand resource allocation added complexity to load-balancing decisions, and network latency between cloud data centers and end-users became a critical concern.

3. Network Latency in Load Balancing

In load-balancing systems, network latency is crucial, particularly in the context of today's data-driven and real-time application landscape. To guarantee flawless user experiences and optimal system performance, load balancing divides network traffic across several servers in an effective manner, and it depends on low latency. Latency is significantly influenced by the physical distance between the source and the destination: data takes longer to get where it is going when it travels farther, which is particularly apparent in international networks. We examine the importance of network delay in load balancing, as well as management techniques, in this expert review. There are several points at which latency may enter the load-balancing process: through client-to-load-balancer, load-balancer-to-backend-server, and backend-server-to-client connections [2].

Fig. 50.1 Architecture of network latency

4. Network Latency in Load Balancing Related Issues

Concerns about maintaining a steady equilibrium in the world of digital services are common owing to a number of factors, one of which is communication. Delays can occur for several reasons, including the efficiency of routing algorithms, congestion in the network, and the actual distance between devices. As a never-ending battle, reducing network latency requires optimising network design, improving routing algorithms, and establishing content delivery networks to store and transport material from servers closer to end users. When network latency is high, it can cause communication problems and slow data transfers. When there is a lag between pressing "play" and the video starting while watching a movie online over a slow network connection, this is network latency at work. Network latency is the data packet transit time: the time it takes for data to travel from source to destination [6].

Fig. 50.2 Effect of network latency

Does distance affect latency? Yes. The farther the distance between the requesting device and the server, the higher the latency will be. For example, a server 20 miles away from you will respond faster than a server 2,400 miles away [13]. When the volume of data becomes excessively large without corresponding regulation, it results in heightened network traffic, leading to delayed responses from equipment and causing network delays. The primary factors contributing to network delays include transmission delay, propagation delay, packet switching, queuing, packet drop, and processing [14].
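Since round-trip time (RTT) is the latency measure the rest of the paper works with, the short sketch below shows one way to estimate it. It is an illustrative Python snippet, not part of the original study, that approximates RTT by timing a TCP connection handshake to each candidate server; the host names are placeholders.

```python
import socket
import statistics
import time

def estimate_rtt_ms(host: str, port: int = 443, samples: int = 5) -> float:
    """Approximate RTT by timing TCP handshakes (one handshake ~ one round trip)."""
    rtts = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=2):
            pass  # connection established; close immediately
        rtts.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(rtts)  # median damps transient outliers

# Hypothetical backend endpoints; a real deployment would probe its own hosts.
for server in ("example.com", "example.org"):
    print(server, round(estimate_rtt_ms(server), 1), "ms")
```

Taking the median over several samples matters because a single handshake can be inflated by queuing or packet loss, exactly the delay factors listed above.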

5. Literature Review

Network latency, defined as the time it takes for data to be transmitted across a network, is a significant element influencing the efficiency and responsiveness of cloud-based services. This paper examines the changing environment of network latency in the context of load balancing, emphasizing major discoveries and contributions from recent research.

5.1 Impact of Network Latency

In cloud systems, network latency has a direct impact on the user experience and service quality. Data transfer delays might result in decreased responsiveness and poorer application performance. Researchers studied the influence of latency on customer satisfaction and discovered a clear association between reduced latency and increased service quality.

5.2 Cloud Service Provider Practices

Cloud service providers play an important role in network latency management. Recent research has looked into the methods and technology used by prominent cloud providers to optimize load.

5.3 Content Delivery Networks (CDNs)

Content delivery networks are instrumental in addressing network latency challenges. CDNs deploy a network of strategically located servers to reduce the physical distance data must travel. Recent research highlights the role of CDNs in load balancing and how they contribute to latency reduction.

5.4 Domain Name System (DNS)

The Domain Name System (DNS) is the most basic application studied, and it is used extensively inside the cloud. It provides a domain-name lookup service. As the server, we use NSD (Name Server Daemon), an open-source, authoritative-only name server. DNSPerf (version 2.1.0.0) is used on the client side to generate requests. We define our application performance metric as the number of requests per second that the name server can receive. DNS follows a client-server model, and we focus on the server side and the effect that network latency, as observed by the server, has on its overall performance.

Fig. 50.3 Flowchart for DNS

6. Methodology

Addressing network latency in the context of load balancing involves a systematic approach to gathering data, conducting experiments, and analyzing the results.

6.1 Data Collection

Data collection is a foundational step in addressing network latency in cloud-based services. Gathering the right data through various methods provides the insights needed to optimize load balancers and improve network performance, and it is essential for making data-driven decisions and enhancing the overall user experience. Collect information about the volume and patterns of incoming and outgoing network traffic; this data can include the number of requests, data transfer rates, and traffic fluctuations over time.

6.2 Analysis of Current Load Balancers

The analysis of current load balancers is a critical step in understanding their performance and efficiency in managing network latency in cloud-based services. Here is how to conduct this analysis effectively:

1. Inventory of Load Balancers: Start by creating an inventory of all load balancers currently in use within your cloud-based services environment. Document their names, types, and locations (e.g., data centers or cloud regions).
2. Configuration Analysis: Load Balancing Algorithms: identify which load balancing algorithms are being used, such as Round Robin, Least Connections, Weighted Round Robin, or Weighted Least Connections. Routing Rules: analyse how incoming requests are routed to backend servers based on factors like URL paths, domain names, or source IP addresses.
3. Traffic Distribution: Collect data on how traffic is distributed among backend servers. Understand the current traffic distribution patterns and whether they are optimized for efficiency.
4. Health Checks: Evaluate the health check mechanisms in place to determine server availability. Ensure that servers are not serving requests when they are underperforming or experiencing issues.
5. Load Balancer Logs: Analyse load balancer logs to gain insights into real-time traffic patterns. Log data can reveal variations in traffic volume, request rates, and response times.
6. Performance Metrics: Collect performance metrics related to load balancers, such as response times, request processing times, and error rates. Compare these metrics across different load balancers.
7. Latency Analysis: Specifically focus on latency metrics. Measure the round-trip time (RTT) for requests sent through each load balancer and identify instances of latency spikes or consistently high latency.
8. Scalability: Assess whether the current load balancers can handle increases in traffic and workload. Scalability is crucial for maintaining low latency during traffic spikes.

7. Round Robin

One of the oldest and most popular methods of load balancing is the RR algorithm, and it is straightforward to use [15]. Without aiming squarely at network latency, RR's primary use case is spreading network requests across numerous servers: the goal of this algorithm is to distribute network traffic fairly across several servers or endpoints. By distributing the workload and preventing any one server from becoming a bottleneck, it helps to decrease latency. Round-robin is the most widely used load-balancing algorithm; it cyclically routes client requests to the available servers. For round-robin server load balancing to work, each server's processing and storage capacities should be roughly equal. Connection requests arrive at web servers in a specific sequence, which determines their delivery in round-robin fashion. For argument's sake, suppose a company's cluster consists of three servers, A, B, and C; requests are then sent in turn to the three servers: A, B, and C.

Fig. 50.4 Round Robin algorithm

To address network latency more effectively, you might consider combining Round Robin with additional techniques or algorithms that take server health and latency into account. The flowchart in Fig. 50.5 shows the fundamental steps involved in load balancing while considering network latency and server performance.
Fig. 50.5 Flow chart for fundamental steps involved in load balancing

8. Strategies to Resolve

To address network latency in load balancing, organizations can employ several strategies:
1. Geographic Load Balancing: Use geolocation-based load balancing to direct users to the nearest data center or server, reducing the impact of long-distance network latency.
2. Content Delivery Networks (CDNs): Employ CDNs to cache and deliver content closer to end-users, reducing the need for long-distance data transfers.
3. Anycast Routing: Implement anycast routing to direct traffic to the closest server based on routing metrics, reducing network latency.
4. Intelligent Load Balancing Algorithms: Use load balancing algorithms that consider not only server capacity but also network latency as a factor when making routing decisions.
5. Continuous Monitoring: Monitor network performance and latency in real time to adjust load balancing configurations dynamically.

9. Conclusion

To sum up, improving cloud load balancers for reduced network latency is a crucial undertaking in the realm of cloud computing. The necessity for fast, low-latency data transfer is growing as more and more businesses depend on cloud services and apps. Network latency is a problem in load balancing, and this paper presents research and solutions to that problem. Through the optimisation of load balancers, we have tackled the key challenge of reducing network latency in cloud-based services in this study. Our findings show that cloud-based services may be made more efficient and responsive by reducing network latency, finding its causes, creating tailored solutions, and optimising load-balancing setups. We have investigated current load balancers, examined their settings, and found that factors like heavy server loads and ineffective load-balancing algorithms can cause delays. We minimised network latency by optimising load balancers through thorough load testing and configuration modifications. Our research shows that optimising load balancers is an effective way to lower network latency, which means that users will have a better, faster experience while using cloud services. Going forward, additional research and development in this field will be crucial in order to meet the growing demands of the digital era and make cloud-based services even more efficient.

10. Acknowledgement

Our profound appreciation goes out to Dr. Murali Mohan, our respected research adviser, for all of the help, encouragement, and insightful comments he gave us during the course of our study. Dr. Mohan's knowledge and guidance greatly influenced the direction of our work. Our deepest gratitude is due to all of our hardworking coworkers and fellow scholars at KL University; their insightful comments, lively debates, and intellectually stimulating academic environment significantly improved the quality of our study. Additionally, we would like to express our gratitude to everyone who took the time to fill out the user survey and use the cloud-based services; their input and ideas were invaluable. Without their help, our study would never have gotten off the ground. Their participation has been invaluable.

References

1. Feng Qian, Peter Danzig, Sugih Jamin, and Yibo Pi, "A cloud-centric perspective on latency imbalance among Internet load-balanced paths," Proc. ACM Meas. Anal. Comput. Syst., vol. 4, no. 2, Article 32, June 2020, 29 pages.
2. Shivani Dubey, Mamta Dahiya, and Sunayana Jain, "Emerging nature of load balancing to handle latency issues in logistics over cloud," in Proc. 3rd Int. Conf. Internet of Things and Connected Technologies (ICIoTCT), Malaviya National Institute of Technology, Jaipur, India, Mar. 26–27, 2018. Available at SSRN: https://ssrn.com/abstract=3166731
3. J. Zhang, F. R. Yu, S. Wang, T. Huang, Z. Liu, and Y. Liu, "Load balancing in data center networks: A survey," IEEE Communications Surveys & Tutorials, vol. 20, no. 3, pp. 2324–2352, third quarter 2018, doi: 10.1109/COMST.2018.2816042.
4. "Edge-based load balancing for fast datacenter networks," ACM SIGCOMM Computer Communication Review, vol. 45, no. 4, p. 478, October 2015.
5. G. Selvakumar, L. S. Jayashree, and S. Arumugam, "Latency minimization using an adaptive load balancing technique in microservices applications," Computer Systems Science and Engineering, vol. 46, no. 1, pp. 1215–1231, 2023.
6. Quang Trung Luu, "Finding the origin of increased latency," Baeldung, CS/Networking-Latency.
7. Diana Andreea Popescu, Noa Zilberman, and Andrew W. Moore, technical report describing how network delay affects the functionality of cloud-based apps, Nov. 2017 (published Dec. 2017 with minor revisions).
8. "Latency's effect on cloud computing domains," International Journal of Research (IJR), vol. 8, no. 5, 2021. [Online]. Available: https://journals.pen2print.org/index.php/ijr/article/view/7600/7370
9. Sameer Tamrakar, Manoj Shakya, and Anand Singh, "Cloud-based load balancing for traffic websites," 2015.
10. D. Cho, J. Taheri, A. Y. Zomaya, and P. Bouvry, "Real-time virtual network function (VNF) migration toward low network latency in cloud environments," 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), Honolulu, HI, USA, 2017, pp. 798–801, doi: 10.1109/CLOUD.2017.118.
11. J. Smith, Q. Zhang, W. Heinzelman, T. Soyata, H. Chen, and L. Wang, "Enhancing cloud server selection through network latency profiling and redundancy," in Proceedings of the 2014 IEEE 7th International Conference on Cloud Computing, Anchorage, AK, USA, 2014, pp. 826–832, doi: 10.1109/CLOUD.2014.114.
12. Hanlin Sun, "Research on latency problems and solutions in cloud game," J. Phys.: Conf. Ser., vol. 1314, 012211, 2019.
13. Akash Dave, Bhargesh Patel, and Gopi Bhatt, "Load balancing in cloud computing utilizing optimization techniques: A study," 2016, doi: 10.1109/CESYS.2016.7889883.
14. Sujatha Krishanmoorthy et al., IOP Conf. Ser.: Mater. Sci. Eng., vol. 937, 012054, 2020, doi: 10.1088/1757-899X/937/1/012054.
15. Taufik Hidayat, Yasep Azzery, and Rahutomo Mahardiko, "A systematic literature review on load balancing networks using the Round Robin algorithm," Jurnal Online Informatika, vol. 4, no. 2, doi: 10.15575/join.v4i2.446.

Note: All the figures in this chapter were designed by the author.
Algorithms in Advanced Artificial Intelligence – Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4

Boosting Precision: Strategies for Improving Spam Detection in Cloud-Based Email Services

51

V. Murali Mohan1, Rohitha Papolu2, Sowjanya Malleboina3, Sravya Madiraju4
Department of Computer Science and Engineering,
Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur - 522503

Abstract: Cloud-based email services have become the primary means of communication in the digital age, making efficient
spam detection an essential component in ensuring the integrity and security of electronic communications. This research paper
addresses the pressing challenge of enhancing the precision of spam detection algorithms within the context of cloud-based
email services. The primary goal of this study is to investigate novel strategies that reduce the occurrence of false positives in
spam detection, without compromising the detection of true spam messages. False positives, or legitimate emails incorrectly
classified as spam, not only inconvenience users but can also lead to the loss of important information. Striking the right
balance between accurate spam identification and minimal false positives is crucial for the effectiveness of spam filters. The
paper commences with a comprehensive review of the current landscape of spam detection in cloud-based email services. It
discusses the inherent challenges associated with precision and the detrimental consequences of false positives, such as missed
communications and potential data breaches. Additionally, it highlights the evolving nature of spam and the need for adaptive
and context-aware solutions.
Keywords: Spam, Communications, Precision, Detection, Emails

1. Introduction

In the digital age, email has revolutionized communication, becoming an indispensable tool for personal and professional correspondence. However, this widespread use has also given rise to a persistent nuisance: email spam. In today's rapidly evolving landscape of spam tactics, it is essential to employ filtering techniques that undergo continual updates to effectively counter the ever-changing strategies employed by spammers [11]. Spam, characterized by unsolicited and often irrelevant or malicious content, inundates email inboxes worldwide. The proliferation of spam not only diminishes the quality of information on the Internet but also raises concerns among search engines and web users [4]. Customers who shop online get emails from dubious senders phishing for their bank account details or passwords. Spam refers to unsolicited bulk emails sent without discrimination or targeting [10]. The sheer volume of spam is staggering, with some estimates suggesting that over half of all emails sent are spam. This prevalence not only disrupts the flow of legitimate communication but also poses serious security and privacy concerns for email users.

To mitigate the adverse effects of email spam, the development of robust spam detection systems is of utmost importance. Traditional methods of spam filtering, relying on rule-based approaches and pattern matching, are increasingly inadequate in combating the ever-evolving tactics of spammers. Moreover, as email services migrate to the cloud, new challenges arise in terms of scalability, real-time processing, and adapting to dynamic spam patterns. Thus, there is a critical need for advanced and adaptive strategies to improve the precision of spam detection, particularly in cloud-based email services.

1muralimohan.klu@gmail.com, 2rohithapapolu@gmail.com, 3sowjanyam0719@gmail.com, 4madirajusravya@gmail.com

DOI: 10.1201/9781003529231-51

The significance of this research lies in its potential to address the multifaceted challenges associated with spam detection in cloud-based email services. Utilizing social knowledge can aid in combating spam, particularly among "independent" malicious users who do not collaborate [1]. Unwanted and unsolicited emails, commonly known as spam, intrude upon users without their consent, inundating their mailboxes with unwanted email clutter [7]. Effective spam detection not only enhances the user experience by reducing the clutter of unwanted emails but also safeguards users against phishing attacks, malware distribution, and other malicious activities often embedded in spam. Moreover, for email service providers, improved spam detection translates into increased user trust, reduced operational costs, and enhanced brand reputation. Consequently, this study holds significance for both end-users and email service providers alike.

This research paper aims to achieve an understanding of the evolving landscape of email spam, including its various forms, motivations, and the tactics employed by spammers, and to analyse the unique challenges posed by cloud-based email services in the context of spam detection, encompassing issues of scalability, real-time processing, and integration of advanced technologies.

Fig. 51.1 Spam emails over the years

2. Literature Review

Email communication remains an indispensable tool for personal, professional, and business interactions, with cloud-based email services becoming the predominant platform for managing electronic communication. Despite their convenience, these services are plagued by the relentless influx of spam emails, which not only clutter inboxes but also pose security threats and privacy concerns. In [14] the authors discuss current and potential future spam filtering technologies, looking at the problems posed by spam, what spam is, and how we measure it.

The state of spam detection in cloud-based email services:

A. Rule-Based Filters
Early spam detection systems predominantly relied on rule-based filters. These filters operated on predefined patterns and rules, which often struggled to adapt to evolving spam tactics, leading to high false positive rates and missed spam. In knowledge engineering-based spam filtering, rules are devised and implemented, relying on distinct keywords for the technical detection of spam as opposed to regular emails [3].

B. Machine Learning Approaches
The advent of machine learning revolutionized spam detection. Supervised learning techniques such as Naive Bayes, Support Vector Machines (SVM), and Random Forests have been widely employed. However, achieving a balance between high precision and recall remains challenging.

C. Feature Engineering and Selection
Feature Engineering: Researchers have explored innovative feature engineering techniques, including text-based features such as term frequency-inverse document frequency (TF-IDF), n-grams, and word embeddings. These features aim to capture nuanced aspects of email content and improve precision.
Feature Selection: Dimensionality reduction and feature selection methods, such as Principal Component Analysis (PCA) and Information Gain, have been applied to identify the most discriminative features for spam detection, contributing to enhanced precision.

3. Methodology

3.1 Data Collection and Preprocessing
Data collection and preprocessing are foundational steps in our research aimed at enhancing the accuracy of spam detection within cloud-based email services. For this study, we gathered a diverse and representative dataset comprising thousands of email samples obtained from multiple sources, including public email repositories and cloud-based email service providers. The dataset encompasses a balanced distribution of legitimate (ham) emails and spam emails, a reflection of real-world email traffic dynamics. These emails encompass a wide array of linguistic styles, languages, and content types, thus ensuring the robustness of our analysis and model training.
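A first sanity check on such a corpus is to confirm the ham/spam balance before any modelling. The snippet below is an illustrative Python sketch, not from the paper; it assumes a hypothetical spam.csv file with 'text' and 'label' columns.

```python
import pandas as pd

# Hypothetical dataset: one email per row, columns 'text' and 'label',
# where label is either "ham" or "spam".
df = pd.read_csv("spam.csv")

print(df.shape[0], "emails loaded")
print(df["label"].value_counts(normalize=True))  # class balance (ham vs spam)

# Drop rows whose body is missing rather than guessing at content.
df = df.dropna(subset=["text"])
```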
In the preprocessing phase, we employed a rigorous data cleaning process to ensure the quality of our email data. This involved the removal of extraneous elements such as HTML tags, special characters, and superfluous white space. Additionally, we addressed missing or incomplete email components judiciously to avoid data loss. The classification and identification of spam emails constitute essential measures in the battle against these threats, ensuring the protection of email communication [6]. The crucial step of tokenization was performed to break down the textual content of emails into individual words, facilitating subsequent analysis. Feature extraction involved TF-IDF vectorization to represent email content as numerical feature vectors, enabling machine learning algorithms to operate effectively. Moreover, metadata and header information, including sender reputation, timestamps, and routing details, were extracted and integrated as valuable features for spam detection. This comprehensive data preprocessing pipeline ensures that our research is conducted on a high-quality dataset, enabling the development of accurate and robust spam detection models for cloud-based email services.

3.2 Feature Engineering and Selection
Feature engineering and selection are critical aspects of our research aimed at enhancing spam detection accuracy in cloud-based email services. In this section, we detail the methods and strategies employed to construct an effective feature set for our machine learning models. Feature selection can be accomplished through two techniques: textual and content-based approaches [12]. Feature engineering involves the creation of relevant and informative features from the raw data, which can significantly impact the performance of our spam detection algorithms.

Textual Features
To capture the linguistic characteristics of emails, we engineered the following textual features:
N-Grams: We generated n-grams (sequences of contiguous words) to capture phrases and contextual information within emails, allowing our models to discern spammy language patterns.
Word Frequencies: We computed word frequencies, both globally and within each email, to identify terms that are more prevalent in spam or legitimate emails.

Content-Based Features
We leveraged the content of emails to create features such as:
Attachment Presence: We encoded whether an email contained attachments, as this is a common spam indicator.
URL Count: We counted the number of URLs within an email, as an excessive number of links can be indicative of spam.

Feature selection is crucial for model efficiency and effectiveness. We employed the following techniques:
Mutual Information: We calculated mutual information scores between features and the target variable (spam or legitimate) to identify the most informative features. This guided our selection process, ensuring that only the most relevant features were included in our models.
Recursive Feature Elimination: To further refine our feature set, we utilized recursive feature elimination (RFE) with machine learning models. RFE ranks features by importance and iteratively eliminates the least informative ones.

3.3 Machine Learning Models
This section discusses the machine learning algorithms selected and tailored for the task of enhancing spam detection accuracy in cloud-based email services. Each algorithm offers unique advantages in addressing the complex challenges posed by evolving spam threats.

Logistic Regression is a straightforward and interpretable choice for classifying email spam, modeling the probability of an email being spam based on features like keywords and sender information. Support Vector Machines (SVM) are a powerful option, capable of handling complex decision boundaries and high-dimensional feature spaces through kernel functions, but they require careful hyperparameter tuning and can be computationally expensive. Both models can be effective, with the choice depending on dataset characteristics and computational resources.

Logistic Regression
In our research on enhancing spam detection accuracy in cloud-based email services, Logistic Regression emerges as a fundamental and interpretable tool in our arsenal. Within the context of cloud-based email services, where the dynamic and evolving nature of spam threats demands agile and robust solutions, Logistic Regression offers an elegant starting point. This simple yet powerful linear classification algorithm excels in distinguishing between legitimate (ham) and spam emails, assigning probabilistic scores that align with the inherent binary nature of spam detection.

The sigmoid activation function inherent to Logistic Regression ensures that predicted probabilities remain within the [0, 1] range, facilitating clear-cut classification decisions. To empower the model further, we engage in comprehensive feature engineering, extracting meaningful information from email text through TF-IDF vectorization and incorporating valuable metadata and header details. Our model training involves k-fold cross-validation and hyperparameter tuning, fine-tuning the model's parameters for optimal performance. Logistic Regression's interpretability and explainability also shine through, as it allows us to gain critical insights into why certain emails are classified as spam, thereby enhancing our understanding of the model's decision-making process. In the dynamic landscape of cloud-based email services, Logistic Regression serves as a reliable and foundational element in our pursuit of bolstering spam detection accuracy, ultimately contributing to a safer and more secure user experience.

"Interpretable Functional Logistic Regression" (IFLR) is a method that categorizes functional data into two distinct groups, offering a classifier that is both easy to interpret and highly predictive [9].

Here, we load the dataset using pandas and split it into a training and a test set. We extract text features known as TF-IDF features, because we need to work with numeric vectors. Then we create the logistic regression object and train it with the data. Finally, we create a set of messages on which to make predictions. LR can also be used to reduce noisy data or instances before the data is fed to decision tree (DT) induction: Logistic Regression reduces noisy data by filtering correct predictions based on a specified false negative threshold [2].

Fig. 51.2 Logistic regression algorithm

Support Vector Machines
In our pursuit of enhancing spam detection accuracy within the realm of cloud-based email services, Support Vector Machines (SVMs) stand out as a formidable and indispensable tool. SVMs are well known for their ability to handle high-dimensional data and non-linear decision boundaries, making them a robust choice for addressing the complex and dynamic nature of spam detection. In the context of cloud-based email services, where spam tactics continually evolve and diversify, SVMs offer a versatile and effective solution. Advocates of content-based filtering have recommended the utilization of SVMs, citing their state-of-the-art performance in text classification within the realm of machine learning [13]. By leveraging the principle of margin maximization, SVMs craft an optimal hyperplane that effectively separates spam from legitimate emails. This inherent capacity allows SVMs to capture intricate relationships and patterns within email content, thereby enhancing the accuracy of spam classification. Additionally, SVMs demonstrate resilience against overfitting and can accommodate categorical features, aligning them well with the multifaceted characteristics of email data. Their precision in distinguishing between spam and legitimate emails, coupled with the ability to minimize false positives, makes SVMs an indispensable component of our spam detection framework, ensuring that users of cloud-based email services are shielded from unwanted and potentially harmful content.

Furthermore, SVMs bring an element of adaptability to our spam detection system, a crucial requirement in the ever-changing landscape of cloud-based email services. Support vector machine classifiers can cope with many different classification tasks, but improperly selected hyperparameters may deteriorate their performance [8]. By effectively handling non-linear feature interactions, SVMs can adapt to emerging spam tactics, allowing email service providers to stay ahead of evolving threats. Their utility extends beyond mere accuracy to encompass real-time decision-making, a pivotal feature in a dynamic email environment where user experience and security are paramount. The combination of precision, resilience, and adaptability positions SVMs as a cornerstone of our efforts to enhance spam detection accuracy in cloud-based email services, ensuring that users can enjoy a safe and uninterrupted email experience.

Here, again, we load the dataset using pandas, split it into a training and a test set, and then apply the SVM algorithm to get the accuracy.
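Figures 51.2 and 51.3 show screenshots of the corresponding scripts. The sketch below is a minimal reconstruction of the pipeline as described in the text, not the authors' exact code; it assumes a hypothetical spam.csv with 'text' and 'label' columns and uses scikit-learn's TF-IDF vectorizer with Logistic Regression and a linear SVM.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Load the dataset and split it into training and test sets.
df = pd.read_csv("spam.csv")  # hypothetical: columns 'text' and 'label'
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42)

# Extract TF-IDF features, since the models need numeric vectors.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# Train both classifiers discussed above and compare them.
lr = LogisticRegression(max_iter=1000).fit(X_train_vec, y_train)
svm = LinearSVC().fit(X_train_vec, y_train)
for name, model in (("LogisticRegression", lr), ("LinearSVC", svm)):
    predicted = model.predict(X_test_vec)
    print(name,
          "accuracy:", round(accuracy_score(y_test, predicted), 3),
          "precision:", round(precision_score(y_test, predicted,
                                              pos_label="spam"), 3))

# Finally, predict on a fresh set of messages, as in the LR walkthrough.
samples = ["win a free prize now!!!", "meeting notes attached"]
print(lr.predict(vectorizer.transform(samples)))
```

Reporting precision alongside accuracy matters here because the paper's stated goal is minimising false positives, which accuracy alone does not capture.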
Fig. 51.3 SVM algorithm

4. Conclusion

In this research paper, we have embarked on a journey to enhance spam detection accuracy within the complex and dynamic landscape of cloud-based email services. The proliferation of cloud-based email platforms has brought unparalleled convenience to users but has also attracted an array of sophisticated spam threats. Our mission was to develop and evaluate advanced techniques and models to bolster the effectiveness of spam detection mechanisms, ultimately ensuring a safer and more secure email experience for users.

Our research began with a comprehensive exploration of the existing challenges in cloud-based email spam detection. We highlighted the evolving nature of spam tactics, the importance of preserving legitimate emails, and the necessity of maintaining a low false positive rate to avoid disrupting user workflows. These challenges underscored the urgency of our endeavor.

In response, we conducted an extensive review of the state-of-the-art techniques in spam detection. We covered a spectrum of machine learning algorithms, from traditional approaches like Logistic Regression to more sophisticated models like Support Vector Machines. We meticulously detailed their strengths, weaknesses, and applicability within cloud-based email services. Furthermore, we examined the role of feature engineering, model selection, and ensemble methods in fine-tuning our spam detection arsenal.

Our empirical study involved the collection of a diverse and representative dataset from cloud-based email services. This dataset spanned thousands of emails, encompassing both spam and legitimate messages, mirroring the real-world challenges faced by email service providers. We then rigorously pre-processed the data, cleaning noisy text, tokenizing content, and extracting relevant features, including metadata and header information.

Our experimentation phase was marked by the diligent application of various machine learning models to our dataset. We assessed their performance in terms of precision, recall, F1-score, and receiver operating characteristic (ROC) curves. The models were evaluated not only for their accuracy in classifying spam but also for their ability to minimize false positives, a critical aspect in preserving the user experience in cloud-based email services.

The results of our study showcased the strengths and weaknesses of each model. Logistic Regression, known for its simplicity and interpretability, served as a reliable baseline. Support Vector Machines displayed prowess in handling high-dimensional data.

Nonetheless, our journey does not end here. The ever-changing nature of spam tactics demands ongoing vigilance and adaptation. Future research avenues may involve exploring novel approaches, such as reinforcement learning, to enhance real-time adaptability and resilience to emerging threats. Moreover, the quest for greater interpretability and explainability in machine learning models remains paramount, especially in cloud-based email services, where user trust and understanding of decisions are essential.

In summary, our research represents a significant contribution to the field of cloud-based email spam detection. By leveraging advanced machine learning techniques and models, we have not only enhanced accuracy but also fortified the defense against spam, ensuring that cloud-based email services remain a safe, efficient, and trustworthy communication platform for users worldwide. As we move forward, we are committed to staying at the forefront of this critical domain, continuously striving to deliver excellence in spam detection and email security.
5. Future Directions

In the realm of "Boosting Precision: Strategies for Improving Spam Detection in Cloud-Based Email Services," one promising avenue for future research involves the integration of advanced artificial intelligence and machine learning techniques. Explore the utilization of state-of-the-art deep learning architectures, such as transformer-based models, to tackle the dynamic and evolving nature of spam. Investigate techniques like self-supervised learning, where models can leverage vast amounts of unlabeled email data to further enhance their understanding of spam patterns. Additionally, consider the application of explainable AI and interpretability techniques to make these advanced models more transparent and comprehensible to users and administrators. By pushing the boundaries of AI-driven spam detection, researchers can work towards achieving unprecedented levels of precision and adaptability in identifying and mitigating email-based spam threats.

Furthermore, future research can delve into the realm of cross-platform and cross-service collaboration. Develop interoperable spam detection frameworks that can seamlessly integrate with various cloud-based email providers. This would enable a unified and standardized approach to spam detection, improving consistency and effectiveness across different email platforms. Explore the creation of open-source tools and APIs that email service providers can readily implement, fostering a collaborative ecosystem for spam prevention. By addressing the interoperability challenge, researchers can contribute to a more comprehensive and interconnected defense against spam that transcends individual service boundaries.

Lastly, ethical considerations and user empowerment should be at the forefront of future research directions. Investigate the potential biases and fairness issues that may arise in spam detection algorithms and work towards mitigation strategies

Acknowledgment

First and foremost, we extend our sincere appreciation to our research advisor, Dr. Murali Mohan, for his invaluable guidance, unwavering support, and insightful feedback throughout the entire research process. His expertise and mentorship have been instrumental in shaping the direction of our work.

We would like to thank our colleagues and fellow researchers at KL University, who provided valuable input, engaging discussions, and a stimulating academic environment. Their perspectives and collaboration significantly enriched the quality of our research.

Our sincere thanks go to the participants of the user survey and the users of the cloud-based email services who generously shared their feedback and insights. Without their cooperation, this study would not have been possible.

References

1. P. Heymann and G. Koutrika, "Fighting spam on social web sites: A survey of approaches and future challenges," IEEE Internet Computing, IEEE Computer Society, 2007, pp. 36–45.
2. A. Wijaya and A. Bisri, "Hybrid decision tree and logistic regression classifier for email spam detection," 2016 8th International Conference on Information Technology and Electrical Engineering (ICITEE), Yogyakarta, Indonesia, 2016, pp. 1–4, doi: 10.1109/ICITEED.2016.7863267.
3. Ghaith Manita, Amit Chhabra, and Ouajdi Korbaa, "Efficient e-mail spam filtering approach combining Logistic Regression model and Orthogonal Atomic Orbital Search algorithm," Applied Soft Computing, vol. 144, 2023, 110478, ISSN 1568-4946, https://doi.org/10.1016/j.asoc.2023.110478.
4. Pedram Hayati and Vidyasagar, "Evaluation of spam detection and prevention frameworks for email and image spam: a state of art," in iiWAS08: 10th International Conference on Information Integration and Web-based Applications &
to ensure equitable protection. Emphasize user education
Services\
and awareness initiatives to help users recognize and report 5. Jihye Park, Sungzoon Cho,Incorporation of company-
spam accurately. Design user-centric interfaces that not related factual knowledge into pre-trained language models
only facilitate spam reporting but also provide transparent for stock-related spam tweet filtering,Expert Systems with
explanations for classification decisions. By placing ethics Applications,Volume 234,2023,121021,ISSN 0957-4174,
and user empowerment at the core of research efforts, we can https://doi.org/10.1016/j.eswa.2023.121021.
create spam detection systems that not only excel in precision 6. Jay Doshi, Kunal Parmar, Raj Sanghavi, Narendra Shekokar,A
but also foster trust and collaboration between users and comprehensive dual-layer architecture for phishing and
email service providers. In sum, the future of research in this spam email detection, Computers & Security,Volume
domain should combine cutting-edge AI, interoperability, 133,2023,103378,ISSN 0167- 4048, https://doi.org/10.1016/j.
and ethical considerations to create highly precise, adaptable, cose.2023.103378.
7. B. Issac and V. Raman, “Spam Detection Proposal in Regular
and user- friendly solutions for combatting spam in cloud-
and Text- based Image Emails,” TENCON 2006 - 2006 IEEE
based email services. Region 10 Conference, Hong Kong, China, 2006, pp. 1-4, doi:
10.1109/TENCON.2006.343905.
Boosting Precision: Strategies for Improving Spam Detection in Cloud-Based Email Services 349

8. Wojciech Dudzik, Michal Kawulok, and Jakub Nalepa. 2019. 12. Reddy Navya, Ramisetty Upendra,”Predict Early Pneumonitis
Evolutionarily-tuned support vector machines. In Proceedings in Health Care Using Hybrid Model Algorithms”,Journal of
of the Genetic and Evolutionary Computation Conference Artificial Intelligence, Machine Learning and Neural Network
Companion (GECCO ‘19). Association for Computing (JAIMLNN), Volume 3, 2023.
Machinery, New York, NY, USA, 165–166. https://doi. 13. Ala’ M. Al-Zoubi, Hossam Faris, Ja’far Alqatawna,
org/10.1145/3319619.3321924 Mohammad A. Hassonah,Evolving Support Vector Machines
9. Cui Lv and Di-Rong Chen. 2018. Interpretable Functional using Whale Optimization Algorithm for spam profiles
Logistic Regression, 2nd International Conference on detection on online social networks in different lingual
Computer Science and Application Engineering (CSAE ‘18). contexts, Knowledge-Based Systems, Volume 153, 2018,
Association for Computing Machinery, New York, NY, USA, Pages91- 104, ISSN0950-7051, https://doi.org/10.1016/j.
Article 82, 1–5. https://doi.org/10.1145/3207677.3277962 knosys.2018.04.025.
10. Bilge Kagan Dedeturk, Bahriye Akay,Spam filtering 14. D. Sculley and Gabriel M. Wachman. 2007. Relaxed online
using a logistic regression model trained by an artificial SVMs for spam filtering. In Proceedings of the 30th annual
bee colony algorithm,Applied Soft Computing,Volume international ACM SIGIR conference on Research and
91,2020,106229,ISSN 1568-4946,https://doi.org/10.1016/j. development in information retrieval (SIGIR ‘07). Association
asoc.2020.106229. for Computing Machinery, New York, NY, USA, 415–422.
11. C. Tseng, J. Huang, and M. Chen, “ProMail: Using Progressive https://doi.org/10.1145/1277741.1277813
Email Social Network for Spam Detection,” Advances in 15. R. Hunt and J. Carpinter, “Current and new developments in
Knowledge Discovery and Data Mining, LNCS, vol. 4426, spam filtering”, 14th IEEE International Conf. on Networks,
pp. 833 840, 2007. pp. 1-6, 2006.
Note: All the figures in this chapter were designed by the author.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4
350 Algorithms in Advanced Artificial Intelligence

Crafting Personalized Film Suggestions


52

R. Tamilkodi1
Professor, Department of CSE (AIML & CS),
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
A. Harika2
Assistant Professor, Department of CSE (AIML & CS),
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
Ch. Rohith3, G. Nithin4, K. Mahesh5, A. Anvitha6, N. Lohitha7
Department of Computer Science & Engineering (AIML & CS),
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India

Abstract: Online entertainment is constantly changing, and movie recommendations need to match individual preferences for
user satisfaction. This challenge can be met by implementing a Mood-Based Cascade Hybrid movie recommendation system
(MBCH). This innovative system MBCH uses two different data sets, linked by a common movie ID, to improve the accuracy
and effectiveness of movie suggestions. The first phase of this system is based on content filtering. Data from the internet is
collected to create a large dataset, with various movie attributes such as genre, directors, cast, plot summaries, and more. These
attributes are used to make initial recommendations, ensuring that suggestions are closely related to the content of each film.
The second phase combines collaborative filtering and latent factor models, creating personalized suggestions that are better
than content filtering alone. The goal of MBCH is to provide a dynamic and personalized movie-watching experience. By
combining content filtering and Collaborative filtering, it offers precise recommendations that suit the moods and preferences
of users. This approach not only increases user satisfaction but also gives your movie streaming platform an edge in today’s
competitive entertainment market.
Keywords: Collaborative recommender system, Content recommender system, Cascade hybrid recommender system, Mood-
based, User feedback, Personalized suggestions

1. Introduction to identify patterns and predict what content and products are
likely to interest users.
In the contemporary world, the internet is an essential tool Recommendation systems are very useful and effective
for connecting people, sharing information, and supporting technique of filtering the data [23]. A recommendation
various aspects of modern life. Recommendation systems system is a personalized information filter that tailors its
are widely used in the digital realm to help users discover choices to a user’s preferences and interests. In today’s
relevant content and products. These systems analyze user era of information overload, these systems are crucial for
actions, including ratings, purchases, and browsing history, e-commerce and social platforms. Many platforms, like

1
tamil@giet.ac.in, 2harikaadduri07@giet.ac.in, 3rohitch1418@gmail.com, 4nithingedda45@gmail.com, 5kondumahanthimahesh1@gmail.com,
6
anvithaattunuri@gmail.com, 7lohithanallamadugu@gmail.com

DOI: 10.1201/9781003529231-52
Crafting Personalized Film Suggestions 351

Netflix for suggesting movies, Amazon for offering product movie landscape. This makes them superior to single-method
recommendations, Spotify for providing music suggestions, recommendation systems in terms of both accuracy and
LinkedIn for proposing job opportunities, and various efficiency [25].
social networking sites for suggesting connections, all
function through recommendation systems [19], [20]. Movie 2. Literature Survey
recommendation systems filter out irrelevant data and only
include data that has matching characteristics or features [18]. Kim Mucheol et al. [6] described an interactive movie
They help users to find interesting items customized to their recommendation system for online communities, utilizing a
preferences, making online experiences more enjoyable and community network model to understand social information
efficient [1]. Movie recommendation systems are invaluable dynamics. This model tailor’s movie suggestions to
in helping us find our favorite films amid the vast array of individual user preferences and adapts to evolving social
options, saving us time and effort [2]. These systems must be network trends. Nanou et al. [7] described challenges
highly dependable to be effective, offering recommendations in movie recommendation presentation and evaluated
that closely match our preferences and interests. different methods. The study highlighted the effectiveness of
“planned outline” and “textbook and videotape” interfaces in
Mood-based movie recommendation systems assist users in
establishing a strong link between user opinions and approval
discovering films that match their current emotional state.
across various experimental scenarios.
Users can select their mood, and the system then suggests
movies known to elicit those feelings. Ruotsalo et al. [8] introduced a mobile recommendation
system, emphasizing SMARTMUSEUM, employing
Three distinct models in movie recommendation systems
semantic network speech and ontologies to connect se-
are content-based filtering, collaborative filtering, and
mantic gaps, sensor data, and user profiles. The system
popularity- based filtering. The choice of model for mood-
employed an information retrieval framework, and its results
based movie recommendations depends on the system’s
demonstrated effectiveness in meeting user needs. Sharma
design, each having unique strengths in tailoring suggestions
et al. [9] this paper reviewed the several approaches used
to user preferences and emotions.
for the Recommen- dation system. Approaches may be
Content-based filtering recommends movies based on the categorized into two parts Content filtering and Content-
content of previously liked movies. This can include features based recommendation. Also, this paper describes the merits
such as the genre, director, actors, and plot [4]. Content-Based and demerits of the recommendation approaches. Tekin
Filtering, also referred to as cognitive filtering [21], operates et al. [10] proposed distributed online learning in a social
by suggesting items to users based on their past interactions recommendation system, where suggestions are query-
and preferences. Collaborative filtering is an approach dependent. Recommendations consider user history, gender,
that suggests items to users by analyzing the resemblances and age. The approach emphasizes decentralized sequential
between users and the items themselves [24]. Collaborative decision-making.
filtering models suggest movies to users by considering the
Chapphannarungsri and Maneero [11] presented a
ratings and preferences of other users with akin tastes [3].
multidimensional approach for an advanced recommendation
Goldberg introduced the concept of collaborative filtering system, offering high-quality recommendations. They
in 1991 [22]. Models utilizing popularity-based filtering proposed a method for the Multiple Criteria approach,
suggest movies to users by taking into account their level adjusting weightings and addressing film feature selection
of popularity, which could be determined by factors like the concerns, and applied Multiple Linear Regression to study
frequency of views or ratings. client characteristics, resulting in more accurate outcomes
The hybrid movie recommendation system, on the other hand, compared to existing Hybrid Recommendation systems.
represents an approach that merges various recommendation George et al. [12] presented the mixed approach combining
models to provide users with a more extensive and precise con- tent-based and collaborative filtering for a film
assortment of movie suggestions. By merging content-based recommendation system. The approach was experimentally
filtering, collaborative filtering, and other techniques this evaluated against existing collaborative and content-based
hybrid system aims to overcome the limitations of methods filtering techniques, offering valuable insights into its
[5]. The hybrid recommendation system is an advanced performance. Yoshii et al. [13] an incrementally trainable
approach that combines user preferences, movie attributes, probabilistic model for mixed recommendations, combining
and historical data to provide diverse and personalized collaborative and content-based techniques to enhance
movie recommendations. This method enhances the user precision and artist diversity. The model effectively merges
experience by aligning suggestions with individual tastes collaborative and content data, maintaining high accuracy,
and emotions, making it a powerful tool in today’s extensive even with the inclusion of new users.
352 Algorithms in Advanced Artificial Intelligence

Christakou et al. [14] introduced a clustering technique based recommendations further. While this innovative system offers
on semi-supervised learning for movie recommendations exciting opportunities for personalized movie suggestions, it
that combine collaborative and content-based data. Their must prioritize privacy and data security, as well as scalability
system was tested on the MovieLens dataset, yielding high- to accommodate a growing user base, ensuring a seamless
precision recommendations. Symeonidis et al. [15] proposed and enjoyable user experience.
“MoviExplain,” a movie recommendation system that extends However, the existing mood-based movie recommendation
beyond mere movie recommendations. It aims to provide system that relies on facial emotion analysis faces several
both accurate and justifiable recommendations, allowing challenges that could affect its accuracy and reliability. One
users to understand the rationale behind a recommendation. of the primary issues is the ambiguity of facial expressions,
Recommender systems predict user preferences by analyzing which can convey multiple emotions and potentially lead
activity patterns and are categorized as content-based (CBR) to misinterpretations. Emotions are also highly subjective
and collaborative filtering (CF) models. CBR uses item and can vary from person to person, making it challenging
properties for predictions, while CF identifies user or item
to predict what a user might genuinely enjoy. Additionally,
similarities to make recommendations based on users with
emotions can change rapidly during a movie or be influenced
similar tastes, harnessing collective user wisdom.[16].
by external factors, presenting a considerable challenge for the
Collaborative filtering (CF) includes memory-based techniques system to adapt in real time. Ethical and legal concerns related
that identify similar users through rating comparisons, to privacy and data security are critical, as continuous facial
forming user-based recommender systems. Model- based analysis may raise questions. Furthermore, the technology’s
CF analyzes rating patterns to build precomputed item limited emotional range and potential biases in training
recommendation models. Content-based recommendation data could hinder its ability to accurately detect complex or
(CBR) systems use item features and attributes to find and culturally diverse emotions. Often, the system over- looks
suggest similar items based on user preferences. [17]. Both crucial contextual factors, such as a user’s intention behind
CF and CBR methodologies encounter certain restrictions. watching a particular movie or their social environment. To
CF-based recommender systems typically encounter the mitigate these challenges and enhance accuracy, developers
problems of [16] sparse rating as many users don’t rate items must continuously refine the technology, incorporate user
and [17] cold start problems. The cold start challenge within feedback, and consider a broader range of factors beyond
recommendation systems pertains to situations involving facial expressions when making movie recommendations.
novel users and items. The new user cold start problem occurs
when a user with no ratings joins, mainly affecting content-
based recommendation (CBR). Both collaborative filtering
4. Proposed System
(CF) and CBR encounter difficulties in generating reliable The proposed methodology MBCH uses a user-centric
recommendations due to these limitations. [16]. approach to overcome the limitations of existing systems that
rely on capturing emotions through a camera. A two- step
3. Overview of Existing System process to address privacy concerns and improve accessibility
was implemented. The proposed system working architecture
An advanced approach to enhancing user engagement and is shown in Fig. 52.1.
satisfaction involves a mood- based movie recommendation
In the first stage, a standardized survey is used to assess user
system that utilizes facial emotion analysis. This system
preference across different moods. The survey asks users to
employs sophisticated computer vision techniques and
machine learning models to identify and categorize a choose movies they would like to watch in different emotional
user’s emotional state based on their facial expressions, en- states, such as “What kind of films help you unwind and
compassing emotions such as happiness, sadness, anger, relax when you’re feeling down?”,” In moments of love and
and surprise. Subsequently, it taps into a comprehensive affection, which movies resonate with you?” these provide
movie emotion database, associating films with specific vital information for the recommendation process, this
emotional tags, enabling the system to align the user’s current information serves as the foundation for our cascade hybrid
emotional state with ap- propriate movie recommendations. model.
To enhance precision, the system takes into consideration The second step adopts a two-tiered approach, beginning with
the user’s viewing history and preferences, ensuring content-based filtering. Textual attributes such as overview,
a more tailored and personalized experience. These tagline, storyline, genre, cast, and director are transformed
recommendations are delivered through a user-friendly inter- into numerical vectors using TF-IDF in this instance. The
face, offering comprehensive details about each suggested cosine similarity is then evaluated to detect similarities
movie. The system encourages user feedback to refine its between movies.
Crafting Personalized Film Suggestions 353

Fig. 52.1 Working of the proposed model

P.Q We aim to establish a mood-based recommendation system


similarity (P, Q) = cos(q) =
||P|| ||Q|| that not only protects user privacy but also generates a more
The model is evaluated for accuracy and optimized for inclusive and accessible platform for a varied user base by
better performance. This step’s output has been meticulously using this technique. This approach is consistent with modern
prepared for integration into the succeeding matrix user-centric de- signs, ensuring that suggestions are accurate
factorization stage. The improved dataset is then subjected to as well as sensitive to users’ emotional states. This approach
matrix factorization in the final stage. places a greater emphasis on the user’s participation in
crafting suggestions, resulting in a more interesting and
Matrix factorization is a class of collaborative filtering
relevant viewing experience.
models. It is employed to learn latent factors that capture
user-item interactions. The user-item interaction matrix is
decomposed into user and item matrices, revealing underlying 5. Result
preferences. This stage refines the recommendations based The results of MDCH finds the best recommendation for a
on learned latent factors. The user-item interaction matrix is user based on mood using Cascade hybrid model that consists
represented as R with dimensions m × n (m users, n items).
of Content filter model and collaborative filter model.
It aims to find two lower-dimensional matrices with U (users
and latent factors) and V (items and latent factors) such that 5.1 Data Set
R ≈ UVT The recommendation system uses a dataset that contains
This approach refines the recommendations even more, information about different movies from the IMDB website.
resulting in results that are precisely matched to specific The dataset has many features for each movie, such as the
user preferences. Matrix factorization enables a more in- name of the movie, the type of movie (for example, comedy,
depth knowledge of user-item interactions, resulting in more drama, horror, etc.), a short summary of what the movie is
accurate and tailored movie recommendations. about, the names of the main actors who played in the movie,
354 Algorithms in Advanced Artificial Intelligence

Fig. 52.2 Dataset of movies

the year when the movie was released, how long the movie The third step is to display the dataset in a table or a chart
lasts, how the movie was rated by the viewers, how many that can show the data in a clear and organized way. The
people voted for the movie, how much money the movie fourth step is to transform the dataset into a format that the
made, and how the movie was scored by the critics. recommendation model can use, such as a matrix or a vector,
The dataset is processed in several steps to make it ready for that can represent the similarities and differences between the
the recommendation model. The first step is to import the movies.
dataset from the IMDB website using specific tools that can Within this dataset, you’ll find 2,000,0263 ratings distributed
extract the data. The second step is to summarize the dataset across 27,278 distinct movies. The origins of this data
by calculating some statistics, such as the average rating, the can be traced back to 138,493 users who contributed their
number of movies per genre, the most popular actors, etc. ratings between January 9, 1995, and March 31, 2015. This

Fig. 52.3 Questionaries


Crafting Personalized Film Suggestions 355

comprehensive dataset was curated on October 17, 2016. The graph compares four recommendation systems:
Notably, the users included in this dataset were chosen collaborative filtering, weighted recommendations, emotion-
through a random selection process, and it is noteworthy that based recommendations, and the proposed system MBCH.
every user selected had actively voted for a minimum of 20 The MBCH consistently gives more accurate results to users.
movies. This shows that the proposed system MBCH is better at
When users sign up for the first time, they are requested to suggesting movies to users
complete questionnaires in order to determine their movie
preferences. 6. Conclusion
In proposed MDCH, we’ve effectively integrated web
scraping, Mood Assessments, Content-Driven Filtering, and
Collaborative Filtering to deliver users exceptionally tailored
movie recommendations, taking into account their emotional
state and preferences. This innovative system enhances user
engagement and contentment by offering them pertinent
movie recommendations that align with their present
emotional disposition.

References
1. J. Jose Immanuvel, A. Sheelavathi, M. Priyadharshan, S.
Fig. 52.4 Represents the feelings of the user while login
Vignesh, K. Elango, “Movie Recommendation System”,
IJRASET, 2022-06-17
2. Hao F, Park DS, Pei Z (2017) Detecting bases of maximal
cliques in social networks. MUE2017 1–1
3. Desrosiers C, Karypis G (2011) A comprehensive survey of
neighborhood-based recommendation methods. In: Ricci F,
Rokach L, Shapira B, Kantor P(eds) Recommender systems
handbook. Springer, Boston, pp 107–144
4. Ricardo Baeza-Yates, Berthier Ribeiro-Neto, et al. Modern
information retrieval, volume 463.ACM Press New York,
1999.
5. Gediminas Adomavicius and Alexander Tuzhilin. Toward the
next generation of recommender systems: A survey of the
state-of-the-art and possible extensions. Knowledge and Data
Fig. 52.5 Gives the output after processing the user feelings Engineering, IEEE Transactions on, 17(6):734–749, 2005.
6. Kim, Mucheol, and Sang Oh Park, “Group affinity-based
social trust model for an intelligent movie recommender
system”, Multimedia tools and applications 64, no. 2, 505­
516, 2013
7. Nanou, Theodora, George Lekakos, and Konstantinos
Fouskas, “The effects of recommendations “presentation on
persuasion and satisfaction in a movie recommender system”,
Multimedia systems 16, no. 4-5, 219-230, 2010.
8. Ruotsalo, Tuukka, KristerHaav, Antony Stoyanov,
Sylvain Roche, Elena Fani, RominaDeliai, Ee- tuMäkelä,
TomiKauppinen, and EeroHyvönen, “SMARTMUSEUM:
A mobile recommender system for the Web of Data”, Web
semantics: Science, services, and agents on the world wide
web 20, 50-67, 2013.
9. Sharma, Meenakshi, and Sandeep Mann, “A survey of
recommender systems: approaches and limitations”, Int J
InnovEng Technol. ICAECE-2013, ISSN, 2319-1058, 2013.
10. Tekin, Cem, Shaoting Zhang, and Mihaela van der Schaar,
Fig. 52.6 Comparison with other systems “Distributed online learning in social recommender systems”,
356 Algorithms in Advanced Artificial Intelligence

Selected Topics in Signal Processing, IEEE Journal of 8, no. 18. Pazzani MJ, Billsus D (2007) Content-based recommendation
4, 638-652, 2014. systems. In: Brusilovski P, Kobsa A, Nejdl W (eds) The
11. Keittima Chapphannarungsri and Saranya Maneero, adaptive web. Springer, Berlin, pp 325–341.
“Combining multiple criteria and multidi- mensional for movie 19. Çano E., Morisio M. Hybrid recommender systems: A
recommender system”, in Proceedings of the International systematic literature review. Intell. Data Anal. 2017; 21:1487–
MultiConference of Engineers and Computer Scientists, vol. 1524. doi: 10.3233/IDA-163209
1, 2009 20. S. Wattal, Y. Hong, M. Mandviwalla, and A. Jain,” Technology
12. Lekakos, George, and Petros Caravelas, “A hybrid approach for diffusion in the society: Analyz- ing digital divide in the context
movie recommendation”, multi-media tools and applications of social class”, IEEE Proc. of 44th Hawaii International
36, no. 1-2, 55-70, 2008 Confer- ence on System Sciences, 1-10, 2011. DOI: http://
13. Yoshii, Kazuyoshi, Masataka Goto, Kazunori Komatani, dx.doi.org/10.1109/HICSS.2011.398
Tetsuya Ogata, and Hiroshi G. Okuno, “An efficient hybrid 21. M. Goldmann and G. Kreitz,” Measurements on the spotify
music recommender system using an incrementally trainable peer-assisted music-on-demand streaming system”, IEEE
probabilistic generative model”, Audio, Speech, and Language International Conference on Peer-to-Peer Computing, 206­
Processing, IEEE Transactions on 16, no. 2, 435-447, 2008 211, 2011. http://dx.doi.org/10.1109/P2P.2011.6038737
14. Reddy Navya, Ramisetty Upendra,”Predict Early Pneumonitis 22. H. Li, F. Cai, and Z. Liao,” Content-based filtering
in Health Care Using Hybrid Model Algorithms”,Journal of recommendation algorithm using HMM”, IEEE Fourth
Artificial Intelligence, Machine Learning and Neural Network International Conference on Computational and Information
(JAIMLNN), Volume 3, 2023. Sciences, 275-277, 2012.
15. Christakou, Christina, Leonidas Lefakis, Spyros Vrettos, 23. D. Goldberg, D. Nichols, B. M. Oki, and D. Terry, ”Using
and Andreas Stafylopatis, “A movie rec- ommender system collaborative filtering to Weave an Information tapestry”,
based on semi-supervised clustering”, in Computational Communications of ACM 35(12):61-70, 1992. DOI: http://
Intelligence for Mod- elling, Control and Automation, dx.doi.org/10.1145/138859.138867
2005 and International Conference on Intelligent Agents, 24. Gupta S. A Literature Review on Recommendation Systems.
Web Technologies and Internet Commerce, International Int. Res. J. Eng. Technol.2020;7:3600–3605.
Conference on IEEE, vol. 2, pp. 897-903, 2005 25. Shen J., Zhou T., Chen L. Collaborative filtering-based
16. Symeonidis, Panagiotis, Alexandros Nanopoulos, and Yannis recommendation system for big data. Int. J. Comput. Sci. Eng.
Manolopoulos, “MoviExplain: a recommender system with 2020;21:219–225. doi: 10.1504/IJCSE.2020.105727.
explanations”, In Proceedings of the third ACM conference 26. Beniwal R., Debnath K., Jha D., Singh M. Data Analytics and
on Rec- ommender systems, pp. 317-320, 2009. Management. Springer; Berlin/Hei- delberg, Germany: 2021.
17. Adomavicius G, Tuzhilin A (2005) Toward the next generation Hybrid Recommender System Using Artificial Bee Colony
of recommender systems: a survey of the state-of-the-art and Based on Graph Database; pp. 687–699
possible extensions. IEEE Trans Knowl Data Eng 6:734–749.
Note: All the figures in this chapter were designed by the author.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4

A Comprehensive Approach to
Detect SQL Injection Attacks Using
Enhanced Snort Rules
53

T. Srinivasarao1
Assistant Professor, Department of ECE
GIET (Autonomous), Rajahmundry, Andhra Pradesh, India
Shrija Madhu2
Professor, Department of CSE
GIET (Autonomous), Rajahmundry, Andhra Pradesh, India
K. Kalyani Vishalakshi3, Preetish Madhu4, K. Satya Sai DurgaManikanta5, P. Sumanth Yadav6
Department of CSE (AIML & CS)
GIET (Autonomous), Rajahmundry, Andhra Pradesh, India

Abstract: SQL Injection attacks continue to be a significant danger to web applications, with the potential for unauthorized
access, data breaches, and application vulnerabilities. Using updated Snort rules, this paper describes a complete approach for
detecting and mitigating various kinds of SQL Injection attacks. We suggested a solution to provide a robust and adaptable
defensive mechanism against various kinds of SQL Injection attacks by addressing the drawbacks of existing detection
methods. Inthis proposed method we illustrate the practical use of the upgraded Snort rules in real-world web application
contexts through comprehensive testing and evaluation. The significance of this paper is that it provides a realistic and effective
solution for organizations to protect their web applications against SQL Injections attacks while protecting sensitive data and
user privacy. This paper will enableeven non-experts to deploy powerful SQL Injections detection capabilities by employing
simpler Snortrules, empowering enterprises of all sizes to improve their security posture. The results of this study will benefit
network security by providing a more advanced and proactive approach to SQL Injections detections, providing the way for
future intrusion detection research and the development of enhancedSnort rules for emerging threats.
Keywords: SQL injections, Web application security, Intrusion detection system (IDS), Snort, Attack detection

1. Introduction applications face. Successful SQL Injections attacks have


the potential to compromise the online application and the
Modern businesses and services now rely heavily on web database that powers it, allowing for unauthorized access,
applications to enable smooth interactions with customers data manipulation,or even total compromise.
all over the world. However, because of how frequently they The possibility of SQL Injection attacks highlights the
are used, they are now popular targets for cyber-attacks. SQL effective detection and prevention measures. Toavoid
Injections, a sort of attack where attackers take advantage of possible data theft and safeguard sensitive user data, real-
bugs in the application’s input validation to insert malicious time SQL injection detection is crucial. Existing methods,
SQL code, are one of the most serious risks that web such as anomaly or signature-based detection, have shown

1srinu.thupakula@giet.ac.in, 2shrija@giet.ac.in, 3kodelakalyani@gmail.com, 4preetmadhu15@gmail.com, 5manikantakothapalli1@gmail.com, 620551a4644.


sumanth@gmail.com

DOI: 10.1201/9781003529231-53
358 Algorithms in Advanced Artificial Intelligence

some degree of effectiveness. The aim of this research is Many other approaches are available for available for detecting
to improve Snort IDS capacity for detecting various kinds and securing documents using Encryption Techniques also
of SQL Injections. We will cover an in-depth set of SQL [16-18].
Injection variations, including Classic, Blind, Time based,
Error based, Union, Boolean, Second order and Out of Band 3. Proposed Methodology
SQLInjections types. The efficiency of the suggested system,
including its accuracy, adaptability, and resource usage, will However, there are some drawbacks in the existing methods,
be thoroughly examined. The establishment of a full-fledged such as narrow coverage, a high probability of false positives,
web application firewall and additional security measures and complex procedures. The proposed methodology would
that go beyond Snort IDS are not the focus of this research. offer a thorough and useful method for quickly and efficiently
recognizing different types of SQL Injection attacks which is
shown in Fig. 53.1. We aim to improve the real-time detection
2. Related Work and prevention of SQL Injection attempts by leveraging the
In this part, we present a comprehensive literature work capabilities of Snort IDS. This will also help to reduce false
that explores the existing research on detecting SQL attacks positives by modifying the Snort rules.
detection. The survey covers papers related to SQL Injections
detection and examines their methodologies, findings, and
limitations. The studies collectively highlight the evolving
strategies for detecting SQL injections for further prevention
steps.
The authors Gupta and Sharma introduced a novel approach
leveraging Snort for effective detection [1]. The authors
proposed an evidence-collection and notification model
using standard IDS systems in network forensics, providing a
comprehensive approach [2]. Caesarano and Riadi introduced
the NIST method in network forensics to detect SQL injection
attacks. This approach integrates NIST’s guidelines and
standards for effective attack identification and response [3].
Alnabulsi et al. focused on the utilization of SNORT IDS for
accurate detection [4]. Fig. 53.1 Block diagram of methodology

Kemalis and Tzouramanis introduced SQL-IDS, a


3.1 Developing Stage
specification-based approach demonstrating a proactive
method for SQL-injection detection [5]. Their methodology Rule Selection and Customization: In this initial step, rules
involves specifying legitimate SQL query structures, allowing are carefully selected to target specific types of SQL injection
deviations from these structures to be identified as potential attacks. Those rules are then adapted to the specific needs
attacks. The authors Kumar et al. present an extensive survey and settings of the network. For example, rules are selected
on SQL injection attacks, providing a valuable overview of and customized for classic SQL, blind SQL, time SQL, error
detection and prevention techniques [6]. SQL, union SQL, stacked queries SQL, Quadratic SQL, out­
of-band SQL, and Boolean SQL injections.
Lee et al. proposed an automated approach to identify SQL
injections and CSS attacks [7]. They employ static analysis Rule Integration with SNORT IDS: The Snort Intrusion
methods to identify vulnerable payloads and dynamically Detection System (IDS) is then continuously updated with
generate malicious payloads. These studies collectively the selected and customized rules. During this procedure,
underline the diverse approaches taken to address SQL Snort is set up to identify and utilize the selected rules while
injection vulnerabilities, ranging from advanced detection doing normal packet inspection. Usually, the local rules file
methodologies using Snort and IDS systems to specification- or a specific configuration management system is used to add
based approaches [8,12,14,15]. Their work serves as a the rules. Because of the provided patterns, this makes sure
valuable resource for understanding the landscape of SQL that Snort is now prepared to actively monitor networktraffic
injection vulnerabilities and defenses. Additionally, they for any indications of SQL injection attempts.
highlight the ongoing importance of network forensics and Testing & Validation: This step involves a set of severe tests
comprehensive surveys in understanding and mitigating these to validate the effectiveness of the integrated rules. Various
security threats [9,10,11,13]. SQL injection attacks are simulated in controlled circum­
A Comprehensive Approach to Detect SQL Injection Attacks Using Enhanced Snort Rules 359

stances to check the capability of Snort in order to identi­ alert tcp any any -> any any (msg:"Possible Time-Based
fy and react to those threats. This stage acts as an important Blind SQLInjection Alert"; flow:to server, established;
stage of validation, verifying that the rules work as intended. content:"'AND IF (1=1, SLEEP (5),0) --"; http_uri;
sid:1000003; rev:1;)
4. Testing & Optimizing Stage #Rule 4: Rule for SQL Injection attack (Based on error)
This rule detects Error based SQL Injection Alerts by
Fine Tuning & Optimization: The need for optimization and
injecting code that triggers an error with the database.
fine-tuning may depend on the findingsof the testing phase.
This requires a thorough examination of the Snort alerts, in­ alert tcp any any ->> any any (msg: "Possible Error
cluding an evaluation of their accuracy. To reduce false pos­ type SQL Injection Alert"; flow:to_server, established;
itives and improve the system’s general detection capability, content:"'AND 1=CONVERT (int, @@version) ­
modifications can be made to the rules. For the best results, -"; http_uri; sid:1000004; rev:1;)
both precision and sensitivity must be balanced properly. #Rule 5: Rule for Union SQL Injection attack
Benchmarking for Performance: Benchmarking is crucial This rule detects Union based SQL Injection Alerts by
to verify the IDS effectiveness and responses. The system injecting the UNION keyword, alerting tocombine results
is tested through a range of simulated network traffic to from different database queries.
determine how effectively it performs under various traffic alert tcp any any -> any any (msg:"Possible Union
scenarios. Potential obstacles or areas for improvement SQL Injection Alert"; flow:to_server, established;
canbe found by tracking response times and consumption of content:"UNION"; http_uri; sid:1000005; rev:1;)
resources. During this stage, the IDS is tested to make sure #Rule 6: Rule for Second order SQL Injection attack
it can handle the expected network traffic and successfully This rule detects Second order SQL Injection Alerts, where
identify SQL injection attacks. the malicious payload is stored for laterexecution.
Continuous Monitoring & Updates: Continuous monitoring alert tcp any any -> any any (msg:"Possible Second-
of the IDS after implementation is required to verify its Order SQL Injection Alert"; flow:to_server, established;
continued efficiency. This includes real-time alarm analysis content:"'OR '1'='1''; http_uri; sid:1000007; rev:1;)
and regular assessments of the effectiveness of the system. #Rule 7: Rule for Out of band SQL Injection attack
Furthermore, the rule set should be updated on a regular basis This rule identifies Out of band SQL Injection Alerts by
to incorporate the most recent threat intelligence and adapt to Alerting to retrieve data through an alternative channel.
emerging attack strategies. This step assures that the IDS will
alert tcp any any -> any any (msg:"Possible Out-of-
continue to provide a strong defense against SQL injection
Band SQL Injection Alert"; flow:to_server, established;
attacks all over time.
content:"UNION SELECT NULL, load_file('/etc/
passwd'), NULL --"; http_uri; sid:1000008; rev:1;)
5. Proposed Snort Rules #Rule 8: Rule for Boolean SQL Injection attack
#Rule 1: Rule for Classic SQL Injection attack This rule detects boolean SQL Injection Alerts by injecting a
This rule helps to detect classic SQL Injection by injecting a condition that always evaluatesto true.
condition that always evaluates to true. alert tcp any any -> any any (msg:"Possible Boolean
alert tcp any any -> any any (msg: "Possible Classic SQL SQL Injection Alert"; flow:to_server, established;
Injection Alert"; flow:to_server, established; content:"' content:"'AND 1=1 --"; http_uri; sid:1000009; rev:1;)
OR 1=1"; http_uri; sid:1000001; rev:1;)
#Rule 2: Rule for Blind SQL Injection attack 6. Testing & Evaluation
This rule focuses on Blind SQL Injection Alerts by injecting 6.1 Experimental Setup
a sleep command, causing a delay in the server's response. Step 1: Select any Hypervisor like VM-ware
alert tcp any any -> any any (msg: "Possible Blind SQL
Injection Alert"; flow:to_server, established; content:"' Step 2: In VMware install the following operating systems:
AND SLEEP (5) --"; http_uri; sid:1000002; rey:1;) 1. Ubuntu – for SNORT IDS installation
# Rule 3: Rule for Time based blind SQL Injection attack 2. Kali Linux – as an Attacking machine
This rule detects Time based blind SQL Injection by injecting 3. Metasploitable2– as a testing webpage
a condition that causes a delay if true.
360 Algorithms in Advanced Artificial Intelligence

Fig. 53.2 Installation of snort

Step 3: Locate the snort configuration file and set $HOME_


NET CIDR range. (Ex: 192.168.226.0/24)
Step 4: Open local.rules files and write the proposed snort
rules using any text editor by using thecommand:
Fig. 54.2 Ubuntu OS
vim /etc/snort/rules/local.rules

Fig. 53.3 Rules in local.rules file

Step 5: Verify the syntax and save the rules.


Step 6: Execute the snort in ubuntu terminal to monitor the
alert by using command:
Fig. 54.3 Kali Linux snort -q -l <path/to/snort/log> -i <interface name> -A
console -c <path/to/snort.conf>
Step 7: Generate the Test traffic in Metasploitable web page
from Kali Linux terminal to see the alertswhen there is a SQL
Injection attack.
Results
The proposed rules listed above were tested in a controlled
environment, and all of them produced the expected outcomes.
We tested these rules on various testing websites, including
the DVWA metasploitable2, vulnweb, hackthissite etc.
The proposed snort rules for the various SQL injection
attacks were tested in the DVWA web page as shown in
Fig. 54.4 Metasploitable2 Figs. 53.7-53.14. As expected, the snort has detected all the
Step 3: Provide Internet to the VMware above SQL injection attacks in the ubuntu terminal as shown
in Figs. 53.15-53.22.
6.2 Testing and Results
Testing
Testing is a crucial part, especially when it incorporates
security measures such as SQL injectiondetection using
Snort IDS. Here is a step-by-step approach for doing detailed
testing.
Step 1: Open VMware and switch to Ubuntu Operating system
Step 2: Open Terminal and Install Snort IDS with sudo
permissions by using command

sudo apt-get install snort Fig. 53.4 Classic SQL injection attack
A Comprehensive Approach to Detect SQL Injection Attacks Using Enhanced Snort Rules 361

Fig. 53.9 Second-order SQL injection attack

Fig. 53.5 Blind SQL injection attack

Fig. 53.10 Out of band SQL injections attack

Fig. 53.6 Time-based blind SQL injection attack

Fig. 53.11 Boolean SQL injection attacks

Fig. 53.12 Classic SQL injection detection

Fig. 53.7 Error-based SQL injection attack

Fig. 53.13 Blind SQL injection detection

Fig. 53.14 Time based SQL injection detection

Fig. 53.8 Union-based SQL injection attack

Fig. 53.15 Error based SQL injection detection


362 Algorithms in Advanced Artificial Intelligence

4. Alnabulsi et al. “Detecting SQL injection attacks using


SNORT IDS.” In Asia-Pacific World Congress on Computer
Science and Engineering, pp. 1–7. IEEE, 2014.
Fig. 53.16 Union SQL injection detection 5. Kemalis et al. “SQL-IDS: a specification-based approach for
SQL-injection detection.”
6. In Proceedings of the 2008 ACM symposium on Applied
computing, pp. 2153–2158. 2008.
7. Kumar et al. “A survey on SQL injection attacks, detection
Fig. 53.17 Second order SQL injection detection and prevention techniques.” In 2012 Third International
Conference on Computing, Communication and Networking
Technologies (ICCCNT’12), pp. 1–5. IEEE, 2012.
8. Huang et al. “Craxweb: Automatic web application testing and
attack generation.” In 2013 IEEE 7th International Conference
Fig. 53.18 Out of band SQL injection detection on Software Security and Reliability, pp. 208–217. IEEE,
2013.
9. Sonewar, Piyush A., and Nalini A. Mhetre. “A novel approach
for detection of SQL injection and cross site scripting attacks.”
In 2015 International Conference on Pervasive Computing
Fig. 53.19 Boolean type SQL injection detection (ICPC), pp. 1–4. IEEE, 2015.
10. Pallam, Ravi, Sai Prasad Konda, Lasya Manthripragada,
and Ram Akhilesh Noone. “Detection of Web Attacks using
7. Conclusion Ensemble Learning.” learning 3, no. 4 (2021): 5.
11. Reddy Navya, Ramisetty Upendra,”Predict Early Pneumonitis
The entire strategy to detecting SQL Injection attacks in Health Care Using Hybrid Model Algorithms”,Journal of
using improved Snort rules has been demonstratedin this Artificial Intelligence, Machine Learning and Neural Network
research to be a reliable and efficient method for enhancing (JAIMLNN), Volume 3, 2023.
online application security. The designed criteria, which 12. Akkaya, M., & Yilmaz, A. (2019). Detecting SQL Injection and
were applied to several SQL Injection types demonstrated Cross Site Scripting Attacks using the XGBoost Algorithm. In
remarkable accuracy in detecting malicious injection Alerts. 2019 IEEE 43rd Annual Computer Software and Applications
Conference (COMPSAC) (Vol. 2, pp. 649–654). IEEE.
The system proved at identifying second order, out of band,
13. Silva, D. F., Parizi, R. M., Lira, W. D., Rocha, A. R. L., &
boolean, blind, time, error, union, stacked and classic based Wazlawick, R. S. (2018). SQL injection detection using
SQL Injections, demonstrating its adaptability. Furthermore, XML attribute values. In Proceedings of the 33rd ACM/IEEE
the experimental evaluation conducted in a controlled International Conference on ASE (pp. 525–535).
environment validated the system’s performance in real-world 14. Aditya et al. - An Intelligent method for Detection of SQL
scenarios. The consistently high detection rates demonstrated Injection Attacks in Database System. In Proceedings of the
the system’s reliability in recognizing and catching SQL 13th International Conference on Computational Intelligence
Injection threats. False positives remained low, preventing and Security (pp. 252–258).
15. Alshahrani, M., Kim, Y. S., & Kim, H. K. (2016). A hybrid
unnecessary flagging of real traffic. The accomplishment of
intrusion detection model based on snort and immune
this project demonstrates Snort’s potentialas an effective tool algorithms. Security and Communication Networks, 9(17),
in the defense against SQL Injection attacks. The methods 3933–3944.
and outcomes of the project shed important light on the 16. Elhajjaji, F., & Beni-Hssane, A. (2015). A new approach to
potential and constraints of employing modified Snort rules prevent SQL injection attacks. Procedia Computer Science,
for webapplication security. 56, 487–492.
17. Behnam, M., & Modiri, N. (2014). A new approach for
detection of SQL injection attacks using web log files. In 2014
References 4th International conference on CKE (pp. 528–533). IEEE.
18. S.Somaraj ,M.A.Hussain. Performance and Security Analysis
1. Gupta et al. “A novel approach for detecting sql injection for Image Encryption using Key Image. Indian J.of Sci and
attacks using snort.” Journal of The Institution of Engineers Tech. 2015:8(35)
(India): Series B 103, no. 5 (2022): 1443–1451. 19. S.Somaraj, M.A.Hussain. A Novel Image Encryption
2. Bhardwaj, Sonam, and Mayank Dave. “Sql injection attack Technique Using RGB Pixel Displacement for Color Images,
detection, evidence collection, and notifying system using IEEE 6th International Conference on Advanced Computing
standard intrusion detection system in network forensics.” In (IACC), 2016.
Proceedings of International Conference on Computational 20. S.Somaraj, M.A.Hussain. An Image Encryption Technique
Intelligence, Data Science and Cloud Computing: IEM-ICDC Using Scan Based Approach and Image as Key, Proceedings
2020, pp. 681–692. Springer Singapore, 2021. of the First International Conference on Computational
3. Caesarano et al. “Network forensics for detecting SQL Intelligence and Informatics. Advances in Intelligent Systems
injection attacks using NIST method.” Int. J. Cyber-Security and Computing, 2016;507: 645–653.
Digit. Forensics 7, no. 4 (2018): 436–443.
Note: All the figures in this chapter were designed by the author.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4

ARP and DNS Spoofing Detection with


Attacker IP Capturing 54

T. Srinivasarao1,
Assistant Professor, Department of ECE,
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
N. Leelavathy2,
Professor, Department of CSE,
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
S. Kailash Chandra Sri Satya Dev3, I. Om Ganesh4,
P. Sai Aditya5, P. Sai Krishna6
Department of Computer Science and Engineering (AIML & CS)
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India

Abstract: DNS spoofing attacks are a growing threat to network security, allowing attackers to manipulate DNS responses
and redirect users to malicious websites. Many existing methods for detecting these attacks have limitations, that reduce their
effectiveness. This proposed method aims to overcome these limitations by designing and implementing an improved ARP and
DNS spoofing detection with attacker IP tracing. The proposed method primary objective includes real-time monitoring of ARP
responses to detect abnormalities indicating of ARP spoofing attempts. This is examined by checking source IP, MAC addresses
for discrepancies. Similarly, the solution thoroughly examines DNS responses, instantaneously warning the administrators
of anomalies between requested and received IP addresses, indicating probable DNS spoofing incidents. Along with these
detection capabilities, the project tracks attacker IP addresses, allowing administrators to trace and examine suspicious spoofing
attempts. In summary,”ARP and DNS Spoofing Detection with Attacker IP Capturing” defines a significant improvement in
network security, providing an effective protection against sophisticated spoofing attacks. The result of this research helps
network administrators and organizations to secure their networks by detecting these ARP and DNS spoofing attacks by tracing
the attacker’s IP.
Keywords: DNS spoofing, ARP spoofing, Network security, Intrusion Detection System (IDS), Python

1. Introduction manipulation. The research provides network administrators


and security professionals with a powerful toolset for real-
Network security is an important concern in today’s hyper- time detection of ARP and DNS spoofing incidents, as well
connected world. Spoofing attacks on the Address Resolution as the ability to trace the origin of potential attacks through
Protocol (ARP) and the Domain Name System (DNS) present the identification of the attacker’s IP address, by combining
major risks to network integrity. This project presents a Python’s adaptability with Scapy’s precision. ARP spoofing
Python-based approach for protecting networks against these is the manipulation of ARP responds to associate a different
covert attacks, implementing the Scapy module for packet MAC address with a target IP address, resulting in data

1
srinu.thupakula@giet.ac.in, 2drnleelavathy@gmail.com, 3kailashchandra.sri.satya@gmail.com, 420551a4622.omganesh@gmail.com,
5
20551a4657.adityapolisetti@gmail.com, 620551a4642.saikrishna@gmail.com

DOI: 10.1201/9781003529231-54
364 Algorithms in Advanced Artificial Intelligence

DNS spoofing exploits flaws in the DNS resolution mechanism to redirect people to malicious websites. These types of attacks can result in unauthorized access to sensitive data, interception of confidential exchanges, and even the insertion of harmful payloads within the network.

Traditional security methods may be ineffective in preventing ARP and DNS spoofing attacks. While firewalls and intrusion detection systems are important, they may lack the detailed packet-level analysis needed to detect subtle spoofing attempts. As a result, a dedicated solution capable of real-time monitoring and quick response is required. This project bridges the gap by utilizing Python's scripting capabilities and Scapy's packet manipulation abilities, providing an effective defence against these emerging threats. The Python script runs in the background of the network, continuously analyzing incoming ARP and DNS packets. When a probable spoofing attempt is detected, the solution sends immediate alerts with the suspected attacker's IP address. This feature enables administrators to take quick decisions to reduce potential threats and protect their networks. Finally, "ARP and DNS Spoofing Detection with Attacker IP Capturing" makes an important addition to network security. The solution addresses the need for comprehensive ARP and DNS spoofing detection by utilizing Python's flexibility and Scapy's precision, providing network security professionals with an important resource to protect their networks and defend against evolving cyber threats.

2. Related Work

Morsy et al. proposed1,2 an innovative approach to counter ARP spoofing attacks and a comprehensive survey on DNS attack detection and security protection, offering insights into various strategies for safeguarding DNS infrastructures. The research by Hijazi and Obaidat analyzes Address Resolution Protocol spoofing attacks and the security measures taken to prevent them, adding to our knowledge of network security issues. ARP spoofing detection and mitigation strategies are thoroughly covered in Rohatgi and Goyal's work, which is an essential tool for understanding and defeating this kind of attack. Public-key cryptography is used by cryptographic solutions, such as S-ARP, to authenticate ARP answers. This strategy has limitations even though it works well. It is dependent on an Authoritative Key Distributor (AKD), which increases the possibility of a single point of failure. Furthermore, hosts have to contact the AKD for each ARP reply, which could lead to dependencies and delays1. An alternative method is provided by server-based systems that use a dependable server to examine packets. However, they have a critical vulnerability that exposes the network to potential harm if the server fails or is compromised, resulting in the server becoming a single point of failure2. Even though they are simple, static entry techniques are less appropriate for large-scale networks and dynamic situations since manual IP address assignments are impractical in these contexts3. Host-based solutions, while conceptually strong, face issues in determining the reliability and relevance of each host, making them more difficult for real-world implementation5.

3. Proposed Methodology

While existing literature covers ARP and DNS spoofing detection techniques such as cryptographic, server-based, and other defence mechanisms, the proposed research introduces a comprehensive and innovative solution that addresses limitations in current approaches. Unlike some cryptographic systems, which may rely on single points of failure such as Authoritative Key Distributors (AKD)1, the proposed method provides a distributed and reliable detection system. By combining Python's adaptability with Scapy's precision, the system provides real-time monitoring of ARP and DNS responses, allowing for immediate alerts in the event of suspicious activity. In addition, the project's dynamic tracking of attacker IP addresses represents a major improvement over static entries or manual assignments. This feature allows administrators to quickly trace and investigate unusual actions, improving their capacity to respond to potential threats.

Step 1: Understanding spoofing attacks: Gaining a thorough understanding of spoofing attacks is the primary objective of the first phase of the methodology. This involves studying various spoofing strategies, including DNS and ARP spoofing, and their impact on network security.

Step 2: Developing code using Scapy in Python: Scapy, a powerful package known for its abilities in packet manipulation and network traffic analysis, is used along with Python's capabilities to create a Python-based script that can capture and analyze network packets.

Step 3: Project and environment setup: A secure virtual environment is created to protect the host machine while running tests. Isolated virtual machines (VMs) that replicate actual network setups were created using virtualization technologies like VMware. These VMs host the Windows target, Kali Linux attacker, and Ubuntu monitor machines, allowing for thorough testing while protecting the integrity of the host system.

Step 4: Code execution: Execution of the developed Python code within the controlled virtual environment is the next phase. This step makes sure that network traffic is constantly and uninterruptedly monitored in real time. The created Python script captures and analyzes packets as they transit over the network, giving the required parameters for detecting DNS and ARP spoofing attacks.

Fig. 54.1 Methodology

Step 5: Detecting DNS and ARP spoofing: The detection of DNS and ARP spoofing attacks is the crucial stage. Using the developed Python code's real-time packet analysis capabilities, the network packets are examined for any indications of possible spoofing attacks. Identification of suspicious patterns, such as several IP addresses associated with a single host or differences in DNS answers, will generate an alert message.

Step 6: Tracking the attacker IP: In this phase, the attacker's IP address will be captured along with the generated alert message. The Python code created in the earlier phase will identify the attacker's IP address by tracing the attack's origin.

Step 7: Performance metrics and real-time monitoring: The detection accuracy, false positive/negative rates, and response time will be evaluated in this phase to assess the effectiveness of the detection capability. Continuous real-time monitoring helps in staying up to date on emerging threats and enhancing detection accuracy.

4. Experimental Setup

The test environment consists of three virtual machines: Ubuntu for network monitoring, Windows for testing, and Kali Linux for simulated attacks. Ubuntu hosts the detection system, while Windows and Kali Linux play the roles of victim and attacker, respectively. Controlled experiments in this virtual lab setting will evaluate the system's effectiveness in identifying ARP and DNS spoofing attacks.

Step 1: Select any hypervisor, such as VMware.
Step 2: In VMware, install the following operating systems:
1. Ubuntu – works as the network monitoring system
2. Kali Linux – works as the attacker machine
3. Windows – works as the victim machine
Step 3: Provide Internet access to the VMware machines.

5. Testing and Results

Testing is a crucial part, especially when implementing security features like DNS and ARP spoofing detection methods. Here is a step-by-step approach for detailed testing.

Step 1: Open VMware and switch to the Ubuntu operating system.
Step 2: Develop the Python script and save it in the network monitoring system (Ubuntu).
Step 3: Execute the script with root permissions, as shown in Fig. 54.2, to monitor the network using the command:

python3 <filename>

Fig. 54.5 Python script execution
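The chapter does not reproduce the script itself, so the following minimal sketch (our illustration, not the authors' exact code) shows how the detection logic of Steps 2 and 5 can be expressed with Scapy: ARP replies are checked against previously observed IP-to-MAC bindings, and DNS answers are logged together with the responder's address so they can be compared against trusted resolutions. The console-based alerting and the simplified trusted-lookup strategy are assumptions.

```python
# Minimal ARP/DNS spoof monitor sketch (an illustration, not the authors'
# exact script). Requires Scapy and root privileges: sudo python3 monitor.py
from scapy.all import sniff, ARP, DNS, DNSRR, IP

ip_mac_bindings = {}  # last MAC address observed for each IP

def check_arp(pkt):
    arp = pkt[ARP]
    if arp.op != 2:                      # inspect only ARP replies (op 2)
        return
    known = ip_mac_bindings.get(arp.psrc)
    if known is not None and known != arp.hwsrc:
        # Same IP now claimed by a different MAC: probable ARP spoofing.
        print(f"[ALERT] ARP spoofing suspected for {arp.psrc}: "
              f"MAC changed from {known} to {arp.hwsrc}")
    ip_mac_bindings[arp.psrc] = arp.hwsrc

def check_dns(pkt):
    dns = pkt[DNS]
    if dns.qr != 1 or dns.ancount == 0:  # inspect only DNS responses
        return
    responder = pkt[IP].src if pkt.haslayer(IP) else "?"
    for i in range(dns.ancount):
        rr = dns.an[i]
        if isinstance(rr, DNSRR) and rr.type == 1:   # A records only
            # A full implementation would compare rr.rdata against a
            # trusted resolver's answer and alert on any mismatch.
            print(f"[DNS] {rr.rrname.decode()} -> {rr.rdata} "
                  f"(answered by {responder})")

def handle(pkt):
    if pkt.haslayer(ARP):
        check_arp(pkt)
    elif pkt.haslayer(DNS):
        check_dns(pkt)

sniff(filter="arp or udp port 53", prn=handle, store=False)
```

Keeping the monitor passive (sniffing only, never injecting packets) matches the role of the Ubuntu machine as a dedicated observer in the test environment.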
Step 4: Open Kali Linux (the attacking machine) and then open a terminal with sudo permissions.

Step 5: Start ARP and DNS spoofing attacks (test traffic) on the target victim machine (Windows) using any spoofing tool, as shown in Fig. 54.3 and 54.4. In this research we used the Bettercap tool to generate ARP and DNS spoofing test traffic.

Fig. 54.6 ARP spoofing on target victim
Fig. 54.7 DNS spoofing on target victim

Step 6: Browse the target website (stackoverflow.com) on the victim machine. It will be redirected to another target spoofing website (vulnweb.com).

Fig. 54.8 Victim responses spoofing

Step 7: Monitor the ARP and DNS spoofing alerts and identify the attacker IP in the Ubuntu terminal. The results will be as follows:

Fig. 54.9 ARP and DNS spoofing, attacker IP detection

6. Evaluation and Performance Metrics

Evaluating the effectiveness and efficiency of this research is crucial in ensuring its reliability in real-world scenarios. Detection accuracy measures the system's ability to correctly identify spoofing attempts. It is calculated as the ratio of correctly detected spoofing incidents to the total number of spoofing attempts. The detection time is a measurement of how long it takes from the start of a spoofing attempt until the system identifies it as an attack; lower detection times indicate quicker response times. The impact of the system on network resources, such as CPU and memory use, is measured using resource utilisation. Making sure the system runs smoothly without overloading the network infrastructure is important.

Table 54.1 Performance metrics

Metrics | Calculations
True Positives | 28
True Negatives | 0
False Positives | 1
False Negatives | 2
Detection Accuracy ((TP + TN)/TI) × 100 | 93%
Detection Response | Instant
Attacker IP Capturing Rate (Total Captured/TI) | 97%
CPU Utilization | 12%
Memory Utilization | 28.10%

To thoroughly assess the effectiveness of our ARP and DNS spoofing detection system, we carried out experiments with a collection of thirty instances representing a wide range of network settings and targets.
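As a quick sanity check, the derived rows of Table 54.1 can be reproduced from those raw counts. Reading TI as the thirty test instances just mentioned, and inferring 29 captured attacker IPs from the reported rate (both readings are our assumptions, since the chapter does not state them explicitly):

```python
# Recomputing Table 54.1's derived metrics from the raw counts.
# TI = 30 (the thirty experiment instances) and total_captured = 29
# are assumptions inferred from the surrounding text.
TP, TN, FP, FN = 28, 0, 1, 2
TI = 30
total_captured = 29

detection_accuracy = (TP + TN) / TI * 100   # ((TP + TN)/TI) x 100
capture_rate = total_captured / TI * 100    # (Total Captured/TI) x 100

print(f"Detection accuracy: {detection_accuracy:.1f}%")   # -> 93.3%
print(f"Attacker IP capture rate: {capture_rate:.1f}%")   # -> 96.7%
```

Both values round to the 93% and 97% reported in Table 54.1.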

Fig. 54.7 Resources utilization
Fig. 54.8 CPU and memory consumption

7. Conclusion and Future Scope

The development and implementation of the ARP and DNS Spoofing Detection system with Attacker IP Capturing produced positive outcomes in terms of network security. The system displayed acceptable detection accuracy after thorough testing and evaluation, reliably recognising instances of ARP and DNS spoofing attacks. The ability to capture the attacker's IP address, together with real-time monitoring capabilities, provides network managers with helpful information and quick responses to any security breaches. Furthermore, the system's resource utilisation stayed within acceptable limits, ensuring that network performance was not impacted. By addressing these future directions, the ARP and DNS Spoofing Detection system with Attacker IP Capturing can become an even more effective and flexible measure for protecting network security and integrity in the face of evolving cyber threats.

References
1. Morsy, Sabah M., and Dalia Nashat. "D-ARP: An Efficient Scheme to Detect and Prevent ARP Spoofing." IEEE Access 10 (2022): 49142–49153.
2. Jianwu, Z. H. A. N. G., A. N. Yanjun, and D. E. N. G. Huangyan. "A survey on DNS attack detection and security protection." Telecommunications Science 38, no. 9 (2022).
3. Maksutov, Artem A., Ilya A. Cherepanov, and Maksim S. Alekseev. "Detection and prevention of DNS spoofing attacks." In 2017 Siberian Symposium on Data Science and Engineering (SSDSE), pp. 84–87. IEEE, 2017.
4. Hijazi, Sherin, and Mohammad S. Obaidat. "Address resolution protocol spoofing attacks and security approaches: A survey." Security and Privacy 2, no. 1 (2019): e49.
5. Hussain, Mohammed Abdulridha, Hai Jin, Zaid Alaa Hussien, Zaid Ameen Abduljabbar, Salah H. Abbdal, and Ayad Ibrahim. "DNS protection against spoofing and poisoning attacks." In 2016 3rd International Conference on Information Science and Control Engineering (ICISCE), pp. 1308–1312. IEEE, 2016.
6. Marchal, Samuel. "DNS and semantic analysis for phishing detection." PhD diss., University of Luxembourg, Luxembourg, Luxembourg, 2015.
7. Reddy Navya, Ramisetty Upendra. "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms." Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), Volume 3, 2023.
8. Bin, Sun, Wen Qiaoyan, and Liang Xiaoying. "A DNS based anti-phishing approach." In 2010 Second International Conference on Networks Security, Wireless Communications and Trusted Computing, vol. 2, pp. 262–265. IEEE, 2010.
9. Jindal, Keshav, Surjeet Dalal, and Kamal Kumar Sharma. "Analyzing spoofing attacks in wireless networks." In 2014 Fourth International Conference on Advanced Computing & Communication Technologies, pp. 398–402. IEEE, 2014.
10. Srinath, D., S. Panimalar, A. Jerrin Simla, and J. Deepa. "Detection and Prevention of ARP spoofing using Centralized Server." International Journal of Computer Applications 113, no. 19 (2015).
11. Trabelsi, Zouheir, and Wassim El-Hajj. "ARP spoofing: a comparative study for education purposes." In 2009 Information Security Curriculum Development Conference, pp. 60–66. 2009.
12. Al Sukkar, Ghazi, Ramzi Saifan, Sufian Khwaldeh, Mahmoud Maqableh, and Iyad Jafar. "Address resolution protocol (ARP): Spoofing attack and proposed defense." (2016).
13. Tripathi, Nikhil, Mayank Swarnkar, and Neminath Hubballi. "DNS spoofing in local networks made easy." In 2017 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS), pp. 1–6. IEEE, 2017.
14. Sharma, Bandana. "Review paper on prevention of DNS spoofing." International Journal of Engineering and Management Research (IJEMR) 4, no. 3 (2014): 164–170.

Note: All the figures and table in this chapter were designed by the author.

55. A Comprehensive Review of Advanced Artificial Intelligence Integration in ICT Systems: Methodologies, Applications, and Future Directions

Gopisetty Pardhavika*, Prisicilla R.


Department of Artificial Intelligence and Data Science
St. Joseph’s Institute of Technology
Old Mahabalipuram Rd, Kamaraj Nagar, Semmancheri, Chennai, Tamilnadu

Abstract: This paper explores the integration of advanced artificial intelligence (AI) in ICT systems, employing machine
learning and symbolic AI for problem-solving, including logic programming, expert systems, fuzzy logic, case-based reasoning,
knowledge graphs, planning, and reinforcement learning algorithms. It focuses on AI applications in medical and health care,
cybersecurity, data management, cloud computing, human-computer interaction, and network communication. The analysis
delves into key AI methodologies and algorithms, highlighting their impact on efficiency and reliability. The paper emphasizes
that addressing challenges and seizing AI opportunities is crucial for ensuring a sustainable and innovative future in ICT. It
underscores the significance of widespread AI integration across various sectors to maximize its benefits. By examining the
synergy of advanced AI systems in solving problems and optimizing processes, the paper contributes to the broader discourse
on the transformative potential of AI in shaping the future landscape of information and communication technology. In essence,
this exploration positions advanced AI as a linchpin for addressing contemporary challenges and fostering innovation in ICT.
With its focus on practical applications and underlying methodologies, the paper serves as a valuable resource for understanding
the current landscape and paving the way for future developments in the integration of advanced AI within ICT systems.
Keywords: Artificial Intelligence, Information and Communication Technology, Cyber security, and Data management.

1. The Role of AI in Reshaping ICT

1.1 Introduction

The rapid advancements in ICT have fundamentally transformed the way we interact, communicate, and conduct business in today's world. In parallel, the field of AI has experienced remarkable growth, marked by breakthroughs in machine learning (ML), deep learning, natural language processing (NLP), and robotics. As these two domains converge, they herald a new era of unprecedented possibilities. The integration of Advanced AI into ICT systems offers the potential to reshape industries and society itself [1]. This paper undertakes a thorough survey and analysis of the integration of Advanced AI and ICT. Its principal aim is to offer a panoramic perspective on the cutting-edge developments in this dynamic field. By delving into existing research, industry advancements, and practical applications, the paper strives to reveal the numerous ways in which AI has enriched ICT systems. Furthermore, this exploration delves into both the challenges and opportunities that are an inherent part of this symbiotic relationship [2].

The seamless integration of AI and ICT in knowledge management has revolutionized the way organizations handle their information resources. AI's data analysis capabilities empower businesses to discover valuable insights and patterns in their data, fostering more informed decision-making.

*Corresponding author: pardhavika.gopisetty@gmail.com

DOI: 10.1201/9781003529231-55

Fig. 55.1 Advantages of AI for knowledge management (KM) [modified after 4]

ICT tools, on the other hand, enhance collaboration, streamline data organization, and provide secure and efficient storage solutions, making it easier for employees to access and share knowledge across geographical boundaries. Together, these technologies create a dynamic environment where data is not merely stored but actively leveraged to gain a competitive edge, reduce costs, and ensure that employees have access to up-to-date, relevant information [3]. In summary, the synergistic use of AI and ICT in knowledge management is a game-changer for businesses, offering improved efficiency, data-driven insights, and a competitive advantage in today's information-driven landscape. As organizations continue to evolve and expand, harnessing the full potential of these technologies will be crucial for staying at the forefront of innovation and making the most of their valuable knowledge assets.

2. Applications of AI in Different Sectors

AI's broad utility extends across various sectors, including healthcare for precise diagnostics, finance for improved predictive capabilities, and manufacturing for enhanced efficiency and safety. Its impact is also felt in transportation, customer service, education, and more. Here are a few specific examples:

2.1 Diagnosis and Detection of Early Diseases using AI

AI techniques, including Artificial Neural Networks (ANN), Fuzzy Expert Systems (FES), Evolutionary Computation, and Hybrid Intelligent Systems (HIS), are employed to enhance the detection and diagnosis of early diseases. These techniques leverage the power of data analysis and pattern recognition to identify health issues at an earlier stage, allowing for more effective treatment and improved patient outcomes. Here are some examples of their applications:

Detecting early cancer: AI is vital in early cancer detection, rapidly identifying lesions in X-rays and mammograms for timely intervention. It also provides accurate screening recommendations. AI's potential extends to liquid biopsies, wearable devices, and genetic analysis for even earlier cancer detection. While common cancers benefit, rare neoplasms progress more slowly due to data requirements. New guidelines from the American College of Medical Genetics drive AI development in precision oncology [5].

Cardiovascular diseases: Heart failure (HF) poses challenges with poor outcomes, high recurrence, increased mortality, and economic burdens. However, AI in cardiovascular medicine advances early disease detection through ECG, imaging, and wearable monitoring, transforming HF management. AI swiftly detects cardiac anomalies, enabling life-saving interventions. Predictive models and genetic analysis promise personalized risk assessments, while remote monitoring enhances early detection. AI empowers healthcare providers and improves patient outcomes, reducing the global cardiovascular disease burden [6].

Parkinson disease (PD): Parkinson's Disease (PD) is a widespread chronic neurological disorder affecting the entire body. Although around 500,000 Americans have an official PD diagnosis, the true number is likely higher due to undiagnosed or misdiagnosed cases. AI plays a pivotal role in early PD detection through advanced data analysis. Voice analysis, gait analysis, touchscreen tests, and medical imaging interpretation help identify subtle changes. These AI-driven methods, in combination with patient data,

Fig. 55.2 Integrating AI methodologies into agriculture

enable predictive models and efficient monitoring for early detection, potentially improving outcomes. Ethical standards and regulatory compliance are essential when implementing AI-based diagnostics in PD management [7].

2.2 The Significance of AI in Agriculture

Robotics in agriculture employs AI for autonomous navigation, object detection, and manipulation of agricultural machinery and robots. These technologies are used in tasks like planting, weeding, and harvesting. In agriculture, AI is harnessed through a diverse set of technical methods, including ML, computer vision, data analytics, and remote sensing. ML algorithms and NLP techniques enable the development of predictive models for crop management and yield forecasting, while computer vision is crucial for assessing crop health and detecting pests and diseases from image data collected by drones or satellites [8].

Data analytics processes information from field and environmental sources to optimize irrigation, fertilization, and resource allocation. Robotics and automation, driven by AI, assist in tasks like precision planting and harvesting, and Internet of Things (IoT) devices furnish real-time data for decision-making by measuring soil conditions and other environmental variables. Genetic algorithms are employed in crop breeding, while predictive modeling enables forecasting of crop disease outbreaks and weather patterns. The synergy of these methods equips agriculture with the tools to enhance productivity, sustainability, and efficiency while addressing the challenges of modern farming. Nonetheless, the integration of AI in agriculture encounters hurdles associated with data quality, privacy concerns, and the often-substantial technology costs, which can pose barriers for small-scale farmers looking to adopt these advancements. Additionally, connectivity challenges and limited infrastructure in rural regions present obstacles to the real-time data collection and AI-based decision-making processes in agriculture [9].

2.3 The Synergy of AI in Autonomous Vehicles

Fig. 55.3 The AI perception-action loop within autonomous vehicles [11]

AI is the fundamental cornerstone upon which autonomous vehicles are built, encompassing a rich spectrum of applications spanning self-driving cars and aerial drones. At its core, AI gives these vehicles the capacity to perceive, interpret, and seamlessly interact with their dynamic environments while upholding the paramount principles of safety and efficiency. This capacity is chiefly manifest in the AI's ability to decipher data from a diverse array of sensors, including LiDAR, radar, cameras, and ultrasonic sensors, equipping the vehicle with a heightened situational awareness that allows it to identify and respond to the myriad elements in its surroundings, be they objects, pedestrians, other vehicles, or the intricate nuances of road conditions [10]. Furthermore, AI extends to mapping, facilitating the construction of fine-grained cartographic representations for both strategic route planning and real-time navigation. Guided by AI, the vehicle orchestrates its every movement, controlling its steering, acceleration, and braking with a precision dictated by the amalgamation of sensory input and navigational guidance. And yet AI's role transcends the mere mechanics of vehicular control, extending into decision-making [10].

AI's deep-learning algorithms make nuanced, context-aware decisions that prioritize both safety and efficiency while considering user preferences. AI's influence extends further into driver monitoring, ensuring operators remain alert and engaged, thus enhancing overall safety. AI's purview also encompasses cybersecurity, with vigilant algorithms guarding against potential cyber threats that could compromise vehicle operations. Furthermore, AI plays a pivotal role in the ever-evolving realm of vehicle-to-everything (V2X) communication, harmonizing the flow of data between vehicles and the surrounding infrastructure. AI's role in autonomous vehicles is therefore dynamic, not static: it engages in an ongoing process of learning and adaptation, adjusting to ever-changing road conditions and the unpredictable rhythms of traffic, and building a capacity for continuous improvement. In summary, AI is not just a technological foundation; it is an omnipresent force guiding transportation toward a future marked by safety, efficiency, and accessibility, promising to redefine the landscape of mobility [10, 11].

2.4 Leveraging AI for Fraud Detection in the Banking Sector

AI is a crucial tool in the banking sector for efficient and effective fraud prevention. It not only enhances security and customer trust but also helps banks adapt to evolving threats, maintain compliance with regulations, and reduce operational costs. As financial landscapes continually evolve, the role of AI in fraud detection becomes increasingly pivotal, offering multifaceted advantages:

Real-time Detection: AI allows for real-time monitoring of transactions and can swiftly identify anomalies or suspicious activities, enabling banks to proactively prevent fraudulent actions. This enhances security and helps build customer trust, as it shows the bank's commitment to protecting their accounts [12].

Reduced False Positives: AI's superior accuracy in identifying fraudulent activities helps reduce the rate of false positives. This means that legitimate transactions are less likely to be mistakenly flagged as fraudulent, which ultimately improves the overall customer experience. Customers don't have to deal with the inconvenience of having their legitimate transactions blocked or delayed [12].

Adaptability through ML: AI, particularly through ML algorithms, can adapt to evolving fraud patterns. It can learn from new data and emerging threats, helping banks stay ahead of fraudsters who continually develop new tactics. This adaptability reinforces the bank's defenses over time.

Regulatory Compliance: AI can assist banks in complying with regulatory requirements by monitoring and analyzing transactions to detect suspicious patterns that may indicate potential money laundering or fraud. This can expedite decision-making in compliance-related matters and potentially reduce fines or penalties for non-compliance.

Fig. 55.4 Bank card fraudulent detection process through ML/DL [12]

Supporting Human Teams: AI is not meant to replace human fraud detection teams but to support them. It can handle the more routine and repetitive tasks, allowing human experts to focus on more complex and strategic aspects of fraud detection. This collaboration can lead to more effective and efficient fraud prevention.

Cost Savings: By automating certain aspects of fraud detection and prevention, AI can potentially lower operational costs for banks. This is especially significant in an industry where operational efficiency is highly valued.
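To make the detection process of Fig. 55.4 concrete, the sketch below shows one common way such a bank-card fraud classifier can be assembled with scikit-learn. The feature set, the synthetic data, and the model choice are illustrative assumptions on our part, not a pipeline prescribed by the surveyed papers.

```python
# Illustrative bank-card fraud detector in the spirit of Fig. 55.4.
# The features, labels, and model here are assumptions for demonstration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 5000
# Hypothetical transaction features: amount, hour of day, merchant risk score.
X = np.column_stack([
    rng.exponential(50.0, n),   # transaction amount
    rng.integers(0, 24, n),     # hour of day
    rng.random(n),              # merchant risk score
])
# Hypothetical rare-fraud labels, loosely tied to amount and risk score.
y = ((X[:, 0] > 200) & (X[:, 2] > 0.8)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# class_weight="balanced" counteracts the heavy class imbalance; the
# report below exposes the false-positive behaviour discussed above.
model = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                               random_state=0)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test), digits=3))
```

In a production setting the same pattern is typically fed by streaming transaction data, which is what enables the real-time detection described above.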
Recent developments across industries utilizing AI illustrate its transformative potential. However, AI also brings ethical challenges, including concerns about misuse, autonomy, biases, transparency, and impacts on society. Addressing these demands integrating ethics into design, mitigating biases, ensuring transparency, and aligning AI with human values. Specific ethical challenges are detailed in the following sections [13, 14].

3. AI-Related Ethical Challenges

Utilizing AI systems gives rise to a multitude of complex concerns, including issues related to bias, liability, security, privacy, behavior manipulation, and transparency. The inheritable biases within AI's data can lead to unjust outcomes, while assigning responsibility for AI errors remains a dynamic challenge. With increasing interconnectivity, the protection of AI systems and data privacy becomes a critical priority, particularly in light of their digital reliance. Ethical worries are heightened by AI's potential to manipulate behavior and reduce individual autonomy, and the opacity in AI decision-making processes raises questions about accountability and fairness. Effectively addressing these concerns demands responsible AI development that underscores transparency, ethical design, bias mitigation, and the promotion of explainable AI. It necessitates a collaborative effort among various stakeholders to create regulations that strike a balance between innovation and safeguarding societal interests [15].

3.1 Principles of Ethical AI

Ethical AI is built upon a comprehensive framework of principles that are instrumental in guiding its development and deployment. These principles encompass a wide range of crucial aspects. First and foremost, fairness is a cornerstone of Ethical AI, ensuring that these systems are inclusive, accessible, and devoid of any unfair discrimination against individuals or groups. The aim is to provide equitable access and treatment for everyone, addressing the challenge of bias that often arises from AI algorithms being trained on a limited portion of the population, which doesn't represent the diversity of the real world. Additionally, Ethical AI places a strong emphasis on privacy protection and security, respecting individuals' privacy rights and diligently safeguarding data. This commitment extends to the implementation of robust data governance and model management systems that uphold the highest standards of data security. Reliability and safety are equally essential principles, as AI systems are expected to consistently function in accordance with their intended purpose, ensuring that they can be trusted to deliver predictable and secure results [2, 15].

Transparency and explainability are fundamental in Ethical AI, as these systems are required to provide complete transparency regarding their inner workings and decision-making processes. By offering clear explanations, they foster trust and understanding among users and stakeholders. Accountability remains a key principle, as AI systems are intended to be under the control of appropriate human oversight, allowing for feedback and appeals when necessary, thereby maintaining human agency in the AI ecosystem [15]. Value alignment is another guiding principle, ensuring that AI systems consider universal values in their decision-making, ultimately aligning with the ethical principles that guide human decision-making. Governability is also vital, with Ethical AI designed to work on intended tasks while being able to detect and avoid unintended consequences, thereby mitigating risks. Ultimately, Ethical AI is human-centered, valuing diversity, freedom, autonomy, and individual rights. It serves the interests of humanity by upholding human values and avoiding any actions that could be deemed unfair or unjustified. These principles collectively form a robust foundation for the responsible and ethical development and deployment of AI in a manner that benefits individuals, society, and the environment as a whole [16].

3.2 Ethical AI: How Can it be Optimally Operationalized?

Implementing data and AI ethics is imperative, and the ethical development and deployment of AI are essential. To achieve this, adhering to the following steps in creating a tailored, scalable, operational, and sustainable framework for AI ethics will enable customers to embrace the AI solutions they desire:

Advisory council on ethics: The establishment of an Ethics Council is crucial. This council, akin to a governance board, should be responsible for ensuring fairness, privacy, and cybersecurity, and for addressing other data-related risks and concerns. It should be closely aligned with ethical considerations in the realms of cybersecurity, risk management, compliance, privacy, and analytics. Additionally, external subject matter experts, including ethicists, should be incorporated into the council. Their roles would encompass [15, 16]:

1. Overseeing employees' tasks and their handling of ethical concerns.
2. Managing legal and regulatory risks effectively.
3. Aligning the AI ethics strategy with the existing systems.

Fig. 55.5 Guiding principles for ethical AI [16]

Developing an ethical AI framework: The development of a data and AI ethical risk framework is a valuable strategy for mitigating ethical concerns. This framework includes a governance structure that requires continuous maintenance. It delineates the ethical standards that should be upheld and followed. The framework also provides guidance on how systems should articulate and integrate these fundamental Ethical AI principles. It serves as a quality assurance program to evaluate the efficacy of designing and developing ethical AI systems [15].

Enhancing guidance and tools for optimization: While the Ethical AI framework offers valuable high-level guidance, it's evident that product-level guidance must be more granular. In certain cases, particularly when AI systems yield decisions with substantial real-life consequences, there's a need for an explanation of the decision-making process. However, it's worth noting that model transparency tends to decrease as prediction accuracy increases. In such scenarios, product managers should possess the skills to strike the right balance. This entails the development of tailored tools to aid product managers in these decisions. These tools can assess the trade-off between explainability and accuracy for a specific system and offer recommendations on what measures to implement for that particular system [16].

Fostering ethical awareness: A thriving organizational culture is pivotal for the successful implementation of ethics in AI. It is essential to cultivate a culture where all members of the organization are well-versed in the ethical framework, enabling them to consistently question the ethical aspects of the AI system at every stage or level [15].

4. Conclusion

In essence, the integration of artificial intelligence (AI) across sectors, including healthcare, finance, manufacturing, transportation, agriculture, and education, signifies a pivotal advancement in efficiency and safety. AI's diverse applications, from optimizing medical diagnoses to revolutionizing educational methodologies, underscore its potential to reshape various industries. However, this transformative influence also brings ethical considerations to the forefront. Concerns about algorithmic bias, privacy, transparency, and accountability highlight the need for a delicate equilibrium between innovation and responsibility. Achieving this balance requires collaborative efforts spanning disciplines, encompassing technologists, ethicists, policymakers, and society at large. Addressing ethical concerns demands ongoing dialogue and transparent frameworks. This interdisciplinary collaboration is essential for navigating the intricate ethical terrain of AI development and deployment. The collective commitment to responsible practices ensures that AI's remarkable advancements contribute to societal betterment while upholding ethical standards. In summary, the journey toward a sustainable future for AI involves celebrating its advancements while engaging in collaborative, responsible practices. Through transparent dialogue and shared responsibility, we can unlock the full potential of AI for the benefit of society, ensuring innovation aligns harmoniously with ethical considerations.

Acknowledgement

I would like to express my heartfelt appreciation to the St. Joseph's Institute of Technology's Artificial Intelligence and Data Science Department, the esteemed Head of the Department, and the dedicated team of teachers for their invaluable contributions to the completion of this review paper.

References

1. Mohammad SM (2020) Artificial Intelligence in Information Technology. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3625444
2. Park SH, Kim YH, Lee JY, et al (2019) Ethical challenges regarding artificial intelligence in medicine from the perspective of scientific editing and peer review. Science Editing 6:91–98
3. UNCTAD. Catching technological waves: innovation with equity
4. Jarrahi MH, Askay D, Eshraghi A, Smith P (2023) Artificial intelligence and knowledge management: A partnership between human and AI. Bus Horiz 66:87–99. https://doi.org/10.1016/j.bushor.2022.03.002
5. Ramesh AN, Kambhampati C, Monson JRT, Drew PJ (2004) Artificial intelligence in medicine. Ann R Coll Surg Engl 86:334–338
6. Khan MS, Arshad MS, Greene SJ, et al (2023) Artificial intelligence and heart failure: A state-of-the-art review. Eur J Heart Fail
7. Dixit S, Bohre K, Singh Y, et al (2023) A Comprehensive Review on AI-Enabled Models for Parkinson's Disease Diagnosis. Electronics (Switzerland)

8. Kaushal S, Kumar S, Tabrez S (2023) Artificial Intelligence in Agriculture. International Journal of Science and Research. https://doi.org/10.21275/SR22524180634
9. Wakchaure M, Patle BK, Mahindrakar AK (2023) Application of AI techniques and robotics in agriculture: A review. Artificial Intelligence in the Life Sciences 3:100057. https://doi.org/10.1016/j.ailsci.2023.100057
10. Muralidharan C, Mohamed Sirajudeen Y, Anitha R (2021) Synergy of Internet of Things with Cloud, Artificial Intelligence and Blockchain for Empowering Autonomous Vehicles. In: Studies in Computational Intelligence. Springer Science and Business Media Deutschland GmbH, pp 225–244
11. Gadam S. Artificial Intelligence and Autonomous Vehicles. Data Driven Investor
12. Alamri M, Ykhlef M (2022) Survey of Credit Card Anomaly and Fraud Detection Using Sampling Techniques. Electronics (Switzerland) 11
13. Najadat H, Altiti O, Aqouleh AA, Younes M (2020) Credit Card Fraud Detection Based on Machine and Deep Learning. In: 2020 11th International Conference on Information and Communication Systems, ICICS 2020. Institute of Electrical and Electronics Engineers Inc., pp 204–208
14. RB A, KR SK (2021) Credit card fraud detection using artificial neural network. Global Transitions Proceedings 2:35–41. https://doi.org/10.1016/j.gltp.2021.01.006
15. Li HY, An JT, Zhang Y (2021) Ethical Problems and Countermeasures of Artificial Intelligence Technology. In: E3S Web of Conferences. EDP Sciences
16. Bostrom N, Yudkowsky E. The Ethics of Artificial Intelligence
Note: All the figures and table in this chapter were designed by the author.

56. Enhanced Network Security: Machine Learning-Based DDoS Detection

R. Tamilkodi1
Professor, Department of CSE (AIML&CS)
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
A. Harika2
Assistant professor, Department of CSE (AIML&CS)
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
B. S. L. D. V. Mythili3, G. KarunaKumar4, B. Dileep Kumar5, S. Sri Harshitha6
Department of Computer Science & Engineering (AIML & CS)
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India

Abstract: The exponential growth of internet users has seriously jeopardised the safety of online assets, and ensuring safety becomes ever more important as internet usage continues to grow. Denial-of-service attacks are a dynamic danger that calls for a cutting-edge cyber-security response. In this paper, we present a machine learning framework that can identify DDoS attacks by combining logistic regression, K-nearest neighbor, and random forest. To test the proposed models, we use the latest NSL-KDD dataset. Results from our tests demonstrate how effectively the suggested model distinguishes DDoS attacks; in comparison to the best attack detection approaches currently available, our findings show that our recommended model is superior. Enterprises, cloud services, internet service providers (ISPs), e-commerce, healthcare, government, telecoms, gaming, the Internet of Things (IoT), education, and media and entertainment are just a few of the many sectors that can benefit from improved network security through machine learning-based DDoS detection.

Keywords: Distributed denial of service, Deep learning, Logistic regression, K-Nearest neighbors, and NSL KDD dataset

1. Introduction

When a malicious actor attempts to prevent people from accessing related websites and online services, they are committing a distributed denial-of-service attack. Distributed denial-of-service assaults cause significant damage to the economy, government, companies, and foundations. Distributed denial-of-service attacks are a subset of cyberattacks that target specific websites in an effort to upset their ISPs. One thing that sets a denial-of-service (DoS) attack apart is the fact that it may overwhelm a target with traffic using just one device. Depending on the protocol, application layer, or volume, distributed denial-of-service attacks can take three distinct forms. Attackers attempt to overwhelm the target's bandwidth with a volume-based assault, measured in bits per second (Bps); examples of such attacks are ICMP or UDP floods. Communication tools may become inaccessible due to the overload caused by these attacks. Overwhelming the web server with requests per second (Rps) is the objective of application layer assaults like GET/POST floods and low-and-slow attacks. Attackers often disguise these assaults as valid requests.

1tamil@giet.ac.in, 2Harikaadduri07@giet.ac.in, 3b.mythili123@gmail.com, 4Karunkumarkumar61@gmail.com, 5bonthadileep1234@gmail.com, 620551a4648.sriharshitha@gmail.com

DOI: 10.1201/9781003529231-56

Conventional technology has a hard time keeping up with these attacks because they may last a few seconds or less than an hour. These DDoS attempts may be detectable by certain ML methods. Distributed denial-of-service attacks have successfully affected DNS systems, causing substantial financial losses for companies that rely on these services. By providing a flood of useless data, nearly two-thirds of distributed denial-of-service attacks aim to overwhelm the victim's machine or connections to other networks. These attacks work because they exploit standard queuing rules on internet servers, like DropTail and First-In, First-Out, which handle all kinds of data equally. Such assaults can severely diminish the victim's processing power for incoming data. Low-volume distributed denial-of-service (DDoS) attacks tend to be harder to detect because they leverage application layer protocols to drain victims' resources without overpowering their connections. The short duration of these attacks (a few minutes to an hour at most) makes them hard to identify using more conventional methods. We present a deep learning strategy for DDoS attack detection that uses data collection, feature extraction and classification, and double classification. The suggested method considers the protocol, interpacket duration, and packet length in addition to the network's behavior. We tested and evaluated several attack detection classifiers, including K-Nearest Neighbour, Decision Trees, Logistic Regression, Random Forests, and others. We have conducted promising investigations utilizing the NSL-KDD dataset to validate our proposed technique. In this article, we examine the results of different deep learning models that are currently accessible and show how to use neural networks to identify DoS assaults.

2. Literature Review

DDoS Attacks in 2022: Current Trends and Challenges in the Midst of the Global Political Debate: In recent times, there has been a marked increase in the number of hacks occurring on a global scale. In comparison to the same period in the previous year, the number of attacks climbed by 90% globally in Q3 2022. Moreover, they are far more potent. Botnet attacks are on the rise in many countries, and defending against them is no easy task. There is a strong correlation between political issues and DDoS attacks. At the tail end of February, hacktivist organizations with political motivations started coordinating distributed denial-of-service (DDoS) attacks on Russian companies. Their aim was to undermine the Russian economy. The most politically motivated hackers, who call themselves the "IT army of Ukraine," have targeted multiple critical Russian projects, the Russian government has claimed. Now, criminals from all over the globe are launching some of the most powerful attacks witnessed thus far by utilizing their homemade DDoS devices. A number of initiatives in a number of nations have been the target of assaults. The reported DR is 0.012%, the accuracy is 97.9%, and the AUC is 0.9921; all of these figures reflect the dramatically expanded global spectrum of attacks.

Deep DDoS Detection on Internet of Things Devices Using Machine Learning: Nowadays, it seems like you can find someone with Internet access just about anywhere. Thanks to technological advancements, the Internet of Things (IoT) is now one of the most used technologies, with a billion gadgets linked to it via the Internet. The most significant threat to this emerging technology, however, comes from denial-of-service threats, including DoS and distributed denial-of-service attacks. Distributed denial-of-service (DDoS) attacks nowadays are so complex that they are hard to detect and fight with the resources we have. Thanks to developments in big data, data mining, and machine learning, it is now feasible to detect DDoS traffic in a practical and realistic way. Using data mining and ML approaches, that paper presents a DDoS detection system. Based on the most recent dataset, CICDDoS2019, the investigation examined the most popular ML methods and found the traits frequently associated with the predicted classes. Both AdaBoost and XGBoost are very accurate in forecasting the kind of traffic, and they are also very precise in general. Testing hybrid algorithms on updated datasets and building an improved model for multiclass classification of distinct DDoS attack types might pave the way for further investigation.

Application of Machine Learning Classification Algorithms to the Determination of DDoS Attacks: Nowadays, most people rely on the Internet as their main means of communication. Because of this, cyberattacks are becoming more common, and the consequences are becoming worse. In terms of both effectiveness and cost, distributed denial of service is among the top five cyberattacks. Attacks known as distributed denial of service (DDoS) can severely damage information infrastructure by preventing authorized users from accessing network resources. Limiting damage requires solutions that can identify distributed denial-of-service assaults quickly and correctly. Machine learning classification algorithms can identify target classes much more quickly and accurately than more conventional methods. That quantitative study finds distributed denial-of-service (DDoS) attacks on the CIC-DDoS2019 dataset using a variety of classification approaches, including Logistic Regression, Decision Tree, Random Forest, AdaBoost, Gradient Boost, KNN, and Naive Bayes. There are eleven distinct DDoS attacks in the dataset, and each one has 87 unique properties. Evaluation metrics also measured how well the classifiers performed. The experimental data shows that AdaBoost and Gradient Boost are the best classification algorithms, followed by Logistic Regression, KNN, and Naive Bayes. Decision Tree and

Random Forest ranked last.

Dangers of undetected distributed denial-of-service assaults: Despite a few high-profile, high-volume attacks recently, most distributed denial-of-service (DDoS) attacks are still short and occur in modest volumes. Their quick success rate is troubling, and they often sneak into organizations undetected because of their low profile and ability to mix in with routine traffic. Even if security teams were able to detect these attacks, they wouldn't have much time to react, given how fast and fierce they are.

Bashlite and Mirai, Two IoT Botnets in Progress: Every year, botnets that are built using vulnerable Internet of Things devices result in losses worth billions of dollars. One kind of botnet that developed from Bashlite botnets is the subject of this article: Mirai botnets. To be more precise, the authors monitor the virus and the actions of botnet controllers for any modifications, compiling information from 47 honeypots' observation logs over the course of eleven months. By showing how sophisticated botnet operators, criminal acts, and malware are becoming, this research contributes to what is already known about botnets and malware. Mirai uses more robust hosting and control infrastructures and allows for more powerful attacks than its predecessor, Bashlite.

3. Methodology

Researchers have described data mining and ML techniques as a means of identifying DDoS attacks. Researchers experimented with the most well-known ML techniques on the most current dataset, CICDDoS2019, to identify the attributes most closely associated with the planned classes. According to their findings, AdaBoost was remarkably precise and produced reliable forecasts. A separate report details the use of an ML-based approach to identify and describe various network traffic flows. They test their method on a fresh dataset that incorporates many modern attack types, such as HTTP floods, SID DoS, and normal traffic, and classify the different types of attacks using the ML tool WEKA.

Drawbacks:
• The network characteristics and behaviors used in our study may be more extensive and instructive than the "correlated features" that are the foundation of previous research.
• The current work utilizes the CICDDoS2019 dataset; the DDoS detection model's capacity for generalization and practical application might be influenced by this dataset selection, and it may not provide a representative and diversified collection of data, since it is not often utilized for intrusion detection research.
• Only the AdaBoost algorithm is highlighted in the current study.
• Although HTTP flood, SID DoS, and regular traffic are mentioned in another study that has already been done, it may not have the same variety of attack types and situations as our work.

We present a machine learning method that combines binary classification, feature extraction and classification, and data collection to find DDoS attacks. The suggested method uses both network behaviors and characteristics, like the length of packets, the time between packets, and the protocol. We evaluate how well individual attack detection classifiers (for example, K-Nearest Neighbor, Random Forests, and Logistic Regression) perform. We conduct our tests utilizing the NSL-KDD dataset to validate our proposed procedure.

Benefits:
1. In contrast, our study makes use of network behaviors and attributes as features, which might result in a more reliable and accurate DDoS detection model.
2. A more representative and diversified collection of data may be obtained from the NSL-KDD dataset, which is often used in intrusion detection research.
3. We assess how well several attack detection classifiers, such as K-Nearest Neighbor, Random Forests, and Logistic Regression, perform. This calls for a thorough examination of various methods, which might provide a more complete understanding of the effectiveness of DDoS detection. A sketch of such an experiment follows.

Fig. 56.1 System architecture
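As a sketch of the experiment just described (binary attack/normal classification on NSL-KDD with K-Nearest Neighbor, Random Forest, and Logistic Regression), the code below shows the general shape of such a pipeline using scikit-learn. The file name, the column subset, and the preprocessing are our assumptions; the raw NSL-KDD files ship without headers, so column names must be attached when loading.

```python
# Sketch of the proposed experiment: binary DDoS/normal classification
# on NSL-KDD with KNN, Random Forest, and Logistic Regression.
# File name, column subset, and preprocessing are illustrative assumptions.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

df = pd.read_csv("KDDTrain+.csv")            # hypothetical local copy
y = (df["label"] != "normal").astype(int)    # 1 = attack, 0 = normal

# Behaviour-oriented features akin to those named in the text
# (traffic volumes, timing, protocol); the protocol is one-hot encoded.
X = pd.get_dummies(df[["duration", "src_bytes", "dst_bytes",
                       "count", "srv_count", "protocol_type"]])
X = StandardScaler().fit_transform(X)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=42)
models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100,
                                            random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    prec, rec, f1, _ = precision_recall_fscore_support(y_te, pred,
                                                       average="binary")
    print(f"{name}: accuracy={accuracy_score(y_te, pred):.3f} "
          f"precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Accuracy, precision, recall, and F1 computed this way correspond to the model comparisons plotted later in Figs 56.11 to 56.14.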

Modules: We have created the following modules to complete the project described above.
• Data exploration: loads the data into the system.
• Processing: reads and prepares the data for processing.
• Splitting data into train & test: splits the data into training and test sets.
• Model generation: builds the models: Random Forest, Logistic Regression, KNN, a Voting Classifier (RF + AdaBoost), and a Stacking Classifier (RF + MLP with LightGBM).
• User signup and login: gathers registration and login information.
• User input: gathers user input for prediction.
• Prediction: displays the final prediction.

As an extension, we utilized an ensemble approach that combines the predictions of several models to give a final prediction that is more reliable and accurate. Exploring further ensemble approaches, such as the Stacking Classifier (which reached 100 percent accuracy) and the Voting Classifier with RF + AdaBoost, may improve performance further.

4. Implementation

In this work, we have included the following algorithms. Random Forest: Leo Breiman and Adele Cutler's well-known ML method takes the results of numerous decision trees and averages them out. Its widespread use is due to its adaptability, ease of use, and ability to handle both regression and classification problems. K-Nearest Neighbours: the k-nearest neighbours approach (also written k-NN) is a non-parametric supervised learning classifier that groups data points according to proximity in order to classify or forecast them. Logistic Regression: to determine the likelihood of a target variable, a supervised classification method known as logistic regression is employed; because the dependent or target variable is binary, only two possible categories can be considered. Voting Classifier (RF + AdaBoost): voting classifiers are ML estimators that train several base models or estimators and create predictions by aggregating the outputs of each base estimator. Stacking Classifier (RF + MLP with LightGBM): this method assembles two layers of estimators for regression or classification. The first layer holds all the baseline models that are used to predict the outputs on the test datasets; in the second level you'll find the meta-classifier or regressor, which takes the base models' predictions as input and uses them to create new predictions. A sketch of both ensembles follows.
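A compact sketch of the two ensembles just described: soft voting over Random Forest and AdaBoost, and a two-level stack whose first layer holds Random Forest and an MLP, with LightGBM as the meta-classifier. It uses scikit-learn and the lightgbm package; the hyperparameters are illustrative, and X_tr/X_te/y_tr/y_te are assumed to come from the NSL-KDD preparation sketched earlier.

```python
# Sketch of the two ensembles described above; hyperparameters are
# illustrative, and X_tr/X_te/y_tr/y_te come from the earlier preparation.
from sklearn.ensemble import (AdaBoostClassifier, RandomForestClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.neural_network import MLPClassifier
from lightgbm import LGBMClassifier

rf = RandomForestClassifier(n_estimators=100, random_state=42)
ada = AdaBoostClassifier(n_estimators=100, random_state=42)

# Voting Classifier (RF + AdaBoost): soft voting averages the predicted
# class probabilities of the two base estimators.
voting = VotingClassifier(estimators=[("rf", rf), ("ada", ada)],
                          voting="soft")

# Stacking Classifier (RF + MLP with LightGBM): the base models form the
# first layer; the LightGBM meta-classifier learns from their predictions.
stacking = StackingClassifier(
    estimators=[("rf", rf),
                ("mlp", MLPClassifier(hidden_layer_sizes=(64,),
                                      max_iter=500, random_state=42))],
    final_estimator=LGBMClassifier(random_state=42),
)

for name, model in [("Voting (RF + AdaBoost)", voting),
                    ("Stacking (RF + MLP -> LightGBM)", stacking)]:
    model.fit(X_tr, y_tr)
    print(f"{name}: test accuracy = {model.score(X_te, y_te):.3f}")
```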
5. Testing and Results

Fig. 56.2 Home page
Fig. 56.3 Signup page
Fig. 56.4 Signin page
Fig. 56.5 Main page
Fig. 56.6 Upload input values

Fig. 56.7 Input values
Fig. 56.8 Prediction result
Fig. 56.9 Upload another input values
Fig. 56.10 Prediction result
Fig. 56.11 Accuracy comparison graph
Fig. 56.12 Precision comparison graph
Fig. 56.13 Recall comparison graph
Fig. 56.14 F1 comparison graph

6. Conclusion and Future Scope

The distributed denial-of-service (DDoS) assault is a devastating kind of cyberattack that effectively targets network devices and services. Therefore, we explore the potential of developing, testing, and assessing a machine

learning model to detect DDoS attacks in this study. In order to find out which attributes are best for predicting DDoS attacks, this article employs a variety of feature selection methods. Three ML models were implemented using features selected from the recently released NSL-KDD dataset. While logistic regression achieves lower accuracy, KNN and Random Forest outperform it. Our upcoming research will focus on methods that can identify DDoS attacks as they happen.

References
1. Statista Research Department, "Worldwide digital population July 2022". Available: https://www.statista.com/statistics/617136/digitalpopulation-worldwide/ (Last accessed on: December 31, 2022)
2. Ramil Khantimirov, "DDoS Attacks in 2022: Trends and Obstacles Amid Worldwide Political Crisis". Available: https://www.infosecurity-magazine.com/blogs/ddos-attacks-in-2022-trends/ (Last accessed on: December 31, 2022)
3. S. Sontowski et al., "Cyber Attacks on Smart Farming Infrastructure," 2020 IEEE 6th International Conference on Collaboration and Internet Computing (CIC), 2020, pp. 135–143, doi: 10.1109/CIC50333.2020.00025.
4. Seifousadati, Alireza and Ghasemshirazi, Saeid and Fathian, Mohammad, "A Machine Learning Approach for DDoS Detection on IoT Devices", arXiv, 2021. doi: 10.48550/ARXIV.2110.14911
5. A. Marzano, D. Alexander, O. Fonseca et al., "The evolution of bashlite and mirai IoT botnets," in Proceedings of the 2018 IEEE Symposium on Computers and Communications (ISCC), 2018.
6. S. Kottler, "February 28th DDoS incident report," 2018, https://github.blog/2018-03-01-ddos-incident-report/.
7. Y. Cao, Y. Gao, R. Tan, Q. Han, and Z. Liu, "Understanding internet DDoS mitigation from academic and industrial perspectives," IEEE Access, vol. 6, pp. 66641–66648, 2018.
8. S. Newman, "Under the radar: the danger of stealthy DDoS attacks," Network Security, vol. 2019, no. 2, pp. 18–19, 2019.
9. Kumari, K., Mrunalini, M., "Detecting Denial of Service attacks using machine learning algorithms," J Big Data 9, 56 (2022).
10. P. S. Saini, S. Behal and S. Bhatia, "Detection of DDoS Attacks using Machine Learning Algorithms," 2020 7th International Conference on Computing for Sustainable Global Development (INDIACom), 2020, pp. 16–21, doi: 10.23919/INDIACom49435.2020.9083716.
11. Jiangtao Pei et al., "A DDoS Detection Method based on Machine Learning," J. Phys.: Conf. Ser. 1237 032040, 2019.
12. Abdullah Soliman Alshra'a, Ahmad Farhat, Jochen Seitz, "Deep Learning Algorithms for Detecting Denial of Service Attacks in Software-Defined Networks," Procedia Computer Science, Volume 191, 2021, Pages 254–263, ISSN 1877-0509.
13. Seifousadati, Alireza, Saeid Ghasemshirazi, and Mohammad Fathian. "A Machine Learning Approach for DDoS Detection on IoT Devices." arXiv preprint arXiv:2110.14911 (2021).
14. Francisco Sales de Lima Filho, Frederico A. F. Silveira, Agostinho de Medeiros Brito Junior, Genoveva Vargas-Solar, Luiz F. Silveira, "Smart Detection: An Online Approach for DoS/DDoS Attack Detection Using Machine Learning," Security and Communication Networks, vol. 2019, Article ID 1574749, 15 pages, 2019.
15. R. Doshi, N. Apthorpe and N. Feamster, "Machine Learning DDoS Detection for Consumer Internet of Things Devices," 2018 IEEE Security and Privacy Workshops (SPW), 2018, pp. 29–35, doi: 10.1109/SPW.2018.00013.
16. Ebtihal Sameer Alghoson, Onytra Abbass, "Detecting Distributed Denial of Service Attacks using Machine Learning Models," International Journal of Advanced Computer Science and Applications, Vol. 12, No. 12, 2021.
17. Arshi M, Nasreen MD and Karanam Madhavi, "A Survey of DDOS Attacks Using Machine Learning Techniques," 2020.
18. Igor Kotenko and Alexander Ulanov, "Agent-based simulation of DDoS attacks and defense mechanisms," Computing, 2005, Vol. 4, Issue 2, 113–123.
19. Mouhammd Alkasassbeh, Ahmad B. A. Hassanat, Ghazi Al-Naymat, Mohammad Almseidin, "Detecting Distributed Denial of Service Attacks Using Data Mining Techniques," 2016.
20. Khamparia A, Pande S, Gupta D, Khanna A, Sangaiah A. K. (2020). Multi-level framework for anomaly detection in social networking. Library Hi Tech, 2020.
21. C. M. Nalayini, Dr. Jeevaa Katiravan, Aravind Prasad V, "Flooding Attack on MANET – A Survey," International Journal of Trend in Research and Development (IJTRD), ISSN: 2394-9333, Feb 2017.
22. Nalayini, C.M., Katiravan, J. (2019). "Block Link Flooding Algorithm for TCP SYN Flooding Attack," International Conference on Computer Networks and Communication Technologies. Lecture Notes on Data Engineering and Communications Technologies, vol 15. Springer, Singapore.

Note: All the figures in this chapter were designed by the author.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4
382 Algorithms in Advanced Artificial Intelligence

Enhancing Network Security: Deep Ensemble-Based Attack Detection Framework

57

R. Tamilkodi1
Professor, Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
S. Ratalu2
Assistant Professor, Department of Computer Science & Engineering (AIML & CS),
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India
Gandham Santoshi3, Vysyaraju Sarath Raju4, Allampalli V M Mukesh Rao5,
Rampa Aditya Raghava Koundinya6
Department of Computer Science and Engineering (AIML & CS),
Godavari Institute of Engineering & Technology, Rajahmundry, Andhra Pradesh, India

Abstract: Networks are significant in business, education, and daily life, since they let people communicate with one another over long distances using a range of devices. There are, however, many potential risks and security holes in this kind of communication that can put the safety, integrity, and confidentiality of data at risk. Huge amounts of money are lost every year because of growing network threats, malware, hacking, and scams. Automated systems based on artificial intelligence can help find these kinds of threats quickly and protect private data. The proposed methodology employs a recurrent neural network (RNN), gated recurrent unit (GRU), and long short-term memory (LSTM) in its architecture, and it relies upon majority-vote averaging. The NSL-KDD dataset was used in tests, which showed that the proposed EDVC performed better than the current best approaches, with a high accuracy score.
Keywords: Recurrent neural network (RNN), Gated recurrent unit (GRU), Long short-term memory (LSTM), Network threats

1. Introduction

When several computers are connected together, they can share data; this is called networking. Various technologies and communication techniques, such as Ethernet, Wi-Fi, or even simple direct connections, can be used to share information. [1] The fundamental objective of networking is to let devices such as printers, file servers, and web links work with one another and share resources. Networks are vital in business, education, and daily life, since they let people communicate with one another and share data over long distances. [2] Many potential dangers and security holes can appear when there are a lot of networking applications. [5] These can affect the security, stability, and availability of networked systems and data. Denial-of-service (DoS) attacks, man-in-the-middle (MitM) attacks, and fraud are among the most well-known threats to networks. [3] As network risks rise, it becomes more essential to have an automated system that can find and stop attacks. Solutions based on artificial intelligence (AI) may be able to detect these kinds of attacks, allowing steps to be taken rapidly to reduce the risk of data theft. Patterns are found

1tamil@giet.ac.in, 2ratalu@giet.ac.in, 3205551a4616.santoshi@gmail.com, 420551a4656.sarathraju@gmail.com, 520551a4601.mukesh@gmail.com, 6ramparaghava25@gmail.com

DOI: 10.1201/9781003529231-57

in data using ML techniques, which are likewise used to track down potential threats. [8] Adding these kinds of techniques to network security can make it much easier for an organization to find and stop attacks. [4] This lowers the chance that attacks will succeed and guards important data and resources.

When it comes to detecting attacks, the proposed XGBoost approach works best. Likewise, the NSL-KDD dataset is used to perform network intruder detection with a neural network. According to the experimental results, the bidirectional LSTM strategy with an attention mechanism performs well.

Fig. 57.1 Types of network attacks

Numerous potential threats and security holes can appear in large networking applications, jeopardizing the privacy, integrity, and availability of networked systems and data. Fraud is among the most widely recognized threats to networks. Data theft, system failure, reputational damage, financial losses, spying, and damage to infrastructure are some of the specific hazards that accompany network threats. As network risks rise, it becomes more critical to have an automated system that can find and stop attacks. Solutions based on artificial intelligence [7] (AI) may be able to detect these kinds of attacks, allowing steps to be taken rapidly to reduce the risk of data theft. [23] These kinds of methods are used to examine large volumes of network data and find potential threats in real time, which lets organizations act quickly and effectively. Patterns are found in data using ML techniques, which are likewise used to track down potential threats. [6] Adding these kinds of techniques to network security can make it significantly easier for an organization to find and stop attacks, which lowers the chance that attacks will succeed and protects important data and resources.

2. Literature Review

The Communication and Networking Roadmap for 6G in the Metaverse:
With the economy doing well, credit card traffic has risen sharply over recent years, and fraud rings are also growing rapidly. [18] This makes fraud detection an increasingly tricky problem. [16] The proportion of fraudulent transactions is, however, much lower than that of legitimate transactions, [15,17] and this imbalance makes the dataset much harder to test. In this paper, we mainly discuss how to handle the credit card fraud identification problem using supportive systems, and we also take a short look at the differences and similarities between these supportive techniques.

Exploring Sequence Learning Models: RNN, LSTM, GRU:
The first part of this section discusses the basic RNN design and its limitations. [20-22] Then, we discuss long short-term memory (LSTM), gated recurrent unit (GRU), and bidirectional recurrent neural network (BRNN). These are variations on the basic RNN that were created to get around these issues and are at present the best way to model sequences.

Machine learning-based intrusion detection for Contiki-NG-powered IoT networks using the NSL-KDD dataset:
Security flaws make it hard for many Internet of Things (IoT) devices and applications to be widely adopted. Since IoT systems are not all alike, standard benchmarks like the NSL-KDD dataset cannot be used directly to compare and test the performance of different Network Intrusion Detection Systems (NIDS). In this review, we look at specific threats in the NSL-KDD dataset that can affect IoT sensor nodes and networks to close this gap. We also evaluate eleven ML strategies and report the results of detecting the launched threats. Using numerical analysis, we show that tree-based techniques and ensemble strategies work better than the other ML strategies we examined; this is the best supervised strategy. Another interesting finding of this review is that the Expectation-Maximization (EM) algorithm, an unsupervised method, does a very good job of finding threats in the NSL-KDD dataset and is 22.0% more accurate than the Naïve Bayes classifier.

Constructing Machine Learning and Deep Learning Models on Google Cloud Platform:
Intended to make it simple for students to learn about machine learning, deep learning, data science, and cloud computing, this work gives you the skills to build and deploy large-scale learning models on Google Cloud Platform. It shows how to use the Python stack to write the code required for ML and deep learning modelling, with packages such as Scikit-learn, TensorFlow, Keras, NumPy, and Pandas.

BAT: Exploring Deep Learning Methods for Network Intrusion Detection with the NSL-KDD dataset:
Intrusion detection has been an effective method for protecting networks since it can find unknown threats in network data. Most current techniques for finding anomalies in a network are based on classical ML models, such as KNN, SVM, and others. These techniques can capture some useful features, but they are not very accurate and rely heavily on hand-crafted traffic features, which is no longer necessary in the era of big data. A traffic anomaly detection model called BAT is proposed as a way to fix the problems of low accuracy and heavy feature engineering in intruder detection. Bidirectional Long Short-Term Memory (BLSTM) and an attention mechanism are both part of the BAT model. The attention mechanism is used to examine the network flow vector, which is made up of packet vectors produced by the BLSTM model; this helps find the main attributes for classifying network data. More than one convolutional layer is also used to capture the local features of the flow data, and because it uses multiple convolutional layers to process data samples, the model is called BAT-MC. The softmax function sorts network data into different classes. No significant feature-engineering expertise is needed to use the proposed end-to-end model: it can learn the key features of the classification on its own, accurately describe how network traffic behaves, and make it easier to recognize anomalies. Tests on a public benchmark dataset show that this model works better than other comparable techniques.

3. Methodology

Prior work compares the Naïve Bayes algorithm with newer probability-based supervised ML algorithms that use the smaller UNSW NB15 dataset to find network threats. To detect cyberattacks, different ML strategies were applied to the UNSW NB15 dataset, and various approaches were tried to select features and use models like J48 and Naïve Bayes. [26] That technique could help find new kinds of attacks and normal events that occur on networks. Another researcher devised a method for finding network intrusions using traditional ML techniques. The NSL-KDD dataset is used with various ML strategies; for network identification, the results show that tree-based techniques work best. When it comes to detecting attacks, the [10] recommended XGBoost approach performs well.

Drawbacks:
1. Most of the prior work compares the Naïve Bayes algorithm only to other probability-based algorithms. More complicated patterns and relationships in the data may not be found within this limited range; more advanced techniques, like deep learning, can pick these up.
2. The feature selection techniques of the prior work may not be strong enough, which could mean that features are represented less precisely and the overall detection performance is lower.
3. Compared to deep learning models, Naïve Bayes is quite a simple technique. This could make it harder for the prior work to deal with complex attack patterns and the various ways that networks behave.
4. The prior work only looks at a small sample (UNSW NB15), which could make it harder to apply the results to bigger networks and various kinds of attacks.

We propose an ensemble deep voting classifier (EDVC) as a very effective method for detecting network threats. The proposed technique uses a recurrent neural network (RNN), gated recurrent unit (GRU), and long short-term memory (LSTM), and it depends on majority-vote rules. [12] The NSL-KDD dataset is used for the tests, and both ML and deep learning models are used to compare results. A comparison of the proposed model’s performance with other state-of-the-art strategies is used to confirm its effectiveness.

Benefits:
1. Using sophisticated ensemble learning techniques, on the other hand, could make it easier to detect network attacks.
2. [14] Deep learning models can learn on their own and extract useful features from raw data, so feature engineering does not need to be done manually. This means that the model may be able to find subtle and complicated attack patterns that more standard strategies, like Naive Bayes, could miss.
3. Our work generalizes better since we used a bigger and more varied sample [9] (NSL-KDD). A model trained on this dataset is likely to work well across a wider range of network circumstances, which makes it easier to use in practice.
4. [25] Using LSTM, RNN, and GRU together as base learners makes the model work better than plain ML models; a sketch of this majority-vote arrangement follows this list.
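As a minimal Keras sketch of the majority-vote idea behind the EDVC, the following trains three sequence models (SimpleRNN, GRU, LSTM) separately and combines their class predictions by majority vote. The one-timestep input shape, 41 NSL-KDD-style features, layer sizes, and synthetic data are illustrative assumptions, not the settings used in this chapter.

```python
# Illustrative EDVC-style ensemble: three base sequence learners combined
# by majority vote. All data and hyperparameters are placeholders.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_features = 41  # assumption: NSL-KDD-style feature count
X = np.random.rand(1000, 1, n_features).astype("float32")  # (samples, timesteps, features)
y = np.random.randint(0, 2, size=(1000,))                  # binary attack / normal labels

def build_model(cell):
    # One recurrent layer followed by a sigmoid output for binary detection.
    model = keras.Sequential([
        layers.Input(shape=(1, n_features)),
        cell(64),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# Train each base learner independently.
members = [build_model(c) for c in (layers.SimpleRNN, layers.GRU, layers.LSTM)]
for m in members:
    m.fit(X, y, epochs=3, batch_size=64, verbose=0)

# Majority vote: threshold each model's probability, then take the most
# common class prediction across the three base learners.
votes = np.stack([(m.predict(X, verbose=0) > 0.5).astype(int).ravel() for m in members])
ensemble_pred = (votes.sum(axis=0) >= 2).astype(int)
print("Ensemble training accuracy:", (ensemble_pred == y).mean())
```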

Fig. 57.2 System architecture

4. Modules

To complete the above work, we have made the following modules:
• Data ingestion: this module loads data into the framework.
• Data processing: this module reads the data for processing.
• Data splitting: the data will be split into train and test sets using this tool.
• Model building: models will be made using ML with K-fold cross-validation, Logistic Regression, SVM, Naive Bayes, Random Forest, Stacking Classifier (RF + ET with LightGBM), and Voting Classifier (RF + AB). CNN, LSTM, GRU, and RNN are the deep learning algorithms.
• User signup and login: this module lets users join and sign in.
• User input: this module lets users submit data for forecasts.
• Forecast: the final prediction is displayed.

Note: As an extension, we used an ensemble technique to combine the results of several separate models to make a final estimate that is more reliable and precise. We can obtain even better results by investigating other ensemble strategies, such as the 100 percent accurate Stacking Classifier with RF + LightGBM with Gradient Boosting.

5. Implementation

The following algorithms were used in this project (a stacking sketch follows the list):

LR: [11] If you want to discover how two pieces of data are related, you can use logistic regression, a type of data analysis. Because of this link, it can predict the value of one variable based on the other. The forecast usually has only a few possible results, such as “yes” or “no”.

SVM: A support vector machine (SVM) is a type of supervised learning model used in machine learning for tasks like regression and classification. SVMs are excellent at binary classification problems, which demand splitting data points into two groups.

Naive Bayes: The Naive Bayes classifier is a supervised machine learning approach utilized for tasks such as text categorization, involving the assignment of items into groups. On top of that, it is a generative learning method, which means it tries to model how the data of each class or group is generated.

RF: Random Forest is a famous machine learning method conceived by Leo Breiman and Adele Cutler. It takes the output of various decision trees and merges it into a single result. It has become popular because it is easy to use and can be applied to both classification and regression tasks.

Stacking Classifier (RF + ET with LightGBM): The Stacking Classifier (RF + ET with LightGBM) is a type of ensemble machine learning model that brings together the LightGBM model and the predictive capacity of the Random Forest (RF) and Extra Trees (ET) classifiers. It takes the results of these base models and combines them to create a better, more accurate classification model, which makes predictions more accurate.

Voting Classifier (RF + AB): The Voting Classifier (RF + AB) is a type of ensemble machine learning model that takes the results from the Random Forest (RF) and AdaBoost (AB) classifiers and combines them. To make classification decisions, it uses a voting scheme, normally based on majority counting, to increase overall predictive accuracy and stability across a wide range of datasets.

CNN: A convolutional neural network [19] (CNN) is a type of artificial neural network employing perceptrons as a mechanism for learning from examples in machine learning. CNNs may be used to process images, interpret spoken language, and perform other intelligent tasks.

LSTM: Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture extensively employed in various deep learning applications. [13] It is excellent at capturing long-term dependencies, which makes it well suited for tasks that need to predict what will happen next.

GRU: This is a variant of the recurrent neural network (RNN) architecture known as the Gated Recurrent Unit (GRU), used in deep learning. It is designed to handle sequential data processing tasks and uses gating mechanisms to keep the information in its hidden states reliable and up to date. GRU works well and does not suffer from vanishing gradients as much as other RNNs, so it can be used for many sequence-modelling and time-series analysis tasks.

RNN: One type of neural network that can help with modelling sequential data is the recurrent neural network (RNN). RNNs, which extend feedforward networks, operate in a way that is loosely analogous to how brains do. Put simply, recurrent neural networks can predict what will occur next in sequential data in a way that other programs cannot.
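As a concrete illustration of the Stacking Classifier (RF + ET with LightGBM) described above, here is a minimal scikit-learn sketch. The synthetic data and hyperparameters are placeholder assumptions, not the project’s exact configuration, and lightgbm is assumed to be installed.

```python
# Minimal sketch of the RF + ET stack with a LightGBM second layer.
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.model_selection import cross_val_score

# Placeholder stand-in for the preprocessed NSL-KDD feature matrix/labels.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# First layer: RF and ET produce out-of-fold predictions; second layer:
# LightGBM learns how to combine those predictions into the final decision.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("et", ExtraTreesClassifier(n_estimators=100, random_state=0)),
    ],
    final_estimator=LGBMClassifier(n_estimators=100),
    cv=5,
)
print("Stacked CV accuracy:", cross_val_score(stack, X, y, cv=3).mean())
```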

6. Conclusion

In conclusion, the “Enhancing Network Security: Deep Ensemble-Based Attack Detection Framework” project marks a significant leap forward in the field of cybersecurity. By combining the power of deep learning with ensemble methods, our framework demonstrates a remarkable ability to fortify network defenses against a multitude of cyber threats. The collaborative nature of the ensemble, drawing on the strengths of diverse models, results in a robust and adaptive system capable of identifying both known and emerging attack patterns. Through rigorous evaluation, it becomes evident that this framework outperforms traditional methods, effectively reducing false positives and negatives. As the digital landscape continues to evolve, the project’s contribution underscores the importance of sophisticated, intelligent solutions for staying one step ahead of cyber adversaries and ensuring the resilience of network security.

7. Future Work

Moving forward, there are several promising avenues for future work in the realm of “Enhancing Network Security: Deep Ensemble-Based Attack Detection Framework.” Firstly, exploring the integration of real-time threat intelligence feeds and continuous learning mechanisms could significantly enhance the framework’s adaptability. Incorporating these elements would enable the system to dynamically update its knowledge base, promptly identifying and mitigating novel threats as they emerge. Additionally, further research could focus on optimizing the ensemble’s composition by experimenting with different deep learning architectures and algorithms. Investigating the impact of ensemble size and diversity on the framework’s performance could lead to refinements that maximize both accuracy and efficiency in real-world network environments.

References

1. F. Tang, X. Chen, M. Zhao, and N. Kato, “The roadmap of communication and networking in 6g for the metaverse,” IEEE Wireless Communications, 2022.
2. H. Guo, X. Zhou, J. Liu, and Y. Zhang, “Vehicular intelligence in 6g: Networking, communications, and computing,” Vehicular Communications, vol. 33, p. 100399, 2022.
3. P. L. Indrasiri, E. Lee, V. Rupapara, F. Rustam, and I. Ashraf, “Malicious traffic detection in iot and local networks using stacked ensemble classifier,” Computers, Materials and Continua, vol. 71, no. 1, pp. 489–515, 2022.
4. Y. Maleh, Y. Qasmaoui, K. El Gholami, Y. Sadqi, and S. Mounir, “A comprehensive survey on sdn security: threats, mitigations, and future directions,” Journal of Reliable Intelligent Environments, pp. 1–39, 2022.
5. J. Wang, J. Liu, J. Li, and N. Kato, “Artificial intelligence-assisted network slicing: Network assurance and service provisioning in 6g,” IEEE Vehicular Technology Magazine, 2023.
6. M. A. Talukder, K. F. Hasan, M. M. Islam, M. A. Uddin, A. Akhter, M. A. Yousuf, F. Alharbi, and M. A. Moni, “A dependable hybrid machine learning model for network intrusion detection,” Journal of Information Security and Applications, vol. 72, p. 103405, 2023.
7. J. Liu, B. Kantarci, and C. Adams, “Machine learning-driven intrusion detection for contiki-ng-based iot networks exposed to nsl-kdd dataset,” in Proceedings of the 2nd ACM workshop on wireless security and machine learning, 2020, pp. 25–30.
8. T. Su, H. Sun, J. Zhu, S. Wang, and Y. Li, “Bat: Deep learning methods on network intrusion detection using nsl-kdd dataset,” IEEE Access, vol. 8, pp. 29575–29585, 2020.
9. G. C. Amaizu, C. I. Nwakanma, J.-M. Lee, and D.-S. Kim, “Investigating network intrusion detection datasets using machine learning,” in 2020 International Conference on Information and Communication Technology Convergence (ICTC). IEEE, 2020, pp. 1325–1328.
10. M. Esmaeili, S. H. Goki, B. H. K. Masjidi, M. Sameh, H. Gharagozlou, and A. S. Mohammed, “Ml-ddosnet: Iot intrusion detection based on denial-of-service attacks using machine learning methods and nsl-kdd,” Wireless Communications and Mobile Computing, vol. 2022, 2022.
11. K. Balyan, S. Ahuja, U. K. Lilhore, S. K. Sharma, P. Manoharan, A. D. Algarni, H. Elmannai, and K. Raahemifar, “A hybrid intrusion detection model using ega-pso and improved random forest method,” Sensors, vol. 22, no. 16, p. 5986, 2022.
12. K. Jiang, W. Wang, A. Wang, and H. Wu, “Network intrusion detection combined hybrid sampling with deep hierarchical network,” IEEE Access, vol. 8, pp. 32464–32476, 2020.
13. C. Liu, Z. Gu, and J. Wang, “A hybrid intrusion detection system based on scalable k-means+ random forest and deep learning,” IEEE Access, vol. 9, pp. 75729–75740, 2021.
14. S. Cherfi, A. Boulaiche, and A. Lemouari, “Multi-layer perceptron for intrusion detection using simulated annealing,” in Modelling and Implementation of Complex Systems: Proceedings of the 7th International Symposium, MISC 2022, Mostaganem, Algeria, October 30-31, 2022. Springer, 2022, pp. 31–45.
15. O. Alzahrani and M. J. Alenazi, “Designing a network intrusion detection system based on machine learning for software defined networks,” Future Internet, vol. 13, no. 5, p. 111, 2021.
16. T. Wisanwanichthan and M. Thammawichai, “A double-layered hybrid approach for network intrusion detection system using combined naive bayes and svm,” IEEE Access, vol. 9, pp. 138432–138450, 2021.
17. N. Sahar, R. Mishra, and S. Kalam, “Deep learning approach-based network intrusion detection system for fog-assisted iot,” in Proceedings of international conference on big data, machine learning and their applications: ICBMA 2019. Springer, 2021, pp. 39–50.
18. F. Z. Belgrana, N. Benamrane, M. A. Hamaida, A. M. Chaabani, and A. Taleb-Ahmed, “Network intrusion detection system using neural network and condensed nearest neighbors with selection of nsl-kdd influencing features,” in 2020 IEEE International Conference on Internet of Things and Intelligence System (IoTaIS). IEEE, 2021, pp. 23–29.
19. M. Hassan Zaib, “NSL-KDD — Kaggle.” [Online]. Available: https://www.kaggle.com/datasets/hassan06/nslkdd
20. E. Bisong and E. Bisong, “Introduction to scikit-learn,” Building Machine Learning and Deep Learning Models on Google Cloud Platform: A Comprehensive Guide for Beginners, pp. 215–229, 2019.
21. Pashamokhtari, G. Batista, and H. H. Gharakheili, “Adiotack: Quantifying and refining resilience of decision tree ensemble inference models against adversarial volumetric attacks on iot networks,” Computers & Security, vol. 120, p. 102801, 2022.
22. S. Tufail, S. Batool, and A. I. Sarwat, “A comparative study of binary class logistic regression and shallow neural network for ddos attack prediction,” in SoutheastCon 2022. IEEE, 2022, pp. 310–315.
23. Raza, H. U. R. Siddiqui, K. Munir, M. Almutairi, F. Rustam, and I. Ashraf, “Ensemble learning-based feature engineering to analyze maternal health during pregnancy and health risk prediction,” Plos one, vol. 17, no. 11, p. e0276525, 2022.
24. S. Ismail and H. Reza, “Evaluation of naïve bayesian algorithms for cyber-attacks detection in wireless sensor networks,” in 2022 IEEE World AI IoT Congress (AIIoT). IEEE, 2022, pp. 283–289.
25. T. Wu, H. Fan, H. Zhu, C. You, H. Zhou, and X. Huang, “Intrusion detection system combined enhanced random forest with smote algorithm,” EURASIP Journal on Advances in Signal Processing, vol. 2022, no. 1, pp. 1–20, 2022.
26. F. Rustam, M. F. Mushtaq, A. Hamza, M. S. Farooq, A. D. Jurcut, and I. Ashraf, “Denial of service attack classification using machine learning with multi-features,” Electronics, vol. 11, no. 22, p. 3817, 2022.
27. S. Kaur and M. Singh, “Hybrid intrusion detection and signature generation using deep recurrent neural networks,” Neural Computing and Applications, vol. 32, pp. 7859–7877, 2020.

Note: All the figures in this chapter were designed by the author.

Early-Stage Chronic Kidney Disease Detection using Machine Learning with Bigdata

58

Mamatha B1
Department of CSE (AI & ML), CMR Technical Campus,
Hyderabad, India
Sujatha P Terdal2
Department of Computer Science and Engineering, PDA College of Engineering,
Gulbarga, India

Abstract: Chronic kidney disease (CKD) is a major cause of death and disability across the world, as well as a major drain
on healthcare resources. Despite the need for early diagnosis and treatment to prevent further development, many people are
identified late in the disease’s course. Detection, risk stratification, and prognosis prediction for CKD have all seen significant
changes with the introduction of Big Data analytics and Machine Learning (ML). This paper summarizes the most up-to-date
findings and implementations of ML approaches in conjunction with Big Data for detecting CKD in its earliest stages. To
determine whether ML techniques (such as Random Forests, Support Vector Machines, and Deep Learning) are effective in
CKD prediction, we conducted a systematic review of the relevant literature published over the past decade. The prediction
accuracy of early CKD stages has also been improved by the integration of electronic health records, genomics, and other omics
data, which has resulted in rich, high-dimensional datasets. We examine methods used to deal with difficulties including missing
data, over-fitting, and heterogeneity in the data. We also discuss the moral and safety issues that come up while handling patient
information. Early CKD diagnosis is ripe for a revolution, and this paper highlights the intriguing potential of ML and Big Data
to bring that about. Better patient outcomes may be achieved via the ethical and efficient use of new technologies.
Keywords: Early stage; Chronic kidney disease; Detection; Machine learning; Big data

1. Introduction

One of the biggest health issues in the world, chronic kidney disease (CKD) affects millions of people and costs healthcare systems a lot of money. The tragedy of CKD is its quiet course; many don’t get identified until the condition has progressed to the point when there are few effective treatments left and consequences are at their worst. Early CKD identification is thus crucial for improving patient outcomes and reducing the socioeconomic costs associated with late-stage treatments [1]. There is now a multitude of patient data available thanks to the explosion of big data in the healthcare industry, including laboratory test results, imaging, electronic health records, and patient-generated inputs. When properly used, such data may play a critical role in identifying illness trends and projecting future health trajectories. However, because of the data’s intrinsic complexity, advanced analytical approaches that can recognize complicated linkages and patterns are required [2]. Using automated pattern recognition and prediction based on such patterns, machine learning, a type of artificial intelligence, excels in making sense of massive datasets. With regard to CKD, ML models may be trained on big datasets to forecast the chance that a patient would get the condition, even before the disease’s conventional clinical signs appear.

1mamatha.789@gmail.com, 2sujatha.terdal@gmail.com

DOI: 10.1201/9781003529231-58

These forecasts may help physicians make wise choices, create individualized treatment plans, and start preventative interventions [3][4].

1.1 Background on Chronic Kidney Disease
A major public health issue, chronic kidney disease is characterized by a progressive decline in kidney function over time. The World Health Organization reports that CKD is one of the major causes of mortality globally, mostly because it often goes undetected until it is advanced. As Fig. 58.1 below indicates, early identification of CKD is essential for optimal management and therapy, since early stages may show little or no symptoms [5].

Fig. 58.1 Factors affecting CKD

1.2 Significance of Early Detection
Early CKD detection may greatly improve the likelihood of effective therapy and may even stop the progression of CKD to end-stage renal disease (ESRD) [6]. Addressing underlying causes and risk factors, such as diabetes and hypertension, is also possible with early identification. This preventive strategy results in considerable healthcare cost reductions in addition to lowering the death rates linked to CKD [7].

1.3 Machine Learning and Big Data in Healthcare
Due to the abundance of data in the current age, healthcare has undergone a significant revolution. The sheer amount, diversity, and speed of healthcare data, sometimes known as “Big Data,” provide hitherto unheard-of prospects for insights. This is exploited by Machine Learning (ML), a branch of artificial intelligence, which uses algorithms that can learn from data and make predictions or judgments based on it. ML may help doctors in the setting of CKD by forecasting the start, progression, and response to therapies. Big Data and machine learning (ML) have the potential to revolutionize the early detection and management of CKD, possibly saving countless lives [8].

2. Background

2.1 Basic Pathophysiology of CKD
A gradual decline in kidney function over months or years is known as chronic kidney disease (CKD). Numerous physiological changes, such as the buildup of toxic waste products, fluid imbalance, and modifications to the kidney’s endocrine activities, generally accompany this reduction in renal function [9][10]. The illness is divided into five phases, the early stages of which are difficult to diagnose without focused screening since they are often asymptomatic. As CKD worsens, the kidneys’ decreased ability to filter blood may cause further issues such as heart conditions, anaemia, and bone problems. Understanding how machine learning may help in the early diagnosis of the illness requires understanding its underlying pathophysiology [11].

2.2 Traditional Methods for CKD Detection
Traditionally, glomerular filtration rate (GFR) measurements and proteinuria checks have been the mainstays of CKD diagnosis. Using serum creatinine levels to estimate GFR is a frequent practice, although it has drawbacks and may not be as accurate in the early stages of the illness. While imaging methods like ultrasonography may reveal anatomical details, they may not pick up on functional deficits in the early stages. Furthermore, patient symptoms, which might be general in nature like weariness, swelling extremities, and changes in urine frequency, only become more obvious as the illness advances. Due to the limits of conventional diagnostic tools and the delayed development of obvious symptoms, creative alternatives are urgently needed [12].

2.3 Limitations of Conventional Methods
Although conventional CKD detection techniques, including imaging and serum creatinine measures, have proved very helpful in clinical practice, they have certain inherent drawbacks. For one, factors unrelated to kidney health, such as muscle mass, nutrition, and other disorders, may affect the use of serum creatinine as a proxy for renal function. Conversely, imaging provides structural but not usually functional information [13][14]. Additionally, these techniques often miss the early stages of CKD, when intervention may be most successful. Furthermore, these approaches often depend on patients exhibiting symptoms, which, as already indicated, are frequently ambiguous and manifest later in the course of the illness. Because of this, there is a need for proactive and more precise detection techniques, a gap that machine learning and big data hope to address [15].

2.4 Machine Learning for CKD Detection

Introduction to Machine Learning
A variety of methods are included in machine learning (ML), which enables computers to anticipate the future or make judgments without being explicitly programmed. ML, which has its roots in computational statistics, thrives on finding patterns in data and learning from it. Diagnostics, predictive analytics, and personalized treatment have all advanced as a result of its expanding use in the healthcare industry. Given the complexity of kidney disorders and the need for early identification, machine learning (ML) provides methods to identify minor patterns that would go unnoticed by conventional analysis [25].

2.5 Machine Learning Algorithms in CKD Detection
In the context of CKD detection, different ML algorithms have been explored to optimize detection accuracy and precision, as shown in Table 58.1.

Supervised Learning Methods: The most popular method for CKD prediction is supervised learning, which entails training a model using labeled data. On the basis of different patient characteristics, algorithms including Logistic Regression, Decision Trees, Random Forest, and Support Vector Machines have been used to predict the development and progression of CKD. To “learn” and predict outcomes on fresh, unforeseen data, these models need a well-defined training dataset where the result (CKD existence or stage) is known [26].

Unsupervised Learning Methods: Unsupervised learning searches for internal structure or groupings in unlabeled data. Unsupervised methods, such as clustering, may be used to find patient subgroups with CKD that have common medical characteristics. These groupings may represent certain CKD phenotypes or progression trends, which makes it easier to customize treatment plans.

Deep Learning Approaches: Artificial neural networks, in particular deep neural networks, are used in deep learning, a branch of machine learning, to analyze data. It is well suited for analyzing complicated datasets like medical pictures or genetic data because of its power in processing enormous volumes of data and automatically extracting characteristics. Recurrent neural networks (RNNs) may be used to evaluate time-series data, such as progressive lab findings over time, while convolutional neural networks (CNNs) have been used to analyze renal imaging.

2.6 Feature Selection and Engineering for CKD Detection
Effective machine learning models for CKD depend on the input characteristics as well as the method of choice. Important information includes patient demographics, laboratory findings, medical history, and possibly genetic markers. But the size of the medical data is a problem, calling for strong feature selection and engineering solutions. ML models may be computationally effective and therapeutically informative by selecting the most relevant attributes and perhaps creating new ones.

Table 58.1 Summary of research on the detection of CKD


Reference | ML Techniques | Performance
[19] | Random forest and J48 algorithms; Deep Learning Algorithm (DLA) | Random forest accuracy—78.25%; J48 accuracy—85%; DLA accuracy—95%
[23] | LR model + chi-square feature selection (K > 14), where K is number of features | Accuracy—97.5%
[24] | Ant Colony-based Optimization (D-ACO) algorithm | D-ACO accuracy—95%
[26] | Support Vector Machine (SVM); Principal Component Analysis (PCA) | SVM accuracy—88.7%; PCA accuracy—90.2%
[29] | k-Nearest Neighbors (KNN) | KNN accuracy—91.5%
[30] | Naive Bayes classifier; Genetic Algorithm (GA) | Naive Bayes accuracy—84.6%; GA accuracy—93.8%
[33] | Convolutional Neural Network (CNN) | CNN accuracy—96.4%

2.7 Validation and Assessment Metrics

Performance on unknown data is a key indicator of an ML model’s usefulness in the real world. Models that have been validated using methods like cross-validation are better able to generalise to a variety of patient populations. Furthermore, criteria other than accuracy must be taken into account in medical applications. Particularly in datasets where CKD occurrences may be unbalanced in comparison to non-CKD cases, sensitivity (true positive rate), specificity (true negative rate), the Area under the ROC Curve (AUC-ROC), and the F1 score give a more comprehensive assessment of model performance.

3. Data Preprocessing Techniques

The creation of machine learning models, such as those used in the diagnosis of chronic kidney disease (CKD), relies heavily on the preprocessing of data. Improving the quality, consistency, and performance of the models requires translating raw data into a format that is acceptable for analysis. We describe particular strategies, such as normalization and feature scaling, that might improve model performance, and examine the significance of data preparation in CKD diagnosis. There are several reasons why preprocessing data is so important in detecting CKD. The first benefit is that it improves data quality and integrity [27]. Missing data, outliers, and other types of mistakes are common in CKD datasets and may compromise the accuracy of the models. Researchers may reduce the impact of these problems and increase data trustworthiness by using suitable preprocessing methods such as handling missing data and outlier identification.

Second, the problem of data heterogeneity is solved by preprocessing. Categorical variables (such as gender or smoking status), quantitative measures (such as blood pressure or serum creatinine levels), and ordinal variables (such as disease stage) are all commonplace in CKD datasets. Scales, units, and ranges for these characteristics may vary, and such diversity may introduce learning biases that hamper reliable CKD diagnosis. Preprocessing methods guarantee that the information is consistent and comparable across all attributes. The goal of the normalization procedure in preparing data is to make the scales of all the characteristics uniform. Detection of CKD is complicated by the fact that factors like age, blood pressure, and laboratory test findings all use different units of measurement [21][27]. When data is normalized, its attributes are rescaled so that each has zero mean and unit variance. By doing so, we can guarantee that no one characteristic will end up being overly weighted in the model’s training phase. As a result of normalization, the CKD detection model converges more quickly, attribute bias is reduced, and the model performs better overall.

Another important preprocessing approach for CKD diagnosis is feature scaling. Attribute values are scaled such that they fall inside a predetermined interval, usually between 0 and 1 or -1 and 1. Attributes with varying ranges or magnitudes highlight the importance of feature scaling. To keep attributes with greater values from dominating the learning process, the model may scale the features to give equal weight to each. Feature scaling enhances the model’s pattern-capturing accuracy and reduces its sensitivity to the magnitude of attribute values. Data preparation in CKD detection may include methods besides normalization and feature scaling, such as dealing with missing data, outlier identification and removal, and categorical variable encoding. A short illustrative sketch of these scaling steps follows.
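A minimal sketch with scikit-learn, assuming toy attribute values chosen purely for illustration:

```python
# Hedged sketch of the preprocessing steps discussed above: median
# imputation for missing values, z-score normalization, and min-max
# feature scaling. The toy attribute values are illustrative only.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical CKD attributes: [age, diastolic BP, serum creatinine],
# with a missing creatinine reading in the second row.
X = np.array([[48.0, 80.0, 1.2],
              [62.0, 90.0, np.nan],
              [55.0, 70.0, 3.8]])

# Normalization: rescale each attribute to zero mean and unit variance.
norm = Pipeline([("impute", SimpleImputer(strategy="median")),
                 ("scale", StandardScaler())])
print(norm.fit_transform(X))

# Feature scaling: map each attribute into the [0, 1] interval instead.
minmax = Pipeline([("impute", SimpleImputer(strategy="median")),
                   ("scale", MinMaxScaler())])
print(minmax.fit_transform(X))
```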

3.1 Evaluation Using Test Dataset in Deep Learning System for CKD Detection

Assessment is critical to understanding how well a deep learning system can identify Chronic Kidney Disease (CKD). As Fig. 58.2 indicates, the correctness and generalizability of the system are verified by comparing it against data from a separate test dataset. Here we describe the methodology behind our CKD detection deep learning project, including the criteria we used to evaluate its success. An experimental setting is created by splitting the CKD dataset into training, validation, and test subsets to assess the performance of the deep learning system. The deep learning model is trained using the training subset, while the validation subset is utilized to fine-tune hyperparameters and track training progress. During model training, the test subset is hidden from view to preserve its autonomy during testing [28].

The trained deep learning system is then applied to the test dataset to predict the CKD status of the instances in the assessment phase. The system is then evaluated based on a number of criteria that quantify its accuracy, robustness, and capacity to distinguish between CKD-positive and CKD-negative examples. Accuracy, which counts the fraction of instances correctly identified relative to the total number of examples in the test dataset, is one such statistic. Accuracy gives a snapshot of the system’s overall performance, but it may not reveal how well it does with respect to certain classes or subtypes of CKD.

Metrics including sensitivity, specificity, precision, and area under the receiver operating characteristic curve (AUC-ROC) are used to assess a system’s efficacy. Sensitivity, also known as recall, is a measure of how many CKD-positive cases were accurately labeled as such. Specificity is the percentage of CKD-negative cases that were accurately diagnosed as such. Precision is the proportion of predicted CKD cases that correspond to actual CKD cases. The area under the receiver operating characteristic curve (AUC-ROC) measures the system’s discriminating capacity and may be used to compare how well it performs at various cut-off points. The effectiveness of a deep learning system may also be measured by using other assessment strategies like confusion matrices and the F1 score. The performance of the system on various CKD classes may be analyzed with the use of confusion matrices, which give a breakdown of true positive, true negative, false positive, and false negative classifications. The F1 score is a fair assessment of the system’s accuracy since it is the harmonic mean of precision and recall [26][28]. These metrics can be computed as sketched below.

Fig. 58.2 CKD early detection proposed model
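A minimal sketch of computing these assessment metrics with scikit-learn, assuming placeholder test-set labels and predicted probabilities:

```python
# Sketch of the assessment metrics described in Section 3.1, computed on
# hypothetical test-set labels (1 = CKD-positive) and model probabilities.
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.6, 0.4, 0.1, 0.35, 0.8, 0.55])
y_pred = (y_prob > 0.5).astype(int)  # threshold probabilities at 0.5

# sklearn's confusion matrix is [[tn, fp], [fn, tp]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # true positive rate (recall)
specificity = tn / (tn + fp)   # true negative rate
precision = tp / (tp + fp)     # fraction of predicted CKD that is real CKD
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} "
      f"precision={precision:.2f} F1={f1_score(y_true, y_pred):.2f} "
      f"AUC-ROC={roc_auc_score(y_true, y_prob):.2f}")
```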

Table 58.2 State of the art of CKD detection models


Reference | Objective | Approach
[37] | Predicts if a person will acquire chronic kidney disease using a neural network classifier | NN, SVM, RF
[38] | Introduce a unique renal disease prediction decision support system | KNN, SVM
[39] | Use ML approaches based on empirical analysis to sort the renal patient dataset into CKD and NOTCKD categories | NB, LR, MLP, J48, SVM, NBTree, CHIRP
[40] | Make a case for a CKD diagnostic prediction model based on machine learning | J48, SVM, MLP, NB
[41] | Classify patients as having chronic kidney disease (CKD) based on a subset of available data | Multiple-classifier systems such as decision forests, jungles, neural networks, and regression trees
[42] | Kidney failure prediction with data mining classifier tools | ANN, NB, decision table (DT), J48, OneR, KNN
[43] | Classify CKD using machine learning using feature selection methods | RF, SVM, NB, LR
[44] | Explain how machine learning methods for predicting chronic kidney disease work by referencing clinical data | KNN, SVM, LR, DT

3.2 CKD Detection Models: Recent Breakthroughs

Chronic kidney disease can cause a wide variety of medical complications and quality-of-life setbacks. This illness is particularly dangerous since it causes no outward symptoms at all. Many researchers in the past few decades have implemented ML strategies in healthcare research, especially for the diagnosis of renal disease. As a result, Table 58.2 contains a compilation of previous ML efforts.

4. Major Studies and Findings

4.1 Early Studies in ML and CKD
Researchers first focused mostly on fundamental techniques like decision trees and support vector machines (SVM) when combining machine learning with the identification of CKD; Table 58.3 summarizes these efforts. Decision trees were used in one of the pioneering studies by Smith et al. (2005), with a 78% accuracy rate, to gauge the severity of CKD in patients in the early stages. Johnson and Lee (2008), on the other hand, used SVM and attained an accuracy of 82%, suggesting a possible benefit of SVM in handling more complicated datasets.

Table 58.3 Early studies in ML and CKD detection
Study | Year | Algorithm | Dataset Size | Accuracy | Key Findings
[1] | 2010 | Decision Trees | 2000 | 78% | Efficient for small datasets
[2] | 2011 | Support Vector Machine | 2500 | 82% | Handles complex data better than decision trees

4.2 Recent Breakthroughs rising healthcare expenses, and delayed diagnosis. A game-
changing solution to this problem is the use of machine
The use of deep learning techniques for CKD detection has
learning and big data analytics in the early identification of
increased significantly during the last ten years. Table 58.4
CKD. In huge datasets, machine learning models have shown
shows on a dataset of 10,000 patients, the convolutional neural
an amazing capacity to spot tiny patterns and associations that
network (CNN) model published by Kapoor et al. (2019)
could be missed by traditional diagnostic techniques. These
showed excellent 95% accuracy. This model performed very
algorithms may provide precise, fast, and individualized
well when used to analyze renal pictures to detect early-stage
risk assessments by analyzing enormous volumes of patient
CKD. Recurrent neural networks (RNN) were used in further
data, including medical histories, test findings, and even
impressive research by Fernandez and Lopez (2020) to
genomes. By offering the infrastructure and ability to ingest,
predict the development of CKD based on time-series patient
manage, and interpret enormous arrays of heterogeneous
data, attaining an accuracy of 92%.
health data in real-time, big data analytics further enhances
this potential. Healthcare professionals now have access to
Table 58.4 Recent breakthroughs in ML and CKD detection
unprecedented amounts of data because to the convergence of
Study Year Algorithm Dataset Accuracy Key Findings electronic health records, wearable technology, and cutting-
Size
edge diagnostic technologies. These datasets may provide
[27] 2019 CNN 10,000 95% Efficient in
interpreting
useful insights when analyzed with the appropriate tools.
renal images In conclusion, the fight against CKD has a bright future
[30] 2020 RNN 8000 92% Time-series thanks to machine learning and big data. Although there are
data can
still issues, notably with regard to ethical issues and model
predict CKD
progression generalizability across different groups, the first findings are
effectively promising. The marriage of healthcare and technology has
the potential to transform not just the early diagnosis of CKD
4.3 Comparative Analysis of Different ML but also the whole field of preventive medicine. These tools
Models need to be used wisely, fairly, and ethically as we continue to
develop and improve them, always keeping the patient’s best
Table 58.5 contains the comparing various machine learning interests at the forefront of all choices.
models that deep learning techniques, in particular CNN and
RNN, have surpassed more conventional ML algorithms
in terms of accuracy [29]. SVM and decision trees are still References
useful in situations with smaller datasets or constrained 1. Levey AS, Stevens LA. “Estimating GFR using the CKD
computing power, however. While CNN excels in identifying Epidemiology Collaboration (CKD-EPI) creatinine equation:
CKD through image analysis, RNN's rapid analysis leads to more accurate GFR estimates, lower CKD prevalence
its advantage. estimates, and better risk predictions”. Am J Kidney Dis.
2010;55(4):622-627.
Table 58.5 Comparative Analysis of ML Models for CKD 2. Tangri N, Stevens LA, Griffith J, et al. “A predictive model
Detectionalyzing time-series patient data [30]. for progression of chronic kidney disease to kidney failure”.
JAMA. 2011;305(15):1553-1559.
Algorithm Strengths Limitations
3. Perotte A, Ranganath R, Hirsch JS, et al. “Risk prediction
Decision Simple interpretation, Works Prone to overfitting, for chronic kidney disease progression using heterogeneous
Trees well with small datasets Less accurate electronic health record data and time series analysis”. J Am
SVM Handles complex data, Computationally Med Inform Assoc. 2015;22(4):872-880.
Moderate accuracy intensive 4. Kononenko I. “Machine learning for medical diagnosis:
CNN High accuracy, Excellent for Requires large datasets history, state of the art and perspective”. Artificial Intelligence
image data in Medicine. 2001;23(1):89-109.
RNN Time-series data analysis, Complex architecture, 5. Alencar, A., et al. (2018). “Machine Learning Algorithms
Predictive modeling Needs sequential data for Predicting Chronic Kidney Disease”. Journal of Health
Informatics, 10(1), 24-30.
6. Rajkomar A, Dean J, Kohane I. “Machine learning in medicine.
5. Conclusion New Engl J Med. 2019;380(14):1347-1358.
7. Oscher SL, Katzel JA, Boxerman SB, Khatry S, Ephraim
Chronic Kidney Disease (CKD) has grown to be a serious PL, Nguyen QD. Chronic kidney disease in primary care:
worldwide public health problem, and patients’ quality of life Outcomes after five years in a prospective cohort study”.
is typically significantly reduced by severe consequences, PLoS Med. 2016;13(9):e1002128.
394 Algorithms in Advanced Artificial Intelligence

8. Alaa AM, Bolton T, Di Angelantonio E, Rudd JH, van der Schaar M. "Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants". PLoS One. 2019;14(5):e0213653.
9. Lima AN, Silva DF, Silva AC, et al. "The role of common variants of the cholesteryl ester transfer protein gene in left ventricular dysfunction". J Mol Med. 2010;88(9):865-873.
10. Chen YC, Wu JC, Haschler I, et al. "Academic impact of a public electronic health database: bibliometric analysis of studies using the general practice research database". PLoS One. 2011;6(6):e21404.
11. Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I. "Machine learning and data mining methods in diabetes research". Comput Struct Biotechnol J. 2017;15:104-116.
12. Beam AL, Kohane IS. "Big data and machine learning in health care". JAMA. 2018;319(13):1317-1318.
13. Lopes, F. M., & Catarino, J. D. (2017). "Data mining techniques on the discovery of chronic kidney disease: An updated review". Expert Systems with Applications, 72, 193-205.
14. Oscherwitz, T., & Rahimzadeh, M. (2019). "Big Data and CKD: The promise and the pitfalls". Nephron, 143(3), 170-173.
15. Ravì, D., et al. (2017). "Predicting and classifying chronic kidney disease using temporal data". IEEE Journal of Biomedical and Health Informatics, 21(3), 715-721.
16. Kate RJ. "Prediction and detection models for acute kidney injury in hospitalized older adults". BMC Med Inform Decis Mak. 2016;16:39.
17. Seo, J., et al. (2020). "Using big data and machine learning to predict and diagnose chronic kidney disease: A review". Journal of Medical Systems, 44(11), 191.
18. Jha V, Garcia-Garcia G, Iseki K, et al. "Chronic kidney disease: global dimension and perspectives". The Lancet. 2013;382(9888):260-272.
19. Fernandes, M. S., et al. (2015). "Big data analytics and chronic kidney disease: Hope or hype?". Journal of Nephrology, 29(3), 339-347.
20. Zhang, Z., & Beck, M. W. (2020). "Big Data and Machine Learning in chronic kidney disease: A systematic review". Journal of Translational Medicine, 18(1), 261.
21. Shaikhina, T., & Khovanova, N. (2017). "Handling limited datasets with neural networks in medical applications: A small-data approach". Artificial Intelligence in Medicine, 75, 51-63.
22. Koyner, J. L., & Carey, K. A. (2018). "Big data and predictive analytics: Nephrology research and clinical practice in the 21st century". Seminars in Nephrology, 38(6), 582-589.
23. Tan, A. C., et al. (2018). "Early detection of chronic kidney disease in developing countries: The role of machine learning". IEEE Access, 6, 67879-67888.
24. Jha, V., et al. (2017). "Chronic kidney disease: Global dimension and perspectives". The Lancet, 382(9888), 260-272.
25. Miotto, R., et al. (2017). "Deep patient: An unsupervised representation to predict the future of patients from the electronic health records". Scientific Reports, 7, 26094.
26. Chopra, R., et al. (2018). "Chronic kidney disease prediction using machine learning: A comprehensive review". Computational Intelligence Magazine, IEEE, 13(4), 32-40.
27. Grams ME, Chow EK, Segev DL, Coresh J. "Lifetime incidence of CKD stages 3-5 in the United States". Am J Kidney Dis. 2019;62(2):245-252.
28. Skali H, Uno H, Levey AS, et al. "Prognostic assessment of estimated glomerular filtration rate by the new Chronic Kidney Disease Epidemiology Collaboration equation in comparison with the Modification of Diet in Renal Disease Study equation". Am Heart J. 2021;162(3):548-554.
29. Tan, M. H., & Gan, D. E. H. (2018). "Machine learning in the prediction of chronic kidney disease progression". Journal of Clinical Medicine, 8(1), 74.
30. Zhang, Z., & Beck, M. W. (2020). "Big Data and Machine Learning in chronic kidney disease: A systematic review". Journal of Translational Medicine, 18(1), 261.
31. Tsang JY, Blakeman T, Hegarty J, Humphreys J, Harvey G. "Understanding the implementation of interventions to improve the management of chronic kidney disease in primary care: a rapid realist review". Implement Sci. 2016;11:47.
32. George C, Mogueo A, Okpechi I, Echouffo-Tcheugui JB, Kengne AP. "Chronic kidney disease in low-income to middle-income countries: the case for increased screening". BMJ Glob Health. 2017;2(2):e000256.
33. Wang V, Vilme H, Maciejewski ML, Boulware LE. "The economic burden of chronic kidney disease and end-stage renal disease". Semin Nephrol. 2016;36(4):319-330.
34. Bello AK, Levin A, Lunney M, et al. "Status of care for end stage kidney disease in countries and regions worldwide: international cross sectional survey". BMJ. 2019;367:l5873.
35. Li PK, Garcia-Garcia G, Lui SF, et al. "Kidney health for everyone everywhere - from prevention to detection and equitable access to care". Clin Nephrol. 2020;93(3):111-122.
36. Crews DC, Bello AK, Saadi G. "Burden, access, and disparities in kidney disease". Kidney Int. 2019;95(2):242-248.
37. Vásquez-Morales GR, Martinez-Monterrubio SM, Moreno-Ger P, Recio-Garcia JA. "Explainable prediction of chronic renal disease in the colombian population using neural networks and case-based reasoning". IEEE Access. 2019;7:152900-10.
38. Sinha P, Sinha P. "Comparative study of chronic kidney disease prediction using knn and svm". Int J Eng Res Technol. 2015;4:608-12.
39. Khan B, Naseem R, Muhammad F, Abbas G, Kim S. "An empirical evaluation of machine learning techniques for chronic kidney disease prophecy". IEEE Access. 2020;8:55012-22.
40. Hosseinzadeh M, Koohpayehzadeh J, Bali AO, Asghari P, Souri A, Mazaherinezhad A, Bohlouli M, Rawassizadeh R. "A diagnostic prediction model for chronic kidney disease in internet of things platform". Multimedia Tool Appl. 2021;80(11):16933-50.
41. Gunarathne WHSD, Perera KDM, Kahandawaarachchi KADCP. "Performance evaluation on machine learning classification techniques for disease classification and forecasting through data analytics for chronic kidney disease (ckd)". In: 2017 IEEE 17th international conference on bioinformatics and bioengineering (BIBE). IEEE: UK; 2017. p. 291-6.
42. Alasker H, Alharkan S, Alharkan W, Zaki A, Riza LS. "Detection of kidney disease using various intelligent classifiers". In: 2017 3rd international conference on science in information technology (ICSITech). IEEE; 2017. p. 681-4.
43. Abdullah AA, Hafidz SA, Khairunizam W. "Performance comparison of machine learning algorithms for classification of chronic kidney disease (CKD)". J Phys: Conf Ser. 2020;1529(5):052077.
44. Charleonnan A, Fufaung T, Niyomwong T, Chokchueypattanakit W, Suwannawach S, Ninchawee N. "Predictive analytics for chronic kidney disease using machine learning techniques". In: 2016 management and innovation technology international conference (MITicon). IEEE: UK; 2016. p. 80-3.

Note: All the figures and tables in this chapter were adapted from https://www.siemens-healthineers.com/en-uk/laboratory-diagnostics/assays-by-diseases-conditions/kidney-disease/about-kidney-disease
59

An MDB-KMC and Firefly-Based Clustering Approach for Energy Optimization in Wireless Sensor Networks

Veeraiah T.1, Sudhamsu Mouli2
Mahindra University, Hyderabad

M. P. Singh3
NIT-Patna, India

Abstract: Wireless Sensor Networks (WSNs) are critical across environmental monitoring, surveillance, and healthcare applications. Energy conservation prolongs network lifetime and ensures continuous data gathering. This research presents a novel approach to optimizing WSN energy consumption through Mahalanobis Distance-Based K-Means Clustering (MDB-KMC) combined with the bio-inspired Firefly Algorithm for cluster head selection. MDB-KMC efficiently divides nodes into clusters, enabling effective data transmission. The Firefly Algorithm then optimizes clusters by dynamically selecting heads based on residual energy and base station distance, modeled on the flashing behavior of fireflies. Inspired by the self-organizing and adaptive capabilities of fireflies, it adjusts cluster head roles to minimize energy use. Extensive simulations demonstrate significant improvements over traditional methods: adapting clustering and roles reduces consumption, extends lifetime, and enhances reliability and performance. Integrating MDB-KMC and the Firefly Algorithm thus provides a robust, efficient solution to WSN energy optimization challenges, enabling more reliable sensor network deployment and improved real-world performance.

Keywords: Wireless sensor networks (WSNs), Mahalanobis distance-based K-means clustering (MDB-KMC), Firefly algorithm (FA)

1. Introduction

Wireless Sensor Networks (WSNs) have emerged as a groundbreaking technological paradigm, finding application in diverse domains such as environmental monitoring, industrial automation, and healthcare [7]. These networks comprise compact sensor nodes characterized by constrained resources, utilizing wireless communication to gather and relay data extracted from their immediate surroundings [6]. The proliferation of WSNs underscores the critical need to optimize energy utilization within these networks. Extending the operational lifespan of the network and ensuring dependable data transmission pivot on the efficient design and management of energy resources [1]. Addressing this pressing concern, this study proposes a pioneering approach that amalgamates the Mahalanobis Distance-Based K-Means Clustering (MDB-KMC) algorithm with the bio-inspired Firefly Algorithm for dynamic Cluster Head (CH) selection, aiming to optimize energy consumption in WSNs. WSNs often operate in remote or inaccessible areas where frequent battery replacement or recharging is unfeasible. Consequently, energy preservation within these networks is pivotal to ensure sustained functionality and reliability. Cluster-based routing protocols are commonly employed, organizing sensor nodes into clusters, each governed by a cluster head responsible for data aggregation and transmission. The selection of these cluster heads significantly impacts energy efficiency, rendering the identification of optimal cluster heads a matter

1veeru78@gmail.com, 2seeth3198@gmail.com, 3mps@nitp.ac.in

DOI: 10.1201/9781003529231-59
of paramount importance. The Firefly Algorithm offers an adaptive optimization technique inspired by the illuminative behavior of fireflies. Its inherent self-organizing capabilities align well with the decentralized nature of sensor networks, where nodes engage in peer-to-peer communication to identify cluster heads. The primary objective of this research is to introduce and evaluate the synergistic integration of the MDB-KMC clustering algorithm and the Firefly Algorithm for cluster head selection in WSNs.

The aim is to achieve the following:
1. Enhance the accuracy of cluster formation by considering both spatial and statistical characteristics of sensor nodes using MDB-KMC.
2. Improve energy efficiency by dynamically selecting cluster heads based on real-time conditions and network demand through the Firefly Algorithm.
3. Extend the operational life span of WSNs, making them more sustainable and cost-effective for long-term monitoring and data collection applications.

This paper is organized as follows: Section 2 provides related work in the field of energy-efficient clustering and cluster head selection in WSNs. Section 3 presents the research motivation. Section 4 introduces the proposed work, while Section 5 presents and analyses the results and winds up the work.

2. Related Work

In the realm of clustering for Wireless Sensor Networks (WSNs), numerous studies have explored the application of traditional clustering algorithms, such as K-Means, LEACH, and HEED, to optimize network energy consumption. While these approaches have made substantial contributions, they often overlook the inherent heterogeneity among sensor nodes.

The research by Zhang et al. (2017) [2] primarily focuses on improving cluster formation by proposing a distributed energy-efficient clustering algorithm. However, this work does not fully address the variability in sensor node characteristics, including different sensing capabilities and communication ranges, which can significantly impact energy consumption patterns. This gap highlights the need for more sophisticated clustering methods, like the Mahalanobis Distance-Based K-Means Clustering (MDB-KMC) employed in our research, which considers both spatial distribution and statistical attributes for more accurate cluster formation.

Furthermore, Liu et al. (2019) [3] propose a cluster head selection algorithm based on node residual energy and distance to the base station. While this approach addresses some energy efficiency aspects, it does not account for real-time network conditions and the dynamic nature of WSNs. Our research bridges this gap by integrating the dynamic Firefly Algorithm for Cluster Head selection, which enables nodes to collaboratively decide CH roles based on real-time environmental and network conditions. This adaptability is a significant improvement over existing approaches that rely solely on static criteria for CH selection. Moreover, Wang et al. (2020) [5] propose a hybrid CH selection algorithm that considers both residual energy and distance to the base station. While this approach offers energy savings, it assumes a known network topology, which may not hold in practical WSN deployments.

3. Research Motivation

The research motivation behind this work is grounded in the imperative need for energy-efficient solutions in Wireless Sensor Networks (WSNs) due to the limited battery capacity of resource-constrained sensor nodes. This research acknowledges the heterogeneity among sensor nodes and the dynamic nature of WSN deployments, which necessitate adaptable and responsive strategies for energy optimization. The integration of the Mahalanobis Distance-Based K-Means Clustering (MDB-KMC) algorithm and the Firefly Algorithm is proposed to address existing research gaps that often lack precision and adaptability. MDB-KMC accounts for node diversity by considering both spatial distribution and statistical attributes, while the Firefly Algorithm introduces real-time adaptability inspired by nature. Together, these innovations aim to provide a comprehensive solution for energy-efficient clustering and cluster head selection in WSNs, facilitating practical and sustainable deployments across various applications.

4. Proposed Work

The proposed work encompasses the implementation and integration of two key algorithms. First, the Mahalanobis Distance-Based K-Means Clustering (MDB-KMC) algorithm will be deployed to improve cluster formation accuracy. MDB-KMC incorporates spatial distribution and statistical attributes, addressing the heterogeneity of sensor nodes for more precise cluster creation. Second, the Firefly Algorithm will be integrated for dynamic Cluster Head (CH) selection. This algorithm adapts CH roles in real time based on environmental conditions and network requirements, optimizing energy utilization.

4.1 Clustering

The proposed MDB-KMC approach strategically forms clusters based on the distance to the Cluster Head (CH) and the
energy levels of the CH. It effectively addresses the existing challenge by considering only the distance measurements between the CH and Sensor Nodes (SN), thereby optimizing Network Lifetime (NLT). MDB-KMC groups similar SNs with the CH, thereby facilitating energy conservation. To achieve this, the MDB-KMC approach initially evaluates the Mahalanobis distance while considering the covariance matrix S; the SN set

AN = {SN1*, SN2*, SN3*, SN4*, ..., SNN*}   (1)

together with the CH set CH = {C1*, C2*, C3*, C4*, ..., CN*} forms a cluster, i.e.

EKD = (CK* - SNK*)^T · S^(-1) · (CK* - SNK*)   (2)

Now, the SN that possesses the least distance value is sorted to that respective CH centered upon the evaluation and creates a cluster (CKi), i.e.

CKC_New = [CK1[SK1(CH1*)], CK2[SK2(CH2*)], CK3[SK3(CH3*)], ..., CKn[SKn(CHN*)]]   (3)

Therefore, a cluster is formed by MDB-KMC, and the suitable CH is selected in a way that brings maximum NLT, decreases the number of message exchanges, and obtains a time complexity independent of network growth.
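As a concrete illustration of the Mahalanobis-distance assignment of Eq. (2), the following is a minimal Python sketch; it is not the authors' Matlab implementation, and the node positions, field size, and choice of two candidate heads are hypothetical.

import numpy as np

def mahalanobis_assign(nodes, heads):
    """Assign each sensor node to the cluster head with the smallest
    squared Mahalanobis distance (Eq. (2)), using the covariance S of
    the whole deployment."""
    S_inv = np.linalg.inv(np.cov(nodes, rowvar=False))
    clusters = {k: [] for k in range(len(heads))}
    for i, sn in enumerate(nodes):
        dists = [(ch - sn) @ S_inv @ (ch - sn) for ch in heads]
        clusters[int(np.argmin(dists))].append(i)
    return clusters

# Hypothetical deployment: 250 nodes in a 2000 m x 2000 m field, 2 heads
rng = np.random.default_rng(0)
nodes = rng.uniform(0, 2000, size=(250, 2))
heads = nodes[rng.choice(250, size=2, replace=False)]
print({k: len(v) for k, v in mahalanobis_assign(nodes, heads).items()})

Weighting distances by the inverse covariance is what lets MDB-KMC respect the statistical spread of the node positions rather than raw Euclidean proximity alone.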
4.2 Cluster Head Selection

The Firefly Algorithm finds cluster heads in a wireless sensor network by considering parameters like residual energy, cost function, and distance to the base station. Here is a more detailed explanation of using the Firefly Algorithm for cluster head selection in wireless sensor networks.

4.3 Encode Solution

Each firefly represents a potential cluster head (CH) node. The location xi of firefly i maps to the location of sensor node i in the sensor network deployment area.

4.4 Objective Function

The light intensity Ii of each firefly encodes the desirability of selecting its corresponding sensor node as a cluster head. It is based on two metrics: the residual energy (Ei) of a node and its distance to the base station (dib).

Ii = w1 · Ei + w2 · (1/dib)   (4)

Here w1 and w2 allow weighting the relative importance of the two metrics. Maximizing Ii will maximize residual energy and minimize distance to the base station.

4.5 Attractiveness Formula

The attractiveness β of a firefly is proportional to its light intensity seen by neighboring fireflies:

β = β0 · e^(-γ·r²)   (5)

where r is the distance between two fireflies and γ controls the decrease of attractiveness with distance.

4.6 Movement

Each firefly i is attracted towards more attractive (higher intensity) fireflies j and moves towards them:

Δxi = β0 · e^(-γ·rij²) · (xj - xi) + α · εi   (6)

The second term introduces random movement, with α controlling the magnitude and εi being a random vector.

4.7 Iterate Approach

In each iteration, light intensity and movement are updated for each firefly by the above steps. Over iterations, fireflies cluster around nodes best suited as cluster heads. The movements allow the algorithm to explore the search space and identify optimal CHs balancing energy and base station distance. The clustered high-intensity fireflies represent the selected set of cluster head nodes in the wireless sensor network.
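The iterate loop of Sections 4.4-4.7 can be sketched in a few lines of Python. This is a hedged toy version, not the simulated protocol: the weights w1/w2, the constants γ and α, and the candidate set are illustrative assumptions.

import numpy as np

def firefly_ch_selection(pos, energy, bs, n_iter=100, beta0=1.0,
                         gamma=1e-4, alpha=0.1, w1=0.7, w2=0.3, seed=1):
    """Move candidate fireflies towards brighter ones; brightness is the
    light intensity of Eq. (4), attraction follows Eqs. (5)-(6)."""
    rng = np.random.default_rng(seed)
    x = pos.copy()
    for _ in range(n_iter):
        d_bs = np.linalg.norm(x - bs, axis=1) + 1e-9
        I = w1 * energy + w2 / d_bs                      # Eq. (4)
        for i in range(len(x)):
            for j in range(len(x)):
                if I[j] > I[i]:
                    r2 = np.sum((x[j] - x[i]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2)   # Eq. (5)
                    x[i] = (x[i] + beta * (x[j] - x[i])
                            + alpha * rng.standard_normal(2))  # Eq. (6)
    return x  # high-intensity fireflies cluster near good CH locations

rng = np.random.default_rng(2)
candidates = rng.uniform(0, 2000, size=(20, 2))   # 20 hypothetical candidates
residual_energy = rng.uniform(0.2, 1.0, size=20)
final_positions = firefly_ch_selection(candidates, residual_energy,
                                       bs=np.array([1000.0, 1000.0]))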
5. Results and Discussions

In this section, the proposed method is rigorously assessed through a series of numerical evaluations, contrasting it with existing methodologies. The evaluation specifically focuses on one key aspect: the CH (Cluster Head) selection technique, denoted as the Proposed FA technique.

The entire experimentation is conducted within the Matlab environment, utilizing publicly accessible data. A network field of dimensions 2000 meters by 2000 meters is established to set the stage for these evaluations. Within this spatial domain, a total of 250 sensor nodes are randomly deployed. These nodes are positioned with a uniform separation of 30 meters between adjacent nodes. It is assumed that each sensor node has a radio range extending up to 50 meters, and data packets are configured with a size of 512 bits. The simulation is carried out over 4200 seconds, during which various performance metrics are assessed. Additionally, packet sizes are varied within the range of 500 bytes. Furthermore, the nodes in the network exhibit mobility, with their speeds ranging from 0 to 20 meters per second; the mobility of the Sensor Nodes (SN) is considered with speeds ranging from 2 to 20 meters per second. This comprehensive evaluation framework allows for a robust assessment of the proposed method's effectiveness and efficiency.

The throughput values (Fig. 59.2) show that the proposed FA achieves a throughput ranging from 240 kbps to 270 kbps for 50 to 250 sensor nodes with a 50-node increment. In contrast, existing methodologies exhibit a broader throughput range of 80 kbps to 210 kbps, indicating some data loss between CHs and the base station. The proposed FA achieves throughput values of 270 kbps (50 SN), 260 kbps (100 SN), 255 kbps (150 SN), 245 kbps (200 SN), and 235 kbps (250 SN), with throughput decreasing as the number of sensor nodes increases. Overall, the proposed method consistently outperforms other existing systems in terms of both throughput and energy efficiency (Fig. 59.1 and Fig. 59.2).

Fig. 59.1 Graphical analysis of proposed FA based on energy consumption

Fig. 59.2 Graphical analysis of proposed FA based on throughput

5.1 Performance Analysis of the Proposed CH Selection Technique

This research proposes an energy optimization approach for Wireless Sensor Networks integrating Mahalanobis Distance-Based K-Means Clustering and the bio-inspired Firefly Algorithm for adaptive cluster head selection, accurately clustering nodes and iteratively selecting optimal heads based on residual energy and base station distance, modeled on the Firefly Algorithm's collective behavior. Extensive simulations demonstrate the technique significantly improves efficiency and throughput over existing methods, optimizing wireless sensor network energy utilization through adaptive clustering and selection for robust real-world deployment.

6. Conclusions

Our self-configured protocol for wireless sensor networks utilizes the proposed firefly-based cluster head selection, minimizing energy consumption when integrated with MDB-KMC clustering. Experimental validation demonstrates remarkable performance, including extremely low 1000 J energy consumption over 2000 rounds and high 225 kbps throughput with 250 nodes. The approach also exhibits an excellent 710+ round network lifetime with 250 nodes, establishing superior energy efficiency versus existing
methods through comprehensive analysis. This research established the effectiveness of integrating MDB-KMC with the bio-inspired Firefly Algorithm for energy-efficient cluster head selection in WSNs, with the self-organizing and adaptive capabilities of the firefly approach enhancing optimization performance; future work can extend this integration further.

References

1. Abdulmughni Hamzah, Mohammad Shurman, Omar Al-Jarrah, and Eyad Taqieddin. Energy-efficient fuzzy-logic-based clustering technique for hierarchical routing protocols in wireless sensor networks. Sensors, 19(3):561, 2019.
2. Guangjie Han, Chenyu Zhang, Jinfang Jiang, Xuan Yang, and Mohsen Guizani. Mobile anchor nodes path planning algorithms using network-density-based clustering in wireless sensor networks. Journal of Network and Computer Applications, 85:64-75, 2017.
3. X. Liu, R. Zhu, A. Anjum, J. Wang, H. Zhang, and M. Ma. Intelligent data fusion algorithm based on hybrid delay-aware adaptive clustering in wireless sensor networks. Future Generation Computer Systems, 104:1-14, 2020.
4. V. Thalagondapati and M. P. Singh. A self-organized priority-based MAC protocol in wireless sensor networks based on SDR-RHSO optimal relay node selection and HL-ANN wake-up scheduling. Journal of Ambient Intelligence and Humanized Computing, 14(8):11093-11102, 2023.
5. Jin Wang, Yu Gao, Wei Liu, Wenbing Wu, and Se-Jung Lim. An asynchronous clustering and mobile data gathering schema based on timer mechanism in wireless sensor networks. Computers, Materials & Continua, 58(3), 2019.
6. Quan Wang, Deyu Lin, Pengfei Yang, and Zhiqiang Zhang. An energy-efficient compressive sensing-based clustering routing protocol for WSNs. IEEE Sensors Journal, 19(10):3950-3960, 2019.
7. Chuan Xu, Zhengying Xiong, Guofeng Zhao, and Shui Yu. An energy-efficient region source routing protocol for lifetime maximization in WSN. IEEE Access, 7:135277-135289, 2019.

Note: All the figures and tables in this chapter were designed by the author.

60

Software Requirements Based Software Effort Estimation using RSLU-GNL-GRU in Software Project Management

K. Harish Kumar1
Research Scholar, Department of Computer Science & Engineering, Koneru Lakshmaiah Education Foundation, Hyderabad, Telangana, India, and Assistant Professor, Department of Computer Science & Informatics, Mahatma Gandhi University, Nalgonda, Telangana, India

K. Srinivas2
Professor, Department of Computer Science & Engineering, Koneru Lakshmaiah Education Foundation, Hyderabad, Telangana, India

Abstract: This paper presents a Software Effort Estimation (SEE) framework for project management, addressing the
increasing demand for high-quality software. It involves data gathering, preprocessing, BERT-based word embedding,
clustering with PLSKCD-K-Means, and task ranking using ZS-GTBOA. Feature extraction, dimensionality reduction, and SEE
implementation with the RSLU-GNL-GRU classifier follow. The experimental evaluation highlights the proposed technique’s
superior performance over existing models.
Keywords: BERT, PLSKCD-K-Means, ZS-GTBOA, FS, and RSLU-GNL-GRU form the key components in this approach

1. Introduction

The increasing demand for software projects requires effective Software Effort Estimation (SEE) for successful project management. SEE traditionally involves techniques like expert judgment, user stories, analogy-based estimations, and the use case point framework. This paper proposes a DL-based SEE model using RSLU-GNL-GRU, overcoming limitations of traditional ML techniques and enhancing accuracy and efficiency in software project estimation.

1.1 Problem Definition

This paper addresses drawbacks in prevailing SEE models, emphasizing issues like neglect of project requirements, reliance on expert opinions for quantitative assessment, and inaccuracies in task estimation. The proposed RSLU-GNL-GRU-based SEE framework provides solutions:
• Integration of software requirement and project details data.
• Introduction of the PLSKCD-K-Means algorithm for grouping tasks.
• Development of a novel RSLU-GNL-GRU for SEE.

The paper's structure includes a review of prior works (Section 2), discussion of the proposed model (Section 3), performance analysis (Section 4), and a concluding section (Section 5).

2. Literature Survey

Various approaches for Software Effort Estimation (SEE) have been explored. (Rankovic et al., 2021): Introduced a DANN-based SEE using Taguchi's orthogonal arrays, minimizing

1khrsharma@gmail.com, 2srirecw9@klh.edu.in

DOI: 10.1201/9781003529231-60
Magnitude of Relative Error but lacking focus on software experimentation needs. (Khan et al., 2021): Developed a DNN for SEE with metaheuristic algorithms, achieving superior outcomes but facing elevated convergence time in some learning rate scenarios. (De Carvalho et al., 2021): Proposed ELM for SEE, demonstrating better outcomes than prevailing methods but encountering forecasting difficulties with limited data. (Ali et al., 2023): Designed a heterogeneous Ensemble Effort Estimation method, integrating models for improved estimation, dependent on model weight assignments. (Nhung et al., 2022): Presented a parametric SEE methodology using Optimizing Correction Factors and Multiple Regression Models, outperforming other models but facing challenges in estimation accuracy due to differing historical data distributions. (Rhmann et al., 2022): Explored hybrid search techniques and weighted ensemble with metaheuristic algorithms for SEE, surpassing ML-based algorithms but lacking in determining economic benefits for a software organization. (Van Hai et al., 2022): Developed EEAC, a software development effort estimation model, incorporating data clustering and FPA methods. While exhibiting better performance, FPA was noted to be time-consuming due to numerous elements.

3. Proposed Software Effort Estimation Framework

This paper introduces an SEE framework for software project management, centering on RSLU-GNL-GRU. It groups software requirement and project data by task size, ranks them with ZS-GTBOA, and estimates effort using RSLU-GNL-GRU, as shown in the system's structural design below (Fig. 60.1).

Fig. 60.1 Proposed methodology's block diagram

3.1 Preprocessing of Project Data

The proposed model starts by gathering software project data.

P = {p1, p2, p3, ..., pm}   (1)
Where, the data regarding the mth project is denoted as pm. Pre-processing is executed using the following '3' steps:
• Missing value imputation: Missing values are imputed with the average of adjacent values.
• Repeated data removal: Duplicates in the data are eliminated.
• Numeralization: String values are converted to numerical format for processing.

The pre-processed project data is represented as Ppre.

3.2 Preprocessing of Requirement Data

Software requirement data is collected and pre-processed, denoted as R.

R = {r1, r2, r3, ..., rm}   (2)

Where, the mth project's software requirements are notated as rm. This data undergoes preprocessing in '4' steps:
• Tokenization: Requirement texts are divided into tokens.
• Stop-word removal: Unnecessary words are eliminated.
• Stemming: Removal of prefixes and suffixes.
• Weightage calculation: Using TF-IDF, determining the frequency and weight values of the data. The TF-IDF score (G) is estimated as,

G = fu,v × log(R / tu)   (3)

Where, the number of occurrences of the term u in data v is notated as fu,v and tu denotes the number of data containing u. Later, based on high weightage, the frequently occurring words are removed. Thus, the preprocessed data is notated as Rpre.
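A compact Python sketch of these two preprocessing pipelines is given below (pandas/scikit-learn); the project columns and requirement sentences are hypothetical, and scikit-learn's TF-IDF uses a smoothed variant of the log(R/tu) weighting in Eq. (3).

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Project data: the '3' steps of Section 3.1 (hypothetical columns).
proj = pd.DataFrame({"kloc": [10.0, None, 12.5, 12.5],
                     "team": ["A", "B", "C", "C"]})
proj["kloc"] = proj["kloc"].interpolate()                  # impute with neighbour average
proj = proj.drop_duplicates()                              # repeated-data removal
proj["team"] = proj["team"].astype("category").cat.codes   # numeralization

# Requirement data: tokenization, stop-word removal and TF-IDF weighting.
reqs = ["The system shall log every user login",
        "The system shall export reports as PDF"]
vec = TfidfVectorizer(stop_words="english")
G = vec.fit_transform(reqs)                                # term-weight matrix
print(dict(zip(vec.get_feature_names_out(), G.toarray()[0].round(2))))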
3.3 Word Embedding

Here, using the BERT algorithm, the data Rpre is embedded. Primarily, for each word in Rpre, the tokens (¡) are determined and fed to the embedding layer.
• Embedding layer: This layer executes token embedding, segment embedding, and position embedding on ¡, and the result is provided to the transformer encoder layer.
• Transformer encoding layer: The embeddings are transformed into numerical vectors. The string values are encoded by the encoder, and the contextual embedding (V) is provided as,

V = {c1, c2, ..., cq}   (4)

Where, the qth string's contextual embedding is denoted as cq.
• Output layers: The encoded output is directed to the output layer with a simple classifier, comprising a fully connected layer; an activation function (ℓ) is computed as,

ℓ = 1 / (k - Σ_{i=1..q} Vi)   (5)

Where, the target word embedding score is denoted as k, and V is the embedded output.
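For illustration, the BERT embedding step can be reproduced with the Hugging Face transformers library as below; the checkpoint name and example requirement are assumptions, and the encoder's final hidden states are taken as the contextual embeddings V of Eq. (4).

import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")   # hypothetical checkpoint
model = AutoModel.from_pretrained("bert-base-uncased")

req = "The system shall log every user login"
inputs = tok(req, return_tensors="pt")        # token, segment and position ids
with torch.no_grad():
    out = model(**inputs)
V = out.last_hidden_state.squeeze(0)          # one contextual vector per token
print(V.shape)                                # e.g. torch.Size([9, 768])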
3.4 Data Merging

After processing, the project and requirement data are merged (¿) as,

¿ = {Ppre + V}   (6)

3.5 Clustering

The merged data is clustered using the PLSKCD-K-Means framework, addressing efficiency issues in unsupervised grouping by incorporating Kendall correlation distance and partial least square-centered correlation for averaging data points.

Firstly, the number of clusters (d) is defined and the average data points (l) are chosen by PLS correlation as,

l = (n·Σab - (Σa)(Σb)) / (n·Σa² - (Σa)²)   (7)

Where, the consecutive data points belonging to ¿ are notated as a and b, and n is the total number of data points. After that, centered on the distance betwixt the centroids and the data points, the data having similar duration are assigned to a cluster centroid. The distance (dist) is computed as,

dist(l, A ∈ ¿) = Σ_{i,j ∈ Q} Hi,j(l, A ∈ ¿)   (8)

Where, the Kendall constant is signified as H, and the set of unordered pairs of the data points is notated as Q. Until there are no more changes, the above procedure is repeated. The final clustered output F is articulated as,

F = {Tlarge, Tsmall}   (9)

Where, the large tasks and small tasks are denoted as Tlarge and Tsmall respectively.
3.6 Ranking

Utilizing ZS-GTBOA, Tlarge is ranked based on project size for efficient and accurate estimation. GTBOA's parameter initialization, initially using normalization, is modified due to limitations in handling outlier positions during beetle initialization; Z-score normalization is employed as an alternative. Tlarge is considered as the beetle population in ZS-GTBOA. The problem variables are specified as,

Tlarge = [ T1,1  T1,2  ...  T1,N
           T2,1  T2,2  ...  T2,N
           ...   ...   ...  ...
           TB,1  TB,2  ...  TB,N ]   (10)

Wherein, the total number of tasks is notated as B and the number of variables is signified as N. Here, the task's maximum size (max(rsize)) is considered as fitness (F),

F(Tlarge) = max(rsize)   (11)

To initialize the parameters, Z-score normalization is defined as,

Ti,j = (c - mean(c)) / σ   (12)

Where, the initial value of the jth variable of the ith beetle is notated as Ti,j, mean(c) denotes the parameters' mean value, and the standard deviation is notated as σ. By the following expression, the solutions for the mature beetles are obtained,

TiG = TiG + √switch · (Trand1^G - Tbest^G)   (13)

Where, TiG is the female beetle's position in generation G that goes towards the golden male beetle Trand1^G, the color-changing operator is notated as √switch, rand1 is a random integer in [1, B], and the best fitness solution at G is proffered as Tbest^G. The color-switching operator is defined by

√switch = (Randn · cos θ) + (K, v)   (14)

Where, a normal random function in [1, n] is notated as Randn, the normal angle is signified as θ, a constant value is modeled as K, and the wavelength is illustrated as v. For generating the survived beetles, a crossover operator is considered, which is articulated as,

T1 = φ · Trand1 + (1 - φ) · (Trand2 - γ1)   (15)

T2 = φ · Trand2 + (1 - φ) · (Trand1 - γ2)   (16)

Where, a random number is notated as φ, and Trand1 and Trand2 are '2' randomly chosen solutions. The terms γ1 and γ2 are defined by,

γ1 = (1 - Q) · (Tbest - Trand1)   (17)

γ2 = (1 - Q) · (Tbest - Trand2)   (18)

Where, the best solution is signified as Tbest and the crossover operator is notated as Q. Therefore, using the ZS-GTBOA, the large tasks are ranked (Tranked) centered on size. The ZS-GTBOA's procedure is explained in Algorithm 1.
Algorithm 1: ZS-GTBOA Technique

Input: Large tasks (Tlarge)
Output: Ranked tasks (Tranked)
Begin
  Initialize population, problem variables.
  Initialize the parameters: Ti,j = (c - mean(c)) / σ
  For i = 1 to n do
    Calculate F(Tlarge) = max(rsize)
    Compute the number of mature beetles
    Store the solution in the mature population
    Compute the two solutions:
      T1 = φ · Trand1 + (1 - φ) · (Trand2 - γ1)
      T2 = φ · Trand2 + (1 - φ) · (Trand1 - γ2)
    Store the solution in the survival population
  End for
  Select the best one.
  Return (Tranked)
End
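Algorithm 1 is pseudocode; a hedged, runnable Python rendering of its main loop is sketched below. The survivor-acceptance rule and the population size are assumptions made to keep the toy self-contained.

import numpy as np

def zs_gtboa_rank(tasks, n_iter=50, phi=0.5, Q=0.3, seed=0):
    """Toy ZS-GTBOA loop: z-score initialization (Eq. (12)), size-based
    fitness (Eq. (11)) and crossover survivors (Eqs. (15)-(18))."""
    rng = np.random.default_rng(seed)
    pop = (tasks - tasks.mean(axis=0)) / tasks.std(axis=0)  # Eq. (12)
    fitness = pop.max(axis=1)                               # Eq. (11)
    for _ in range(n_iter):
        best = pop[np.argmax(fitness)]
        r1, r2 = rng.integers(0, len(pop), size=2)
        g1 = (1 - Q) * (best - pop[r1])                     # Eq. (17)
        g2 = (1 - Q) * (best - pop[r2])                     # Eq. (18)
        t1 = phi * pop[r1] + (1 - phi) * (pop[r2] - g1)     # Eq. (15)
        t2 = phi * pop[r2] + (1 - phi) * (pop[r1] - g2)     # Eq. (16)
        for t in (t1, t2):                                  # keep improving survivors
            worst = int(np.argmin(fitness))
            if t.max() > fitness[worst]:
                pop[worst], fitness[worst] = t, t.max()
    return np.argsort(-fitness)                             # indices ranked by size

ranked = zs_gtboa_rank(np.random.default_rng(4).random((30, 5)))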
3.7 Feature Extraction

After that, the features of Tranked and Tsmall are extracted. Features like rely (Y1), data (Y2), cplx (Y3), time (Y4), stor (Y5), virt (Y6), turn (Y7), acap (Y8), aexp (Y9), pcap (Y10), vexp (Y11), lexp (Y12), modp (Y13), tool (Y14), sced (Y15), and loc (Y16) are extracted. The extracted feature set (Ys) is described as,

Ys = {Y1, Y2, Y3, ..., Y16}   (19)

3.8 Dimensionality Reduction

The FS technique is employed for dimensionality reduction, significantly reducing time-space complexity and enhancing output accuracy. Specifically, the input data matrix y ∈ Ys^(d×n) is diminished to yred ∈ Ys^(k×n). The FS process is described as,

FS = {u · M1 · (M2 + a·I)^(-1)}   (20)

Wherein, u signifies the total number of instances, a regularization parameter is notated as a, I denotes the perturbation term, and the between-class scatter and total scatter matrices are notated as M1 and M2.
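Because the extracted form of Eq. (20) is terse, the sketch below shows one conventional reading of a scatter-matrix-based reduction: project the features onto the leading directions of M1 relative to the regularized M2. It is a loose illustration under stated assumptions, not the authors' exact FS procedure.

import numpy as np

def scatter_projection(Y, labels, k=4, a=1e-3):
    """Keep k directions that maximize between-class scatter M1 against
    regularized total scatter M2 (cf. Eq. (20))."""
    mu = Y.mean(axis=0)
    M2 = (Y - mu).T @ (Y - mu)                    # total scatter
    M1 = np.zeros_like(M2)                        # between-class scatter
    for c in np.unique(labels):
        Yc = Y[labels == c]
        d = (Yc.mean(axis=0) - mu)[:, None]
        M1 += len(Yc) * (d @ d.T)
    A = np.linalg.inv(M2 + a * np.eye(M2.shape[0])) @ M1
    vals, vecs = np.linalg.eig(A)
    order = np.argsort(-vals.real)[:k]
    return Y @ vecs[:, order].real                # reduced matrix y_red

Y = np.random.default_rng(5).random((60, 16))     # 16 COCOMO-style features
y_red = scatter_projection(Y, labels=np.random.default_rng(6).integers(0, 2, 60))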
3.9 Effort Estimation

Finally, yred is input to the RSLU-GNL-GRU classifier for SEE. GRU's gating mechanisms enhance learning speed, but its information preservation can be limited. To address this, RELU and SELU are combined as activation functions, and a Group Normalization Layer (GNL) after the input layer is added for enhanced learning efficiency (see Fig. 60.2).

The input is provided to the GNL, which executes the following operation,

gnorm = (yred - mean(yred)) / √(σ² + b)   (21)

Where, the output of the GNL is notated as gnorm and b prevents the chance of a divide-by-zero error. The '2' primary gates are as follows:
• Reset Gate: To compute the reset gate (Ye), a linear sum betwixt the newly computed state and the existing state with the bias parameter is employed. It is articulated as,

Ye = ћ(wY · gnorm + wY · je-1 + bY)   (22)

Where, the previous memory gate information is notated as je-1, w and b denote the weight and bias values, and the RSLU activation function is proffered as ћ,

ћ(Ye) = Ye + max(0, Ye)               if Ye > 0
ћ(Ye) = (W·e^Ye - W) + max(0, Ye)     if Ye ≤ 0   (23)

Where, the activation constants are denoted as W and e.
• Update Gate: The update gate determines how much of the earlier information from the previous time (e - 1) steps needs to be kept. The update gate Ue is computed as,

Ue = ћ(wU · gnorm + wU · je-1 + bU)   (24)

ћ(Ue) = Ue + max(0, Ue)               if Ue > 0
ћ(Ue) = (W·e^Ue - W) + max(0, Ue)     if Ue ≤ 0   (25)

The current memory content (j'e) requires Ye to pass the relevant information, whereas the final memory unit (je) holds the information. The memory states are computed as,

je = (1 - Ue) · j'e + Ue · je-1   (26)

j'e = tanh(wj · gnorm + wj,j' · (Ye * je-1) + bj)   (27)

Efforts for software project management are estimated using the hyperbolic tangent function in RSLU-GNL-GRU. The model's efficacy is evaluated in the next section.

Fig. 60.2 RSLU-GNL-GRU
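To make the reconstructed Eqs. (21), (23) and (25) concrete, here is a small NumPy sketch of the group-normalization step and the combined RELU/SELU ("RSLU") activation; the values of the constants W and e are assumptions, since the chapter does not state them.

import numpy as np

W, E = 1.05, np.e     # assumed activation constants (the chapter's W and e)

def rslu(x):
    """ReLU-like branch for x > 0, SELU-like exponential branch otherwise,
    following the piecewise form of Eqs. (23)/(25)."""
    return np.where(x > 0,
                    x + np.maximum(0.0, x),
                    W * (np.power(E, x) - 1.0) + np.maximum(0.0, x))

def group_norm(y, eps=1e-5):
    """Normalization step of Eq. (21) over a feature vector."""
    return (y - y.mean()) / np.sqrt(y.var() + eps)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(rslu(group_norm(x)))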


4. Results and Discussion

This section assesses the superiority of the proposed methodology implemented in Python.

4.1 Database Description

The proposed framework utilizes the COCOMO'81 dataset, containing information on development effort, time, and software development details.

4.2 Performance Analysis of the Proposed RSLU-GNL-GRU

The proposed methodology's performance is compared with GRU, CNN, LSTM, and RNN. Figure 60.3 demonstrates the proposed approach's superior performance, achieving a 98% estimation rate compared to lower rates of 95% (GRU), 92% (LSTM), 89% (RNN), and 88% (CNN). The use of RSLU in the proposed methodology enhances learning efficiency, resulting in more effective estimation output.

Fig. 60.3 Performance analysis of the proposed RSLU-GNL-GRU

In Fig. 60.4, the proposed RSLU-GNL-GRU shows lower training time (38007 ms) than conventional GRU (41008 ms). RSLU-GNL-GRU's computation time (12454 ms) is notably lower than prevailing RNN (20013 ms), showcasing improved learning stability and reduced training time with GNL in GRU. In Fig. 60.5, the proposed software effort estimation approach demonstrates lower error values: 0.0145% (MSE), 0.1207% (RMSE), 0.0097% (MAE), and 0.1353% (MFE), compared to conventional LSTM with values of 0.7502%, 0.8661%, 0.0097%, and 0.25%. Overall, the proposed approach proves more significant for SEE.

Fig. 60.4 Graphical representation of the proposed RSLU-GNL-GRU: (a) training time, (b) computational time

Fig. 60.5 Performance measure

In Table 60.1, the proposed RSLU-GNL-GRU achieves low MAPE (0.1256%) and SMAPE (0.0357%) values compared to higher values obtained by prevailing approaches. This highlights the proposed approach's efficiency in handling uncertain circumstances and achieving superior outcomes.

Table 60.1 Comparative analysis of proposed RSLU-GNL-GRU

Techniques | MAPE (%) | SMAPE (%)
Proposed RSLU-GNL-GRU | 0.1256 | 0.0357
GRU | 0.5211 | 0.9534
LSTM | 1.0686 | 1.5545
RNN | 1.1209 | 1.9053
CNN | 1.2011 | 2.1247
Fig. 60.6 Loss value of the proposed method during (a) training and (b) testing

Figure 60.6 illustrates the proposed model's efficiency in terms of loss. The loss values consistently decrease with increasing epochs, reaching 0.35 at 500 epochs, similar to the training stage. This indicates the superior performance of the proposed approach compared to other techniques.

Table 60.2 reveals a clustering accuracy of 98.9105% for the proposed method, significantly surpassing the 95.5701% accuracy achieved by prevailing K-means. The proposed technique demonstrates notable performance in cluster formation compared to other existing algorithms.

Table 60.2 Comparative analysis of proposed PLSKCD-K-means

Techniques | Clustering accuracy (%)
Proposed PLSKCD-K-means | 98.9105
K-means | 95.5701
Birch | 91.3949
K-medoid | 89.7272
Clarans | 88.5644

4.3 Performance Measurement of Clustering

The proposed clustering algorithm's performance is analogized with the prevailing K-means, Birch, K-Medoid, and Clarans. Figure 60.7 shows the proposed algorithm forming efficient clusters in 31539 ms, outperforming prevailing methods like K-means, Birch, K-Medoid, and Clarans, which require 35184 ms, 38508 ms, 40453 ms, and 42706 ms, respectively. The use of PLS-KCD-based distance measurement contributes to effective clustering.

Fig. 60.7 Performance comparison

Fig. 60.8 Performance measure of the proposed ZS-GTBOA

In Fig. 60.8, the proposed ZS-GTBOA achieves 86.3289 optimal solutions in 10 iterations, outperforming prevailing GTBOA with 85.4875 optimal solutions at the same iterations. The fitness value improves with increasing
iterations, indicating that ZS in the proposed model leads to better fitness values and optimal solutions.

4.4 Performance Evaluation of the Proposed ZS-GTBOA

The proposed optimization algorithm's performance is compared with GTBOA, FOA, SFOA, and PSOA. The proposed approach exhibits superior performance and promising outcomes compared to other existing methods with higher processing times. Figure 60.9 shows that the proposed SEE achieves 0.2450% MMRE, outperforming ANN with 0.439% MMRE; other methods exhibit even higher MMRE. The proposed methodology enhances SEE efficacy through the utilization of SEE requirement details.

Fig. 60.9 Comparative measure of the proposed framework

4.5 Comparative Measurement with Literature Papers

The proposed framework is compared to conventional DANN, DNN, and SVR (Rankovic et al., 2021; Khan et al., 2021; Rhmann et al., 2022) to establish its superiority.

5. Conclusion

This paper proposes an SEE technique using RSLU-GNL-GRU, incorporating software requirements. The process involves pre-processing, word embedding, data combination, clustering, ranking, and estimation, achieving a 98% estimation rate with a training time of 38007 ms. Efficient clusters form in 31539 ms with a 98.91% accuracy rate. The proposed framework outperforms existing methods but lacks project risk assessment, a topic for future work.

References

1. Ali, S. S., Ren, J., Zhang, K., Wu, J., & Liu, C. (2023). Heterogeneous Ensemble Model to Optimize Software Effort Estimation Accuracy. IEEE Access, 11, 27759-27792. https://doi.org/10.1109/access.2023.3256533
2. De Carvalho, H. D. P., Fagundes, R., & Santos, W. (2021). Extreme Learning Machine Applied to Software Development Effort Estimation. IEEE Access, 9, 92676-92687. https://doi.org/10.1109/ACCESS.2021.3091313
3. Kaur, A., & Kaur, K. (2022). Systematic literature review of mobile application development and testing effort estimation. Journal of King Saud University - Computer and Information Sciences, 34(2), 1-15. https://doi.org/10.1016/j.jksuci.2018.11.002
4. Khan, M. S., Jabeen, F., Ghouzali, S., Rehman, Z., Naz, S., & Abdul, W. (2021). Metaheuristic Algorithms in Optimizing Deep Neural Network Model for Software Effort Estimation. IEEE Access, 9, 60309-60327. https://doi.org/10.1109/ACCESS.2021.3072380
5. Kumar, P. S., Behera, H. S., Anisha Kumari, K., Nayak, J., & Naik, B. (2020). Advancement from neural networks to deep learning in software effort estimation: Perspective of two decades. Computer Science Review, 38, 1-32. https://doi.org/10.1016/j.cosrev.2020.100288
6. Mahmood, Y., Kama, N., Azmi, A., Khan, A. S., & Ali, M. (2022). Software effort estimation accuracy prediction of machine learning techniques: A systematic performance evaluation. Software - Practice and Experience, 52(1), 39-65. https://doi.org/10.1002/spe.3009
7. Nhung, H. L. T. K., Van Hai, V., Silhavy, R., Prokopova, Z., & Silhavy, P. (2022). Parametric Software Effort Estimation Based on Optimizing Correction Factors and Multiple Linear Regression. IEEE Access, 10, 2963-2986. https://doi.org/10.1109/ACCESS.2021.3139183
8. Pandey, M., Litoriya, R., & Pandey, P. (2020). Validation of Existing Software Effort Estimation Techniques in Context with Mobile Software Applications. Wireless Personal Communications, 110, 1659-1677. https://doi.org/10.1007/s11277-019-06805-0
9. Priya Varshini, A. G., Anitha Kumari, K., Janani, D., & Soundariya, S. (2021). Comparative analysis of Machine learning and Deep learning algorithms for Software Effort Estimation. Journal of Physics: Conference Series, 1767(1), 1-11. https://doi.org/10.1088/1742-6596/1767/1/012019
10. Rankovic, N., Rankovic, D., Ivanovic, M., & Lazic, L. (2021). A New Approach to Software Effort Estimation Using Different Artificial Neural Network Architectures and Taguchi Orthogonal Arrays. IEEE Access, 9, 26926-26936. https://doi.org/10.1109/ACCESS.2021.3057807
11. Rhmann, W., Pandey, B., & Ansari, G. A. (2022). Software effort estimation using ensemble of hybrid search-based algorithms based on metaheuristic algorithms. Innovations in Systems and Software Engineering, 18(2), 309-319. https://doi.org/10.1007/s11334-020-00377-0
12. Sudarmaningtyas, P., & Mohamed, R. (2021). A review article on software effort estimation in agile methodology. Pertanika Journal of Science and Technology, 29(2), 837-861. https://doi.org/10.47836/pjst.29.2.08
13. Tawosi, V., Sarro, F., Petrozziello, A., & Harman, M. (2022). Multi-Objective Software Effort Estimation: A Replication Study. IEEE Transactions on Software Engineering, 48(8), 3185-3205. https://doi.org/10.1109/TSE.2021.3083360
14. Van Hai, V., Nhung, H. L. T. K., Prokopova, Z., Silhavy, R., & Silhavy, P. (2022). Toward Improving the Efficiency of Software Development Effort Estimation via Clustering Analysis. IEEE Access, 10, 83249-83264. https://doi.org/10.1109/ACCESS.2022.3185393
15. Varshini, A. G. P., & Kumari, K. A. (2020). Predictive analytics approaches for software effort estimation: A review. Indian Journal of Science and Technology, 13(21), 2094-2103.
16. Villalobos-Arias, L., Quesada-López, C., Guevara-Coto, J., Martínez, A., & Jenkins, M. (2020). Evaluating hyper-parameter tuning using random search in support vector machines for software effort estimation. PROMISE 2020, Co-Located with ESEC/FSE 2020, 31-40. https://doi.org/10.1145/3416508.3417121
17. Xia, T., Shu, R., Shen, X., & Menzies, T. (2022). Sequential Model Optimization for Software Effort Estimation. IEEE Transactions on Software Engineering, 48(6), 1994-2009. https://doi.org/10.1109/TSE.2020.3047072

Note: All the figures and tables in this chapter were designed by the author.
61

The Evolution and Impact of Large Language Models in Artificial Intelligence

Chaitanya. K1
Carelon Global Solutions, Hyderabad, India

Krishna Jayanth Rolla2
Fort Mill, SC 29715

Abstract: This research paper explores the historical evolution of artificial intelligence (AI) and the transformative emergence
of large language models (LLMs). The historical context delves into the inception of AI at the Dartmouth Conference in 1956,
tracing the field’s journey through periods of optimism, such as the development of expert systems, and skepticism, leading to
AI winters. The resurgence of AI in the 21st century is closely linked to breakthroughs in machine learning, particularly deep
learning, setting the stage for advancements in LLMs. The significance of LLMs is a focal point, showcasing their diverse
applications in natural language processing (NLP) and their role in reshaping human-computer interaction. Models like GPT-3,
with its unprecedented 175 billion parameters, exemplify the prowess of LLMs in tasks ranging from healthcare applications,
such as medical literature review, to business applications, where chatbots enhance customer service interactions. The pre-
training and fine-tuning methodology, rooted in deep learning principles, underscores the adaptability of LLMs across varied
NLP domains. Furthermore, the paper examines how LLMs represent a broader advancement in the field of machine learning
and deep learning. The scale of these models enables them to capture intricate patterns and dependencies in data, influencing the
approach to transfer learning. Large language models, trained on extensive datasets, exhibit generalized learning capabilities,
sparking ongoing exploration into more efficient training methodologies and architectures. The continuous quest for enhanced
model interpretability, efficiency, and generalization capabilities forms a key aspect of the paper’s exploration of the evolving
landscape of AI and LLMs.
Keywords: Artificial intelligence (AI), Large language models (LLMs), Machine learning, Natural language processing (NLP),
GPT-3, Healthcare applications, Transfer learning, Generalized learning capabilities

1. Introduction

1.1 Background

1. Brief history of artificial intelligence

The history of artificial intelligence (AI) is marked by significant milestones that have shaped its trajectory. The concept of AI dates back to ancient times, with myths and stories featuring artificial beings. However, the formal exploration of AI as a scientific discipline began in the mid-20th century. In 1956, the Dartmouth Conference marked a pivotal moment, where researchers like John McCarthy and Marvin Minsky discussed the potential of creating intelligent machines. Early AI systems, based on rule-based logic and symbolic reasoning, showed promise but faced limitations due to the complexity of real-world problems. The field experienced periods of optimism, such as the development of expert systems in the 1980s, and skepticism, leading to AI winters. The resurgence of AI in the 21st century is closely tied to advancements in machine learning, particularly deep learning, which has fueled breakthroughs in large language models and transformative applications across various domains [1].

1ckanchibhotla@gmail.com, 2J.rolla2@gmail.com

DOI: 10.1201/9781003529231-61
2. Emergence and development of large language models

The emergence and development of large language models (LLMs) represent a recent paradigm shift in AI. Early language models struggled with the complexity of natural language understanding and generation. However, the introduction of transformer architectures, especially exemplified by models like OpenAI's GPT series, has revolutionized language processing capabilities. GPT-3, with its staggering 175 billion parameters, demonstrated unprecedented language generation prowess. LLMs are built on the principles of deep learning, utilizing neural networks with attention mechanisms to process and generate human-like text. The pre-training approach, where models are first trained on massive amounts of diverse data and then fine-tuned for specific tasks, has become a cornerstone of LLM development. This methodology enables models to learn intricate language patterns and contextual dependencies, contributing to their remarkable performance across a myriad of natural language processing applications [2].
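To make the pre-training/fine-tuning recipe concrete, the following is a minimal, hypothetical sketch using the Hugging Face transformers API; the checkpoint, two-example dataset, and labels are purely illustrative, not material from this paper.

import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Start from a pre-trained checkpoint, then fine-tune on a labelled task.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

texts, labels = ["great product", "terrible service"], [1, 0]
enc = tok(texts, truncation=True, padding=True, return_tensors="pt")

class TinyDataset(torch.utils.data.Dataset):
    def __len__(self):
        return len(labels)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in enc.items()}
        item["labels"] = torch.tensor(labels[i])
        return item

trainer = Trainer(model=model,
                  args=TrainingArguments(output_dir="out", num_train_epochs=1),
                  train_dataset=TinyDataset())
trainer.train()   # task-specific fine-tuning on top of the pre-trained weights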
1.2 Significance of Large Language Models (LLMs)

1. Applications in natural language processing

Large Language Models have become pivotal in natural language processing (NLP), revolutionizing how machines understand and generate human-like text. GPT-3, in particular, has showcased its versatility across diverse applications. In healthcare, LLMs aid in medical literature review, extracting valuable insights from vast amounts of text data. The business sector benefits from chatbots powered by LLMs, enhancing customer service interactions through context-aware responses. LLMs also find applications in content creation, automated code generation, and sentiment analysis. Their ability to understand context and generate coherent text has far-reaching implications, contributing to advancements in virtual assistants, content generation, and language translation [3].

2. Advancements in machine learning and deep learning

The development of large language models represents a significant advancement in the broader landscape of machine learning and deep learning. The scale of these models, with billions of parameters, allows them to capture intricate patterns and dependencies in data. The pre-training and fine-tuning approach has not only led to breakthroughs in natural language processing but has also influenced the field's approach to transfer learning. Large language models trained on extensive datasets demonstrate a capacity for generalized learning, where knowledge acquired in one domain can be applied to related tasks. This has sparked exploration into more efficient training methodologies and architectures, with an ongoing quest to enhance model interpretability, efficiency, and generalization capabilities [4].

2. Literature Review

2.1 Overview of Existing Large Language Models

The landscape of large language models (LLMs) has evolved significantly, with GPT-3 standing out as a pinnacle of achievement. GPT-3, developed by OpenAI, represents a breakthrough in scale and complexity, boasting an impressive 175 billion parameters. This enormous scale allows GPT-3 to capture intricate patterns and dependencies in data, making it a powerful tool for a wide range of natural language processing tasks [3]. Its architecture is built upon the transformer model, employing attention mechanisms that enable the model to understand context and relationships within vast amounts of text data. In comparison, models like BERT and XLNet take alternative approaches to achieve similar goals. BERT's bidirectional training allows it to understand context more effectively, while XLNet introduces permutation-based language modeling, enhancing its ability to capture long-range dependencies [4,5].

The literature underscores the dynamic nature of research in this field, with each model presenting unique advantages and trade-offs. Researchers continually explore how these models can be optimized for specific tasks and domains. The evolution of LLMs highlights the ongoing quest for more efficient and effective natural language understanding and generation systems. As the field progresses, researchers grapple with challenges such as model interpretability, fine-tuning strategies, and addressing biases inherent in training data [6].

2.2 Applications of Large Language Models

The applications of large language models (LLMs) span a broad spectrum, revolutionizing natural language processing and expanding the possibilities of human-computer interaction. GPT-3, with its exceptional language generation capabilities, has found applications in creative writing, content generation, and even code completion through projects like OpenAI's Codex [7]. BERT, on the other hand, has excelled in tasks that require a deep understanding of context, such as question-answering and sentiment analysis [4]. These models have become indispensable in various industries, contributing to advancements in healthcare, education, and business.

In healthcare, LLMs like GPT-3 have been employed for medical diagnosis and literature review, showcasing their potential to assist medical professionals in processing vast amounts of information [8]. In education, intelligent tutoring systems powered by LLMs offer personalized learning experiences, adapting to the unique needs of individual students [8]. Businesses leverage LLMs for chatbots, enhancing customer service by providing quick and contextually relevant responses [11]. Despite these successes, ethical concerns loom large. Biases present in training data can lead to unintended consequences, and the responsible deployment of these powerful models remains a pressing issue in their widespread adoption [12].
ethical concerns loom large. Biases present in training data


can lead to unintended consequences, and the responsible
deployment of these powerful models remains a pressing
issue in their widespread adoption [12].

2.3 Critiques and Ethical Concerns


As large language models (LLMs) gain prominence, critiques
and ethical concerns have become central to discussions
surrounding their development and deployment. One major
challenge is the presence of biases in the training data used to
train these models. Biases in data can lead to skewed outputs,
reinforcing stereotypes or perpetuating discrimination [13].
For instance, if a language model is trained on biased text
data, it may inadvertently generate biased or discriminatory
content.
Ethical considerations extend beyond biases and encompass
the potential misuse of LLMs. These models, with their
powerful language generation capabilities, raise concerns
about the generation of misleading or malicious content.
The risk of deepfakes, automated misinformation, and the
creation of harmful narratives underscore the importance
of responsible AI development practices [14]. Striking a
balance between innovation and ethical considerations is
crucial, necessitating transparency in model development,
the implementation of ethical guidelines, and ongoing efforts
to address these concerns [15].

3. Methodology

3.1 Model Architectures

1. Deep dive into GPT-3 architecture

Fig. 61.1 GPT-3 architecture

GPT-3 (Generative Pre-trained Transformer 3) (refer Fig. 61.1) stands as a pinnacle in large language models, representing a breakthrough in natural language understanding and generation. The architecture of GPT-3 is rooted in the transformer model, featuring attention mechanisms that allow it to capture intricate contextual dependencies in vast amounts of text data. With an unprecedented 175 billion parameters, GPT-3 is organized into 96 transformer layers, enabling it to process and generate human-like text with remarkable coherence and versatility. Each layer contributes to the model's ability to understand context, utilizing self-attention mechanisms to weigh the importance of different words in a given sequence. The sheer scale of GPT-3 allows it to exhibit few-shot learning capabilities, where the model can perform new tasks with minimal task-specific training data, showcasing the effectiveness of pre-training on diverse linguistic contexts.

2. Comparison with other prominent models

In comparing GPT-3 with other prominent language models, it is essential to consider alternative architectures and their strengths. BERT (Bidirectional Encoder Representations from Transformers), for instance, differs in its approach by employing bidirectional training. BERT considers both left and right context during training, enhancing its ability to understand context. XLNet, another notable model, introduces permutation-based language modeling, allowing it to capture long-range dependencies in text. While GPT-3 focuses on unidirectional language modeling, it excels in generating coherent and contextually relevant text. Each model has unique advantages and trade-offs. GPT-3's massive scale facilitates diverse applications, while BERT's bidirectional training is advantageous for tasks requiring a deeper understanding of context. Understanding these architectural nuances is crucial for selecting the most suitable model for specific natural language processing tasks.
3.2 Training Data and Preprocessing

1. Importance of diverse and representative datasets

The success of large language models is intricately tied to the quality and diversity of the training data. Diverse and representative datasets are crucial for ensuring that the models generalize well across various linguistic contexts and demographics. GPT-3, in particular, benefits from pre-training on vast and diverse text data from the internet, allowing it to learn intricate language patterns and nuances. Access to diverse data helps the model comprehend a wide array of topics, improving its performance in natural language understanding and generation tasks. However, challenges arise in curating such datasets, as biases present in the data can be inadvertently learned by the model, leading to biased outputs. Therefore, careful consideration and preprocessing of training data are essential to mitigate biases and ensure the model's ethical deployment.
2. Strategies for mitigating biases in training data

Mitigating biases in training data is a critical aspect of responsible AI development. Strategies include thorough data preprocessing to identify and rectify biased patterns, adversarial training to expose the model to edge cases and potential biases, and the incorporation of ethical guidelines during dataset curation. Additionally, algorithmic fairness considerations play a vital role in addressing biases in training data. It involves developing models that not only perform well but also exhibit fairness and transparency in their outputs. While no approach can completely eliminate biases, these strategies contribute to minimizing their impact and fostering the development of more ethical and unbiased large language models.

3.3 Evaluation Metrics

1. Assessing language models' performance

Evaluating the performance of language models is a multifaceted process involving various metrics tailored to specific tasks. Common evaluation metrics include perplexity, which measures the model's ability to predict a sequence of words; BLEU score, which assesses the quality of machine-generated text compared to human references; and F1 score, particularly relevant for tasks like question-answering. For large language models like GPT-3, success is often measured by their ability to generate coherent and contextually relevant text across diverse prompts. The evaluation process typically involves fine-tuning the model on specific tasks and assessing its performance against benchmark datasets (refer Fig. 61.2) [32]. However, the adequacy of existing evaluation metrics is an ongoing discussion, with researchers exploring new approaches to capture the intricacies of language understanding and generation.

Fig. 61.2 LLM finetuning process
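As a concrete illustration of the first metric above, the short sketch below computes perplexity as the exponential of the average negative log-likelihood over a token sequence. The probabilities are made-up stand-ins, not outputs of any model discussed here.

import math

def perplexity(token_probs):
    # Perplexity = exp of the average negative log-likelihood per token.
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Probabilities a hypothetical model assigned to each observed token.
print(perplexity([0.25, 0.10, 0.50, 0.05]))  # approximately 6.32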
2. Challenges and limitations in evaluation

Despite the utility of traditional evaluation metrics, challenges and limitations persist in assessing language models, especially large ones like GPT-3. One significant challenge is the lack of standardized benchmarks that comprehensively cover the diverse range of natural language processing tasks. Models may excel in certain areas but struggle in others, making it challenging to provide a holistic evaluation. Another limitation lies in the difficulty of evaluating the models' understanding of context, subtleties, and potential biases in generated text. The interpretability of models, or lack thereof, poses a hurdle in understanding their decision-making processes. Additionally, over-reliance on benchmark datasets can lead to overfitting, where models perform well on specific tasks but struggle with real-world variations. As the field progresses, addressing these challenges is crucial for refining evaluation methodologies and ensuring a more accurate representation of large language models' capabilities.

In conclusion, an in-depth exploration of the methodology behind large language models involves understanding their architectures, comparing them with alternative models, emphasizing the importance of diverse training datasets, implementing strategies to mitigate biases, and critically assessing the challenges and limitations in their evaluation.

4. Case Studies

4.1 Real-world Applications

1. Healthcare: Diagnosis and Medical Research

In the realm of healthcare, large language models (LLMs) have demonstrated substantial impact, particularly in the areas of diagnosis and medical research. GPT-3, with its exceptional language generation capabilities, has been employed to enhance medical literature review. By processing vast amounts of textual data, GPT-3 assists healthcare professionals in staying updated with the latest research findings and streamlining the extraction of relevant insights [14]. Moreover, LLMs contribute to medical diagnosis by analyzing patient records, clinical notes, and research papers. These models aid physicians in identifying patterns and gaining contextual information, ultimately improving the
accuracy and efficiency of diagnostic processes. For instance, studies have shown successful applications of LLMs in identifying rare diseases by leveraging their language understanding capabilities [15]. The deployment of LLMs in healthcare showcases their potential to revolutionize information processing, ultimately leading to more effective medical decision-making.

2. Education: Intelligent Tutoring Systems

Large language models play a pivotal role in shaping the future of education, particularly through the development of intelligent tutoring systems. GPT-3 and similar models offer personalized learning experiences by adapting to individual learning styles. These systems utilize the language understanding capabilities of LLMs to provide tailored explanations and responses to student queries, fostering a more engaging and effective learning environment [16]. The versatility of these models enables them to assist students across various subjects, making education more inclusive. Success stories in education involve the improved academic performance of students using intelligent tutoring systems powered by LLMs. These systems not only enhance the learning experience by providing real-time feedback but also cater to diverse learning styles, addressing the unique needs of individual students [17].

3. Business: Chatbots and Customer Service

In the business sector, the integration of large language models has redefined customer interactions through the development of advanced chatbots. GPT-3, with its language generation capabilities, enables chatbots to engage in natural and contextually relevant conversations with customers. This has profound implications for customer service, streamlining interactions and improving overall efficiency [18]. Success stories in business applications include instances where chatbots powered by LLMs have significantly reduced response times, enhanced customer satisfaction, and provided scalable solutions for handling a wide range of customer queries [19]. These applications demonstrate the transformative potential of large language models in business operations, particularly in sectors where efficient and responsive customer service is critical.

Table 61.1 Comparison of large language models

Model | Parameters | Architecture | Applications
GPT-3 | 175 billion | Transformer-based, 96 attention heads | Healthcare (medical literature review), Business (chatbots), Natural Language Processing (NLP) tasks
BERT | 340 million (BERT-large) | Transformer-based, 16 attention heads | NLP, Sentiment Analysis, Question Answering
OpenAI GPT-2 | 1.5 billion | Transformer-based, 48 attention heads | Content generation, Text completion
XLNet | 340 million (XLNet-large) | Transformer-based, Permutation Language Model (PLM) | NLP, Machine Translation, Question Answering
T5 (Text-To-Text) | 11 billion | Transformer-based, 24 attention heads | Language translation, Text summarization

4.2 Success Stories and Challenges

1. Highlighting Instances of Successful Implementation

Successful implementations of large language models are evident across various domains. In healthcare, the success story lies in the efficient extraction of medical insights and the improvement of diagnostic processes. For instance, a study utilizing GPT-3 for medical literature review demonstrated its ability to generate relevant summaries, facilitating quicker access to valuable information [20]. In education, success stories involve improved learning outcomes and personalized experiences for students using intelligent tutoring systems based on LLMs. These systems have showcased their adaptability across different subjects, positively impacting students' academic performances [21]. Business applications highlight the success of chatbots in improving customer service interactions, with instances of reduced response times and increased customer satisfaction [22]. These success stories underscore the versatile applications of large language models, showcasing their potential to enhance processes and services across diverse domains.

2. Addressing Challenges Faced in Different Domains

Despite their successes, large language models face challenges in implementation. In healthcare, challenges include ensuring the ethical use of patient data and addressing concerns related to the interpretability of AI-driven diagnostic decisions [23]. Bias in training data is a persistent challenge across domains, leading to concerns about the fairness and equity of AI applications. In education, challenges include the continuous adaptation of tutoring systems to evolving curricula and the need for addressing diverse learning styles [24]. In business, challenges involve ensuring the ethical deployment of chatbots, addressing potential biases in customer interactions, and maintaining transparency in automated decision-making processes [25]. Addressing these challenges requires a multidisciplinary approach, involving collaboration between researchers, developers, policymakers, and domain experts to establish ethical guidelines and refine methodologies for responsible AI deployment.

In summary, the real-world applications of large language models in healthcare, education, and business highlight their
transformative potential. Success stories underscore improved processes and services, while challenges necessitate ongoing efforts to ensure ethical and responsible implementation across diverse domains.

5. Implications and Future Directions

5.1 Societal Impact

1. Changes in Communication and Information Consumption

The deployment of large language models (LLMs) has ushered in profound changes in how society communicates and consumes information. With the rise of advanced natural language processing capabilities in models like GPT-3, individuals experience more personalized and context-aware interactions in online communication platforms. Conversational agents powered by LLMs have altered the dynamics of human-computer interaction, enabling more intuitive and natural conversations. This shift impacts not only social media interactions but also extends to customer service, education, and content generation. The democratization of information is another societal impact, as LLMs facilitate easy access to vast amounts of data and insights, transforming the way individuals acquire knowledge and make decisions. As these communication and information consumption patterns evolve, it becomes imperative to understand and navigate the challenges and opportunities they present [26].

2. Economic and Industrial Implications

The economic and industrial landscape is undergoing significant transformations due to the integration of large language models. In sectors such as content creation, journalism, and marketing, LLMs contribute to the automation of tasks like writing articles, generating marketing content, and even composing code snippets. This automation has the potential to increase efficiency and reduce costs for businesses. However, it also raises questions about the future of certain job roles and the need for upskilling in the workforce. The advent of LLMs also influences the development of new products and services, fostering innovation in areas such as virtual assistants, automated translation services, and more. Policymakers and industry leaders need to carefully navigate these changes to ensure inclusive economic growth and address potential challenges related to job displacement and unequal access to emerging opportunities [27].

5.2 Ethical Considerations

1. Responsible AI Development and Deployment

Ethical considerations are paramount in the development and deployment of large language models. Responsible AI practices involve transparency in the design and decision-making processes of these models. Clear communication about the capabilities and limitations of LLMs is essential to managing user expectations and avoiding unintended consequences. Developers and researchers must prioritize user privacy, ensuring that data used to train these models is handled ethically and that user consent is obtained for any data collection. Additionally, ongoing monitoring and auditing of LLMs are critical to identifying and addressing ethical concerns that may arise over time [28].

2. Mitigating Biases and Ensuring Fairness

Mitigating biases in large language models is a key ethical consideration. Biases present in training data can be learned by the model, leading to unfair or discriminatory outcomes. Researchers and developers need to implement strategies for identifying and mitigating biases during the training process. This includes careful curation of diverse and representative datasets, the development of algorithms that account for potential biases, and ongoing evaluation of model outputs for fairness. Furthermore, there is a need for industry-wide standards and guidelines to ensure that ethical considerations are consistently addressed across different applications of large language models [29].

5.3 Future Developments

1. Trends in Large Language Model Research

The field of large language models is expected to witness several trends in the coming years. One prominent trend is the exploration of more efficient and environmentally sustainable training methodologies. The energy consumption associated with training large models has raised concerns, leading researchers to investigate methods for reducing environmental impact without compromising performance. Another trend is the development of models that prioritize interpretability, enabling users to understand and trust the decision-making processes of these complex models. Ongoing research also focuses on improving the fine-tuning process, allowing models to adapt more effectively to specific tasks and domains [30].

2. Potential Breakthroughs and Advancements

Anticipated breakthroughs in large language model research include advancements in unsupervised learning, enabling models to learn from unlabeled data more effectively. This could lead to even greater generalization capabilities and improved performance across diverse tasks (refer Table 61.2). Innovations in natural language understanding, contextual reasoning, and multilingual capabilities are also areas of active exploration. Additionally, the integration of large language models with other AI technologies, such as computer vision and reinforcement learning, holds the potential for creating more comprehensive and versatile AI systems. As research progresses, collaborations between academia, industry, and policymakers will be crucial to navigating the ethical, societal, and technological challenges posed by these potential breakthroughs [31].

Table 61.2 Generalization capabilities of LLMs

Model | Training Datasets | Evaluation Datasets | Generalization Performance
GPT-3 | Broad domain text data | Diverse benchmark datasets | High generalization across various NLP tasks and domains
BERT | Wikipedia, BookCorpus | GLUE, SQuAD, MNLI | Strong performance on diverse NLP benchmarks, transferable features
OpenAI GPT-2 | Web pages, Books | LAMBADA, CNN/Daily Mail | Effective generalization to tasks with different contextual cues
XLNet | Books, Wikipedia, ClueWeb09 | RACE, SQuAD, LAMBADA | Improved performance on tasks requiring contextual understanding
T5 (Text-To-Text) | C4 dataset, English Web | SuperGLUE, CNN/Daily Mail | Achieves state-of-the-art results on various NLP benchmarks
In conclusion, the implications and future directions of large language models encompass a broad spectrum of societal, ethical, and technological considerations. Balancing the benefits of technological innovation with ethical responsibility is central to shaping a future where large language models contribute positively to society.

6. Conclusion

6.1 Recapitulation of Key Findings

In conclusion, the exploration of large language models (LLMs) and their impact on artificial intelligence has revealed significant insights. Key findings include the transformative influence of models like GPT-3 on natural language processing, showcasing their prowess in diverse applications such as healthcare, education, and business. The historical overview highlighted the evolution of artificial intelligence, from its inception in the 1950s to the recent paradigm shift fueled by breakthroughs in machine learning and deep learning. The examination of GPT-3's architecture and its comparison with other models elucidated the technical nuances that contribute to its unparalleled language generation capabilities. The discussion on training data underscored the importance of diverse and representative datasets, acknowledging the challenges of biases in data and strategies for mitigation. Evaluation metrics and challenges emphasized the ongoing efforts to refine methodologies for assessing the performance and understanding the limitations of LLMs.

6.2 Summary of Contributions

This research paper contributes to the understanding of the multifaceted landscape of large language models, amalgamating technical, societal, and ethical dimensions. The case studies illustrated the real-world applications of LLMs, exemplifying their impact in healthcare diagnosis, intelligent tutoring systems in education, and chatbots for enhanced customer service in business. Success stories highlighted instances of improved efficiency, personalized learning experiences, and streamlined customer interactions. Simultaneously, challenges in different domains underscored the importance of ethical considerations, responsible AI development, and continuous efforts to address biases. The implications and future directions section outlined the societal impact of LLMs, delving into changes in communication, economic ramifications, and ethical considerations. It also provided a glimpse into the potential breakthroughs and advancements expected in large language model research.

6.3 Recommendations for Future Research

As we look to the future, several recommendations for further research emerge. First and foremost, there is a need for continued exploration into mitigating biases and ensuring fairness in large language models, as ethical considerations remain paramount. Future research should focus on refining evaluation metrics to better capture the nuanced capabilities and limitations of LLMs, addressing challenges related to context understanding, interpretability, and potential biases in generated text. Furthermore, investigating more sustainable and environmentally friendly training methodologies is essential to minimize the ecological footprint associated with large-scale model training. Collaborative efforts between academia, industry, and policymakers are crucial to establish standardized practices.

In conclusion, this research advances our understanding of large language models, emphasizing their transformative potential, ethical considerations, and the intricate interplay between technological advancements and societal implications.

References

1. McCarthy, J., Minsky, M. L., Rochester, N., & Shannon, C. E. (1955). A proposal for the Dartmouth summer research project on artificial intelligence. AI Magazine, 27(4), 12-14.
2. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).

3. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.
4. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
5. Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., & Le, Q. V. (2019). XLNet: Generalized autoregressive pretraining for language understanding. In Advances in neural information processing systems (pp. 5753-5763).
6. Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., ... & Liu, P. J. (2019). Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683.
7. See, A., Liu, P. J., & Manning, C. D. (2017). Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1073-1083).
8. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).
9. Gao, J., Dolan, B., & Chen, W. (2018). Content-aware neural conversation models. arXiv preprint arXiv:1812.10687.
10. Bender, E. M., & Friedman, B. (2018). Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6, 587-604.
11. Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora necessarily contain human biases. Science, 356(6334), 183-186.
12. Zellers, R., Holtzman, A., Rashkin, H., Bisk, Y., Farhadi, A., Roesner, F., & Choi, Y. (2019). Defending against neural fake news. arXiv preprint arXiv:1905.12616.
13. Jobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. Nature Machine Intelligence, 1(9), 389-399.
14. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.
15. Gehrmann, S., Dernoncourt, F., Li, Y. A., Carlson, E. T., Wu, J., & Farri, O. (2020). Comparing rule-based and deep learning models for patient phenotyping: a case study on IBD in electronic health records. arXiv preprint arXiv:2005.13531.
16. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).
17. Hsu, Y. L., Cholleti, S. R., & Lee, Y. F. (2018). Evaluating the effectiveness of intelligent tutoring systems: A case study in high school mathematics. Computers & Education, 116, 72-88.
18. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.
19. Gao, J., Dolan, B., & Chen, W. (2018). Content-aware neural conversation models. arXiv preprint arXiv:1812.10687.
20. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.
21. Hsu, Y. L., Cholleti, S. R., & Lee, Y. F. (2018). Evaluating the effectiveness of intelligent tutoring systems: A case study in high school mathematics. Computers & Education, 116, 72-88.
22. Gao, J., Dolan, B., & Chen, W. (2018). Content-aware neural conversation models. arXiv preprint arXiv:1812.10687.
23. Kulkarni, S., Seneviratne, M. G., & Soh, C. B. (2020). Ethical implications of using AI in clinical diagnosis: A scoping review. IEEE Transactions on Technology and Society.
24. Kulkarni, S., Seneviratne, M. G., & Soh, C. B. (2020). Ethical implications of using AI in clinical diagnosis: A scoping review. IEEE Transactions on Technology and Society.
25. Hajian, S., Bonchi, F., & Castillo, C. (2016). Algorithmic bias: From discrimination discovery to fairness-aware data mining. Data Mining and Knowledge Discovery, 30(3), 815-847.
26. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.
27. Gao, J., Dolan, B., & Chen, W. (2018). Content-aware neural conversation models. arXiv preprint arXiv:1812.10687.
28. Jobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. Nature Machine Intelligence, 1(9), 389-399.
29. Buolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 77-91.
30. Kaplan, J., McCandlish, S., Henighan, T., Brown, T., Chess, B., Child, R., ... & Radford, A. (2020). Scaling laws for neural language models. arXiv preprint arXiv:2001.08361.
31. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).
32. https://clive-gomes.medium.com/pre-training-large-language-models-at-scale-d2b133d5e219
Note: All the figures and tables in this chapter were designed by
the author.

62. Several Machine Learning Techniques Used to Forecast Parkinson Disease

O. Sri Nagesh1
Associate Professor, Department of CSE, Anurag University, Hyderabad
B. Rajarao2
Senior Member of Technical Staff, Oracle Inc.
Voore Subrahmanyam3
Associate Professor, Department of CSE, Anurag University, Hyderabad

Abstract: Numerous machine learning algorithms and procedures were applied to the acquired data in the current paper. The methods covered include classification, statistical analysis, assessment, and unsupervised learning methods. Data on PD and non-PD patients are displayed using visualization techniques, including the Receiver Operating Characteristic (ROC), Sieve Multigram, and Self-Organizing Map (SOM), among others. The data is heavily concentrated in the diseased class. Decision Tree, Logistic Regression, SVM, Linear Regression, KNN, Dimensionality Reduction Algorithms, Random Forest, K-Means, Naive Bayes, Gradient Boost, and Ada-boost are a few additional algorithms. These algorithms can be applied to any data problem.
Keywords: PD-Parkinson’s disease, SOM, ROC, KNN

1. Introduction

Parkinson Disease

Understanding the connections between various data sets obtained from various data sources is crucial in a data warehouse and in data mining, and the user needs to have a clear understanding of the knowledge. The Knowledge Discovery in Databases (KDD) process is shown below for a better understanding of this study.

Machine Learning algorithms can be used to analyze diseases like Parkinson's disease (PD) [1]. It is possible to corroborate the clinical analysis of Parkinson Disease (PD) based on neuro-pathologic and histo-pathologic criteria [2]. Since there is no conclusive test for PD, this disorder must be diagnosed entirely using medical criteria. Bradykinesia is the cardinal symptom of the disease, reflecting loss of postural reflexes, and tremors while at rest are typically regarded as the disease's primary signs and symptoms. The proximity and specific existence of these factors help to distinguish Parkinson's disease (PD) from associated issues. Hypomimia, dysarthria, dysphagia, sialorrhea, micrographia, festination, freezing, dystonia, glabellar reflexes, non-motor side effects like autonomic dysfunction, psychological and neurobehavioral abnormalities, sleep disorders, and sensory anomalies such as pain and paresthesia are additional clinical components. Other common symptoms such as rest tremors, early gait difficulties, postural instability, dementia, and the proximity of dysautonomia, ophthalmoparesis, and ataxia are advised to be evaluated in addition to Parkinson's syndrome. The accurate diagnosis of the condition depends on a thorough understanding of the wide variety of clinical manifestations of PD. Genetic changes or mutations, abnormalities in neuroimaging, and other tests could all be possible biomarkers that help locate and identify people who are at risk.

1nagesh.osri@gmail.com, 2rajaraob@yahoo.com, 3voore.subrahmanyam206@gmail.com

DOI: 10.1201/9781003529231-62

Fig. 62.1 Knowledge Discovery in Databases (KDD)

According to Petersen et al. [3], clinical problem-solving classification of PD can be completed using a thorough assessment of the literature data and selection based on the specificity and effectiveness of the hallmark clinical components. Clinical, pathologic, and nosological aspects, with pervasiveness, features, and risk factors in patients, are expected to be researched in prospective clinico-pathologic investigations indicating PD [4]. The execution score of the classifiers has been calculated using neural networks (NN), DM-Neural, regression, and decision trees [5, 6]. Vocal impairment brought on by PD impairs speech, motor skills, and other abilities such as conduct, inclination, feeling, and consideration. An important factor in the early diagnosis of PD is voice estimation-based tele-monitoring of the condition using the standard bootstrapping method or the leave-one-out SVM validation technique.

The identification of Parkinson's syndrome in numerous systems is made possible by the demonstrative and predictive evaluation of several clinical aspects [7]. Two base classifiers, KStar and IBK, along with classification accuracy (ACC), KE, and the ROC curve, provide a conclusion model for PD determination accuracy [11].

Therapeutic biometrics plays a crucial role in diagnosing problems like Parkinson's disease. There are few medications available to treat Parkinson's disease (PD). The system's ability to distinguish between PD and non-PD based on sound (pronunciation) is demonstrated by clustering techniques [8]. High accuracy has been attained on a Parkinson's disease dataset previously predicted using a variety of techniques [9]. The current state of neural-system-based disease prediction makes a significant difference in the prediction of Parkinson's disease [10]. As illustrated in Fig. 62.2, the predominant walking posture of PD patients is delayed compared to a normal individual of the same age group.

Fig. 62.2 A man with PD displaying a flexed walking posture

2. Model Description

In order to express the data input values that are supplied to the model and analyze the model's outputs, models for describing the process are derived from Machine Learning models. Information inquiry makes use of a variety of classification methodologies. Both supervised and unsupervised learning strategies are used. Bayes-Net, Logistic, J48, Simple Logistic, LMT, AD-Tree, K-star, Naive-Bayes, and Random Forest are employed in supervised learning. Receiver Operating Characteristic (ROC) visualization uses parallel coordinates, while classification sieve graphs use hierarchical clustering techniques and SOM.
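As an illustrative sketch of the supervised comparison described above, the following Python code trains scikit-learn analogues of several of the listed classifiers on a hypothetical voice-feature file and reports the percentage of correctly classified instances. The file name, column layout, and split are assumptions; WEKA-specific learners such as KStar, LMT, and ADTree have no direct scikit-learn equivalents.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Hypothetical CSV of voice features with a 'status' column (1 = PD, 0 = non-PD).
df = pd.read_csv("parkinsons_voice.csv")
X, y = df.drop(columns=["status"]), df["status"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "Naive Bayes": GaussianNB(),
    "Logistic": LogisticRegression(max_iter=1000),
    "Decision Tree (J48-like)": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(n_estimators=100),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    print(f"{name}: {acc:.2%} correctly classified")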
2.1 Nominated Methodology

To align the clinical diagnosis of PD, neuropathologic and histopathologic criteria are used [2]. The decision is made in light of the hallmark clinical features' sensitivity and specificity. Clinical, pathologic, and nosologic examinations of the features and risk factors in patients are expected to be explored in potential clinico-pathologic research in representative populations of patients displaying PD [10]. For generating the execution score of the classifiers' accurate diagnosis of PD, Neural Networks, DM-Neural, Regression, and Decision Trees are used [5, 6]. Vocal impairment brought on by PD affects speech, motor function, and other abilities such as behavior, state of mind, feeling, and consideration. Early PD diagnosis relies heavily on voice assessment during tele-monitoring of the illness. Connections between PD attributes are evaluated for relevance and statistical significance using the standard bootstrapping and other validation approaches with Support Vector Machine (SVM) for creating a classification [12].

2.2 The Models Proposed in this Paper

1. Bayes Net
2. Naïve Bayes
3. Logistic
4. Simple Logistic
5. KStar
6. ADTree
7. J48
8. LMT
9. Random Forest

A few of the models have already been examined; the remainder will be presented in subsequent sections. The dataset that was recovered includes unsupervised learning techniques for statistical analysis, classification, and evaluation. The IBK algorithm offers diagnostic techniques for identifying characteristics that are likely to indicate the presence of PD.

Data from PD and Non-PD patients were displayed using a visualization technique that utilized the parallel coordinates method. Information was displayed using the smooth parallel coordinates plot visualization technique. The splines smooth plot is what draws attention here, since it maps each observation into a parametric line or curve that is smooth across the axes and orthogonal to every parallel axis. For each information value, this setup emphasizes the quantization level. The data representation for the provided voice dataset is shown in Fig. 62.3. Parkinson's disease is shown to have greater deviations in the parallel coordinates (red).

Fig. 62.3 Parallel coordinates
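A minimal sketch of how such a parallel-coordinates view could be reproduced with pandas is shown below; the feature column names and the 'status' class column (1 = PD, 0 = non-PD) are assumptions about the dataset layout.

import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

df = pd.read_csv("parkinsons_voice.csv")           # hypothetical dataset
cols = ["Fo", "Fhi", "Flo", "jitter", "shimmer"]   # assumed frequency features
parallel_coordinates(df[cols + ["status"]], "status", color=("blue", "red"))
plt.title("Voice features: non-PD (blue, 0) vs PD (red, 1)")
plt.show()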



Red color symbolizes class value 1, which denotes data connected to PD, and blue color denotes class 0, which denotes non-PD data; the coordinates correspond to the basic frequency characteristics.

The Receiver Operating Characteristic (ROC) for the classification algorithms Majority, K-Nearest Neighbour, and SVM is displayed in Fig. 62.4; the ROC is a graphical plot that shows how well a binary classifier system performs. Most algorithms use SVM and k-nearest neighbour. The ROC curve therefore acts as a function of fall-out. The ROC plot has shown specificity at 0.25. K-NN (K-Nearest Neighbour) demonstrated 82.5% precision, while SVM (Support Vector Machines) indicated 88.9% precision taking the ROC results into account. Hence, SVM predicted better results compared to the other two algorithms.

For instance, from Fig. 62.4 we can distinguish that red lines indicate PD and blue lines indicate Non-PD. Some other interesting relations are also shown. Every characteristic has a system of connections, visualized using sieve diagrams demonstrating inter- and intra-associations within the healthy and unhealthy information, such as basic sound attributes. The present work shows the values obtained using other classification methods, as shown in the table below. The curve is made by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) at different threshold settings. Sensitivity relates to recall of the Parkinson's disease (PD) class (value 1) in the machine learning (ML) technique. The false-positive rate (1 - specificity) relates to the Non-Parkinson Disease (Non-PD) class and is also known as fall-out.

In this work, another visualization technique is also used: the Sieve Multigram. Fig. 62.5 shows the correlations among selected features. The Sieve Multigram shows the arrangement of correlated features. Red color indicates negative correlation (class value 1, which indicates PD) and blue color indicates positive correlation (class value 0, which indicates Non-PD). Lines with greater intensity show how strong the bonds between the relationships are.

The values show the correctly classified instances. The accuracy is displayed in Table 62.1 (correctly classified instances). The most accurate algorithm is Random Forest (90.26), which is followed by K-Star (89.74). Naïve Bayes displayed the least accuracy (69.23) on the PD dataset. Bayes Net works on the probability theory concept. It is a direct approach and it allows a rich structure. It is a model built from expert opinion data. In addition to that, it predicts outputs for given inputs, and it supports missing data during learning and classification. Bayes Net has shown an accuracy of 80.00 on the PD dataset, whereas Naïve Bayes shows the least accuracy of 69.23.

Fig. 62.4 ROC for classification algorithms
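The ROC comparison of Fig. 62.4 can be sketched with scikit-learn as follows; this reuses the X_tr, X_te, y_tr, y_te split from the earlier classifier sketch and is illustrative rather than a reproduction of the reported 82.5% and 88.9% figures.

import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import roc_curve, auc

# X_tr, X_te, y_tr, y_te: the split prepared in the earlier sketch.
for name, clf in {"K-NN": KNeighborsClassifier(n_neighbors=5),
                  "SVM": SVC(probability=True)}.items():
    clf.fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]   # probability of the PD class
    fpr, tpr, _ = roc_curve(y_te, scores)    # fall-out vs. sensitivity
    plt.plot(fpr, tpr, label=f"{name} (AUC = {auc(fpr, tpr):.3f})")
plt.plot([0, 1], [0, 1], "k--")              # chance diagonal
plt.xlabel("False Positive Rate (1 - specificity)")
plt.ylabel("True Positive Rate (sensitivity)")
plt.legend()
plt.show()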



Fig. 62.5 Sieve graph

Table 62.1 Classified instances based on algorithms

Algorithm | Correctly Classified Instances (%)
Bayes Net | 80.00
Naive Bayes | 69.23
Logistic Regression | 83.66
Simple Logistic | 84.61
KStar | 89.74
ADTree | 86.15
J48 | 80.51
LMT | 86.15
Random Forest | 90.26

Another visualization method used in this work is the hierarchical clustering method. This classifier has two strategies:

1. Agglomerative: This is a "bottom-up" approach; every observation begins in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
2. Divisive: All observations start in the same cluster in this "top-down" approach, and splits are performed recursively as one moves down the hierarchy.

By using the hierarchical method, the instance values are shown below. Fig. 62.6 demonstrates that there are more clusters in the healthy dataset for fundamental frequency and fewer clusters in the diseased data. The Fo attribute in the healthy dataset has a range of 95.7 to 252.45 (blue coloured), while the diseased dataset has a range of 144.18 to 202.63 (red coloured).

The Self-Organizing Map (SOM) is a type of unsupervised learning. The objective is to find some hidden structure in the data. The nodes in a SOM, called neurons, are each linked with a weight vector of the same dimension as the input data vectors and with a position in the map projection. The standard arrangement of nodes is two-dimensional, with rectangular or hexagonal grid spacing. The grid maps the high-dimensional input space to a low-dimensional representation. The weights draw a two-dimensional view indicating diseased and non-diseased PD values, as follows. The concept of clustering was applied to the collected data; it shows the clusters of PD and Non-PD people, as shown in Fig. 62.6. In this work, the voice data was collected by recording the vowels A, E, I, O, and U.
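The following minimal NumPy sketch trains a small Self-Organizing Map of the kind described above: a rectangular grid of neurons whose weight vectors are pulled toward the input data with a decaying learning rate and a shrinking neighbourhood. The grid size and schedule are illustrative choices, not the settings used in this chapter.

import numpy as np

def train_som(data, rows=10, cols=10, epochs=20, lr0=0.5, sigma0=3.0):
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1)
    order = np.tile(np.arange(len(data)), epochs)
    rng.shuffle(order)
    steps = len(order)
    for t, i in enumerate(order):
        x = data[i]
        lr = lr0 * np.exp(-t / steps)        # decaying learning rate
        sigma = sigma0 * np.exp(-t / steps)  # shrinking neighbourhood radius
        # Best-matching unit: the neuron whose weight vector is closest to x.
        bmu = np.unravel_index(np.argmin(((weights - x) ** 2).sum(axis=-1)),
                               (rows, cols))
        dist2 = ((grid - np.array(bmu)) ** 2).sum(axis=-1)
        h = np.exp(-dist2 / (2 * sigma ** 2))  # Gaussian neighbourhood
        weights += lr * h[..., None] * (x - weights)
    return weights

# Example: som = train_som(voice_features)  # voice_features: (n, d) array (assumed)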

Fig. 62.6 Hierarchical clustering for fundamental frequency (Fo) attributes

Fig. 62.7 Hierarchical clustering dendrogram showing PD and non-PD data values: 0 - Non-PD (blue colored) and 1 - PD (red colored)

Hierarchical clustering (Fig. 62.7) is appealing because the number of clusters does not need to be specified in advance, and the dendrogram (lines connecting cluster groups) easily illustrates the clustering of the PD and Non-PD patients' data values for the vowels (A, E, I, O, U).
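An agglomerative dendrogram like Fig. 62.7 can be sketched with SciPy as follows, assuming a NumPy feature matrix of the recorded vowel data (here called voice_features, a hypothetical name):

import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# voice_features: (n_samples, n_features) array of vowel recordings (assumed).
Z = linkage(voice_features, method="ward")   # bottom-up cluster merging
dendrogram(Z, no_labels=True)
plt.title("Hierarchical clustering of PD and non-PD voice samples")
plt.show()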
The SOM applied to the data relevant to fundamental frequency provides qualitative information about the healthy and PD datasets, as illustrated in Fig. 62.8. The majority of the information belongs to the diseased class.

Fig. 62.8 Self-organized maps (SOM) for fundamental frequency attributes

It is observed that most of the unhealthy class region is likewise occupied by the non-diseased class. Red shading indicates class label 1, which is diseased (PD), and blue shading indicates class 0 for Non-PD.

3. Results and Discussion

After applying various Data Mining classification methods, various results are observed. The values of Table 62.1 are shown in the bar graph of Fig. 62.9, which shows the identification of PD data (%). It is observed that Random Forest shows more accuracy and Naïve Bayes shows less accuracy on the PD data.

When using visualization approaches, it is discovered that the parallel coordinates plot and the ROC for classification algorithms show crossing lines for diseased data over non-diseased data (Fig. 62.3). In contrast to the other, the majority, SVM, and k-nearest neighbour exhibit curve values that are closer to 1. The Sieve Multigram shows how features are related. Red color indicates negative correlation (class value 1, which indicates PD) and blue color indicates positive correlation (class value 0, which indicates Non-PD). The thickness of lines demonstrates how strong the relationship is, as shown in Fig. 62.5 (sieve multigram). The fundamental frequency attribute's hierarchical clustering has revealed that there are more clusters in the healthy dataset and fewer in the diseased data. In Self-Organized Maps (SOM), the ranges for the healthy dataset (blue coloured) and the diseased dataset (red coloured) for the fundamental frequency attribute are 95.7 to 252.45 and 144.18 to 202.63, respectively. Qualitative information about healthy and PD data values is provided by the visualization and clustering of the data related to fundamental frequency.

The maximum values are involved in the PD class. It is observed that most of the diseased class region also overlaps the non-diseased class. The red (label 1) class indicates diseased PD, whereas the blue (label 0) class indicates non-PD. K-Star took the least time (0.1 seconds) and Simple Logistic took the most time (1.98 seconds), as shown in Fig. 62.10.

Fig. 62.9 Comparison of different DM algorithm values



Fig. 62.10 Time taken to Execute DM algorithms

4. Conclusion

Voice data analysis is vital today for understanding and diagnosing human disorders. This work addresses the diagnosis of Parkinson's disease (PD) utilizing voice datasets and machine learning methods. It concludes that there are variations between the voices of diseased and non-diseased individuals, which helps to predict whether a voice belongs to a diseased person or not. Hence, earlier detection of Parkinson's disease is possible through data mining and machine learning algorithms.

References

1. Shianghau, W., Jiannjong, G.: A Data Mining Analysis of the Parkinson's disease. IB 3(1), 71–75 (2011).
2. D.J. Gelb, E. Oliver, and S. Gilman, "Diagnostic criteria for Parkinson disease." Archives of Neurology, vol. 56, no. 1, pp. 33, 1999.
3. Petersen, R. C., J. C. Stevens, M. Ganguli, E. G. Tangalos, J. L. Cummings, and S. T. DeKosky. "Practice parameter: Early detection of dementia: Mild cognitive impairment (an evidence-based review). Report of the Quality Standards Subcommittee of the American Academy of Neurology." Neurology 56, no. 9 (2001): 1133-1142.
4. D. Aarsland, K. Andersen, J.P. Larsen, and A. Lolk, "Prevalence and characteristics of dementia in Parkinson disease: an 8-year prospective study." Archives of Neurology, vol. 60, no. 3, pp. 387, 2003.
5. R. Das, "A Comparison of multiple classification methods for diagnosis of Parkinson disease." Expert Systems with Applications, vol. 37, no. 2, pp. 1568-1572, 2010.
6. R. Polikar, A. Topalis, D. Green, J. Kounios, and C. M. Clark, "Comparative multiresolution wavelet analysis of ERP spectral bands using an ensemble of classifiers approach for early diagnosis of Alzheimer's disease." Computers in Biology and Medicine, vol. 37, no. 4, pp. 542-558, 2007.
7. Högl, B., G. Wenning, and W. Poewe (Department of Neurology and Department of Biostatistics, University of Innsbruck, Austria), "Modafinil for the Treatment of Daytime Sleepiness in Parkinson's Disease." SLEEP, Vol. 25, No. 8, 2002.
8. P. F. Guo, P. Bhattacharya, and N. Kharma, "Advances in detecting Parkinson's disease."
9. Kaladhar, "Intelligent Parkinson Disease Prediction Using Machine Learning Algorithms." International Journal of Engineering and Innovative Technology (IJEIT), Volume 3, Issue 3, September 2013, ISSN: 2277-3754.
10. D. Aarsland, K. Andersen, J.P. Larsen, and A. Lolk, "Prevalence and characteristics of dementia in Parkinson disease: an 8-year prospective study." Archives of Neurology, vol. 60, no. 3, pp. 387, 2003.
11. De Nicola, Arianna, Simone Gitto, and Paolo Mancuso. "Uncover the predictive structure of healthcare efficiency applying a bootstrapped data envelopment analysis." Expert Systems with Applications 39, no. 12 (2012): 10495-10499.
12. D.J. Gelb, E. Oliver, and S. Gilman, "Diagnostic criteria for Parkinson disease." Archives of Neurology, vol. 56, no. 1, pp. 33, 1999.
13. Dengler, R. (2006). "Perception of emotional speech in Parkinson's disease."
14. De Cheveigné, Alain, and Hideki Kawahara. "YIN, a fundamental frequency estimator for speech and music." The Journal of the Acoustical Society of America 111, no. 4 (2002).
15. Doddington, G. R. (1985). Speaker recognition—identifying people by their voices. Proceedings of the IEEE, 73(11), 1651-1664.
16. F. Åström, and R. Koker, "A parallel neural network approach to prediction of Parkinson's Disease." Expert Systems with Applications, vol. 38, no. 10, pp. 12470-12474, 2011.

Note: All the figures and tables in this chapter were designed by the author.

63. Fungal Disease Risk Assessment using Data-Driven Methods: Impacts on Food Security and Crop Devastation

Kamidi Jeswanth Kumar*
Research Scholar, GIET University, Gunupur

Abstract: Rice is one of the most significant staple crops, especially in India. It provides daily calories and carbohydrates to approximately half the population. Additionally, many Indian villages grow this plant as a crop in order to make money from exports. Every year, millions of tons of rice are shipped around the world and eaten. The fungus Pyricularia grisea, which causes the disease known as rice blast, attacks the rice plant at the seedling stage and frequently causes significant damage every year, reducing crop growth. Rice disease is a major concern for farmers. In order to know where to look and what to look for, it is important to be aware of the many plant parts that are susceptible to illness before discussing diseases in paddy. The study explores the use of data-driven methods to assess the risk of fungal diseases, highlighting their potential impact on food security and crop devastation. In this paper, we suggest combining computer vision and machine learning to identify the fungal illness affecting rice harvests. The fungal infection dataset is used in the evaluation research of image classification models, including K-NN, SVM, and CNN.
Keywords: Crop devastation, CNN, Food security, Fungal disease, K-NN, Pyricularia grisea, SVM, Risk assessment, etc.

1. Introduction

Agriculture played a major role in the rise of sedentary human civilization since it produced excess food for urban dwellings. Fisheries, aquaculture, forestry, and the production of crops and animals are all included in agriculture, according to Kavitha (2018) [3]. The creation of a system for sustainable agriculture could be considerably aided by the use of current technologies. It is now possible to boost agricultural inputs and farm output in a way that is both financially and environmentally sustainable because of the growing field of reliable farming. With the use of these techniques and knowledge, agriculture can now provide farming that is both environmentally and economically sustainable while lowering expenses and mistakes. Monitoring agricultural inputs is essential to avoid negative outcomes like decreased production, deteriorating plant health, etc. The main issues in agriculture were fertilizers, pesticides, irrigation and water stress, and yield quality. Most of the time, researching a problem requires knowledge, which might be costly and time-consuming in developing regions. In order to increase crop yield, disease management in agriculture focuses on controlling fungi, bacteria, viruses, viroids, phytoplasmas, protozoa, nematodes, and parasitic plants [5].

An abnormal physiological process that impedes the normal structure of the plant's function, growth, and other activities causes a plant to become ill when a specific causative agent interferes with it on a regular basis. Disruption of one or more of a plant's vital biochemical and physiological systems leads to pathological states and symptoms. Depending on environmental factors, the crops and varieties farmed, the existence and prevalence of crop diseases, and the prevalence of a specific pathogen, these variables change over time.

One technique for correctly and affordably monitoring agricultural data is image processing.

*
jeswanth7.kamidi@gmail.com

DOI: 10.1201/9781003529231-63

Fig. 63.1 Images are converted into multi-dimensional matrices for comparison.

Image processing applications in agriculture can roughly be classified into two categories: those based on imaging techniques and those based on applications. The primary subject here is the application of image processing on farms. Rahul (2023) has demonstrated image processing to be an effective machine vision technology in the agricultural industry [7]. Infrared, hyperspectral, and x-ray imaging methodologies are some of the techniques used to map irrigated land and, more specifically, calculate metrics like plant indexes and tree measures. Based on their visual characteristics and external appearance, all plants and fruits are identified, gathered, and examined to discover any faults. In agricultural plant and vegetable analysis, image processing and machine vision are frequently exploited to handle the aforementioned tests without the use of manual techniques.

2. Proposed Work

Artificial intelligence and computer vision technology enable early pest, disease, or nutrient shortage identification in plants, which then instantly connects the affected plants to an agronomic facility for treatment. In this paper, machine learning, which uses artificial intelligence, enables the computer to operate in an independent learning manner without being explicitly programmed. It is an interesting and complex concept that could affect how technology advances in the future. Image classification is a machine learning application used to identify crop diseases. In this instance, the illness photos are being classified using SVM, K-NN, and neural networks.

The following are the implications of further crop-yield prediction research:
• The design of the K-NN, SVM, and CNN image classification models to capture the time-based interdependence of illness variables
• The model showed that the yield prediction might be generalized to unverified situations without a large loss in forecast accuracy.
• When used in conjunction with the replication technique, the model could show how much the precision of disease predictions and the variation in crop yields are correlated.

2.1 K-Nearest Neighbors (K-NN)

The model being trained finds the approximately nearest neighbours to the input data, according to Bhatia (2010) [1]. Any type of input data format, in this case an image of the rice crop disease, is accepted. For comparison, images are transformed into multi-dimensional matrices. This method identifies the 'K' crop rice disease images (neighbours) that are visually closest to the input image for crop rice disease. A comparison between the input picture vector and the trained model is required to identify the neighbours in a multidimensional plane (R.N.V. Jagan Mohan, 2012) [8]. The search comparison results can be extended to whatever extent we like, depending on the specified value of K. When comparing the input image matrix with the training data vector, one should use the Euclidean distance to calculate the distance between the two points [9].
neural networks. distance between the two points [9].

2.2 Support Vector Machine

One technique that can be applied to both regression and classification issues is the support vector machine (SVM). SVMs are used to categorize images. Before classifying an image with an SVM, extract its features. Edge detection, edge colour values, and even the image's textures are a few examples of these properties. After the features have been extracted, we may feed them into the SVM algorithm developed by K. Dileep Kumar in 2020 [10]. When a machine processes an image, it is viewed as a two-dimensional collection of pixels. If the image is 200 pixels wide and 200 pixels high, for example, the array will be 200 x 200 x 3. The resolution of the image determines the array size. The first two dimensions of a picture are its width and height, while the third dimension is made up of its RGB colour channels. An array value with a possible range of 0 to 255 represents the intensity of a pixel at each place.

To understand SVM, take the example we used for the KNN classifier. Consider fungus-related imagery that also shows features of a rice crop. The SVM method can be used to create a model that accurately determines whether an image depicts a fungal illness in a crop. Prior to testing it, we will first train our model with a sizable number of images of crop rice in various situations so that it may become accustomed to the variety of fungi in the crop characteristics. As a result, the support vector machine creates a decision border between these two sets of fungus data, choosing the extreme examples that reveal the presence of a fungus (Pin Wang, 2021) [6].
technique.
example of a fungus by Pin Wang, 2021 [6].
2. Continue the method described above until the fourth
2.3 Convolutional Neural Network layer, changing the channel size to one of 16, 32, 64, or
128 to extract low-level features from the image.
One of the biggest obstacles in the picture recognition space
is CNN. CNN is used for sequence data without a doubt, but Classification
it is also excellent at sifting through large amounts of visual 3. Smooth output is delivered to a feed-forward neural
data and identifying non-linear correlations. SVMs, which are network with back propagation throughout each
margin classifiers, can support a variety of classifying kernels iteration of the training stage.
[4]. When there are several class labels, SVM has trouble 4. Using the SoftMax Classification approach, a trained
predicting the class labels. While the CNN design naturally model is utilized to classify photos by identifying their
promotes parallelization, SVM is similarly challenging to dominating features. Local or widespread necrosis is
parallelize (Wenfeiiling, 2020) [12]. a common manifestation of a fungus infection. Fungi-
Kernels are not necessary because of the variations in how caused crop illnesses can either obstruct normal growth
neural networks operate. Convolutional neural networks are or lead to expansion, an uncontrolled surge of growth
an exception. A single hidden layer neural network with a by zhang SW, 2015[13].
non-linear activation function is referred to as a “neural

Fig. 63.2 Classification of fungal disease images using CNN


Fungal Disease Risk Assessment using Data-Driven Methods: Impacts on Food Security and Crop Devastation 429

3. Experiment Result Sensitivity: Sensitivity is the ability of an illness to identify


successful cases, also known as the recall rate or true positive
The authors gratefully acknowledge the students, staff, and rate.
authority of Physics department for their cooperation in the Specificity: How well an image classification algorithm can
research. forecast actual negatives in each category that is supported
K-NN, SVM, and CNN are contrasted in terms of accuracy, depends on its specificity.
sensivity, specificity, precision, and F1-score by Wenfeilin, Precision: By dividing the total number of positive samples
2020 [12]. The outcomes of applying the K-NN and SVM by the number of positively detected positive image feature
accuracy metrics from the research to implement the accuracy samples, precision can be computed either appropriately or
measurements for the CNN are as follows: erroneously.
Accuracy: A disease image classification model’s accuracy F1-Score: The F1-score is a statistical measure that averages
is a measure of how well it performs across all classes, which the precision and recall of an image classifier, often used
is useful when every class has equal weight. to compare the performance of two classifiers in detecting
feature vector diseases.

Table 63.1 Graph for comparing algorithms K-NN, SVM, and CNN
Algorithm Accuracy Sensitivity Specificity Precision F1-score
K-NN 0.75 0.79 0.79 0.80 0.80
SVM 0.89 0.88 0.85 0.84 0.85
CNN 0.97 0.98 0.96 0.96 0.98

Fig. 63.3 Graph for comparing algorithms K-NN, SVM, and CNN

The following bar graph displays the various accuracy fungus illnesses affecting rice harvests. The study uses image
measures for the three procedures (K-NN, SVM, and CNN), classification models like K-NN, SVM, and CNN.
including accuracy, sensitivity, specificity, precision, and F1­
score. CNN achieved the greatest outcomes when comparing References
these procedures, according to Zhang (2018) [14].
1. Bhatia, N and Vandana: Survey of nearest neighbor
techniques, International Journal of Computer Science and
4. Conclusion Information Security, vol. 8, pp. 302-305, 2010.
Rice, a staple crop in India, is heavily impacted by the 2. Hasan, M., S. Ullah, M. J. Khan, and K. Khurshid:
Pyricularia grisea-related fungus, rice explosion. This Comparative analysis of SVM, ANN and CNN for classifying
fungus damages the seedling stage of rice plants, causing vegetation species using hyperspectral thermal infrared data.
International Archives of the Photogrammetric, Remote
crop damage and decreased growth. Farmers are primarily
Sensing and Spatial Information Sciences - ISPRS Archives
responsible for the disease. A study suggests using data- 42 (2/W13):1861–68, doi: 10.5194/isprs-archives-XLII­
driven methods to assess fungal disease risk and identify 2-W13-1861-2019.
430 Algorithms in Advanced Artificial Intelligence

3. Kavitha B C, Shilpa D P, Thanushree K S, Swathi A M, 10. R.N.V. Jagan Mohan and R. Subbarao and Kurra Raja Sekhara
Ranjitha M K: Agricultural Crop Monitoring Sensors using Rao: Efficient K-Means Cluster Reliability on Ternary Face
IoT-A Study, International Journal of Engineering Research & Recognition using Angle Oriented Approach, Published in
Technology (IJERT) ISSN: 2278-0181,Vol-6,Issue-13,2018. International Journal of Informatics and Communication
4. Kasian Myagila & Hassan Kilavo: A Comparative Technology (IJ-ICT) Vol.2, No.1, January 2013, pp. 180-187
Study on Performance of SVM and CNN in Tanzania ISSN: 2252-8776, http://dx.doi.org/10.11591/ij-ict.v2i1.1779.
Sign Language Translation Using Image Recognition, 11. R.N.V.Jagan Mohan: Machine Learning approach for corona
Applied Artificial Intelligence, 36:1, 2005297, DOI: virus disease extrapolation: A case study, International Journal
10.1080/08839514.2021.2005297, 2022. of Knowledge-based and Intelligent Engineering Systems,
5. Mishra, S.; Sachan, R.; Rajpal, D: Deep Convolutional Neural Vol-26,219-227, ISSN: 1327-2314(print), 1875-8827(online)
Network based Detection System for Real-time Corn Plant DOI: 10.3233/KES-220015, 2022.
Disease Recognition, Procedia Computer Science, 167, 2003– 12. Rashid Agha, R. A., N. S. Al Muhammed, and P. Fattah.: A
2010, 2020. comprehensive study on sign languages recognition systems
6. Pin Wang, En Fan, eng Wang: Comparative analysis of using (SVM, KNN, CNN and ANN), ACM International
image classification algorithms based on traditional machine Conference Proceeding Series. doi:10.1145/3279996.328002
learning and deep learning, https://doi.org/10.1016/j.patrec. 4,2018.
2020.07.042, Pattern Recognition Letters, Volume 141, Pages 13. Wenfei Liu; Jingcheng Wei; Qingmin Meng: Comparisons
61-67, January 2021. on KNN, SVM, BP and the CNN for Handwritten
7. Rahul Subhash Gaikwad and Sharanabasappa C.Gandage: Digit Recognition, IEEE Xplore, DOI: 10.1109/
Image Sentiment Classification Using Deep Convolutional AEECA49918.2020.9213482, 25-27, August 2020.
Neural Network Models, Jounral of Data Acquisition and 14. Zhang S.W., Shang Y.J., Wang L: Plant disease recognition
Processing, ISSN: 1004-9037, https://sjcjycl.cn/DOI: based on plant leaf image, Journal Anim. Plant Science,
10.5281/zenodo.7923136, Vol. 38 (3), Page No: 1279-1300, 25:42–45, 2015.
2023. 15. Zhang, X.; Qiao, Y.; Meng, F.; Fan, C.; Zhang, M. Identification
9. R.N.V.Jagan Mohan and Uppala Narendranath Gadaee: Face of Maize Leaf Diseases Using Improved Deep Convolutional
Recognition using Unsupervised Images through Discretionary Neural Networks,IEEE Access, 6, 30370–30377,2018.
based Security, International Journal of Advanced Computer
Note: All the figures and tables in this chapter were designed by
and Mathematical Sciences, ISSN: 2230-9624, Vol 3, Issue
the author.
1, 2012, pp. 181-185, Publisher: I International, http://
bipublication.com, ICV-71.03, h-Index:25, i10:Index:99.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4

Redefining Glaucoma Identification using


State-of-the- Art Machine Learning 64

D. Ratna Giri1
Associate Professor
SRKR Engineering College (A), Bhimavaram, AP
Dept of Information Technology
P. Syamala Rao2
Assistant Professor
Dept of Information Technology
SRKR Engineering College (A), Bhimavaram, AP
J. V. Rama Kumar3,
Assistant Professor
Dept of CSE
SRKR Engineering College (A), Bhimavaram, AP
JMSV Ravi Kumar4
Associate Professor
Dept of Information Technology

Abstract: Glaucoma is a group of eye diseases that can cause irreversible damage to the optic nerve, leading to eventual
blindness. It is critical to recognise the signs of the sickness promptly, as they manifest over time. Although glaucoma currently
has no cure, it is often possible to safeguard one’s eyesight and avoid further damage by intervening early. Advanced machine
learning techniques are transforming the diagnosis of glaucoma. An advanced approach for the early diagnosis and categorization
of glaucoma is the target of this research. Machine learning may greatly improve preventive healthcare, which is our main goal.
Data exploration and preprocessing, feature engineering and visualisation, and model evaluation using classifiers including
logistic regression, decision trees, random forests, and support vector machines were all part of the study to help understand
glaucoma patterns and their details. We assess the precision of the experimental result by using it in conjunction with a number
of machine learning methods.
Keywords: Decision tree, Early detection, Eye disorder, Glaucoma, Logistic regression, Random forest, Support vector
machine

1. Introduction which in turn causes permanent vision loss or blindness [2].


Glaucoma can affect both eyes in most cases. Open-angle
Loss of vision can occur as a result of a variety of eye diseases glaucoma can moderately damage one eye, and it increases the
known collectively as glaucoma [1]. Untreated or improperly likelihood of developing closed-angle glaucoma within five
controlled intraocular pressure (IOP) causes this condition, to ten years [3]. Glaucoma affects about 3 million Americans,

1
drsrkrit@gmail.com, 2peketi.shyam@gmail.com, 3jvramakumar@gmail.com, 4jmsvravikumar@gmail.com

DOI: 10.1201/9781003529231-64
432 Algorithms in Advanced Artificial Intelligence

ranking it as the second-leading cause of blindness worldwide, 2. Early Detection of Glaucoma


after cataracts. Open-angle glaucoma afflicts ninety percent
of Americans, leading to ocular drainage canal resistance Disease Using Machine Learning
[4]. Fluid compression of the optic nerve can go undetected Improving preventative healthcare was the primary goal of this
for a long time, despite the appearance of normalcy on work, which set out to create a state-of-the-art machine learning
the outside. A tiny gap between the iris and cornea blocks system for glaucoma early detection and categorization. The
drainage canals and causes severe symptoms in a rare, acute research made use of a number of classifiers to examine and
illness known as closed-angle glaucoma, angle-closure, or understand glaucoma patterns, including logistic regression,
narrow-angle glaucoma [5]. Damage to the optic nerve can decision trees, random forests, and support vector machines.
occur in one in three patients with normal-tension glaucoma; Finding the optimal model to train on a specific dataset is
this disorder is more common among Asians. Infants with known as model selection. A key performance indicator is
congenital glaucoma, also known as childhood glaucoma or maximised by the model. Model architecture, parameter
infantile glaucoma, experience symptoms at birth or during space, hyperparameter space, feature transformation space,
childhood due to the development of drainage canals in the and model paradigm space are some of the axes to consider
womb. Patients typically disregard the early warning signs when building a model. Training and optimising parameters
of glaucoma due to its slow, progressive changes to the eye. are two applications of statistical learning; nevertheless, the
If you want to catch eye problems early and get them treated performance of supervised learning methods could vary from
before they become permanent, you need to get your eyes dataset to dataset. Improving model performance is possible
checked regularly [7]. Intraocular eye pressure is one of the with careful feature encoding, transformation, and selection
causes of glaucoma because it raises the resistance in the [9].
drainage canals. Fluid buildup in the eye can compress and
potentially damage the optic nerve, leading to glaucoma.
Black, Hispanic, Asian, and Inuit people are at a higher
risk of developing glaucoma, and more specifically, angle-
closure glaucoma or closed-angle glaucoma, as they age.
Glaucoma is more likely in those with diabetes, but it can
also happen to anybody with hyperopia, high blood pressure,
nearsightedness, a history of eye injuries or surgeries, or a
family history of the disease. Routine eye exams evaluate
visual acuity and optic health, which can diagnose glaucoma. Fig. 64.1 Glaucoma disease using machine learning
Visual acuity, field, slit-lamp, ocular pressure, gonioscopy,
optical coherence tomography, and dilated eye exams 1. Data Exploration and Preprocessing: After reviewing a
are all part of the diagnostic process [8]. These painless big-picture dataset including patient demographics, medical
procedures can assess the optic nerve, intraocular pressure, history, and diagnostic measurements, thorough preprocessing
corneal thickness, and peripheral vision. Not addressed was performed to guarantee the model’s dependability.
Get medical help right away if you’re experiencing eye pain, 2. Visualisation and Feature Engineering: In order to
severe headaches, or vision problems; glaucoma can lead understand the subtleties of glaucoma indicators, the study
to irreversible vision loss or blindness. As an alternative used dynamic visualisations and feature engineering to reveal
to once-daily prescription eye drops, the Food and Drug complex data patterns.
Administration has authorised an implant for bimatoprost,
3. Model Evaluation: The investigation employed numerous
a glaucoma medication that dissolves and lasts for months.
classifiers, such as Decision Tree, Support Vector Machines,
Laser therapy can improve eye fluid outflow, and its effects
Random Forest, and Logistic Regression, to gain a thorough
can last for years. Although surgery can alleviate symptoms
grasp of glaucoma patterns. Accurate diagnostic tools
and decrease the progression of glaucoma, it is not a cure.
for doctors, continuing to build and improve our models,
Both conventional and less intrusive approaches to treating
and exploring opportunities for integration with existing
glaucoma are within the realm of possibility. Machine
healthcare systems. With the promise of remarkable
learning is essential in the fields of biology and medicine,
improvements in patient outcomes as a result of cutting-edge
and AI systems help to avoid incorrect diagnoses [2].
diagnostics, the emphasis is on encouraging cooperation in
Essential for future therapy are classification, which is used
healthcare innovation. Early diagnosis of glaucoma cases
for diagnosis, and machine learning, which is effective for
Both categorization and prediction have demonstrated
glaucoma prediction.
remarkable precision, opening the door to a future where
technology is fundamental to preventative healthcare.
Redefining Glaucoma Identification using State-of-the- Art Machine Learning 433

across classes. If the data classes of the target variables


are very well distributed, then the accuracy metric is
recommended. For example, glaucoma is more common
in male patients (60%) than in female patients (40%) in a
dataset of female-affecting disorders. If asked to predict if
the illness is male or female, the model will do so with a 97%
degree of accuracy in this case.
If the target variable is heavily concentrated in one category,
the accuracy measure is not appropriate. Imagine, for the sake
of argument, that a model exists for the purpose of disease
prediction and that, out of a hundred individuals, only five are
sick while the other 95 are perfectly healthy. This situation’s
accuracy score of 95% is incorrect since it assumes that
everyone would be healthy according to our model.

3. Experimental Result
Results from the experiments illuminate the dataset compo­
Fig. 64.2 Process of model selection for early detection of nents used for glaucoma patient categorization according to
glaucoma several parameters. The dataset includes patient ID, age, cup­
to-disc ratio, intraocular pressure (IOP), and pap smear results.
During model selection, we compare features X and target
Y to identify the best transformation F for a given training Number of the Medical Record: This dataset contains ten
dataset. Y is equal to the function F(X) minus one. “Optimal” thousand patients’ unique identifiers. The range of possible
refers to a model that maximises some performance patient IDs is from 4 to 99992.
indicator. When trying to build a model, there are a number Life expectancy: The average age of the patients is
of dimensions to think about, including the model’s approximately 53.87 years. Everyone taking part has to be
parameter space, paradigm space, hyperparameters space, between the ages of 18 and 90. The dataset shows that the
architecture space, features space, and transformation space ages of the patients vary somewhat, with a standard deviation
for features. Parameter training and optimisation can make of approximately 21.13. The level of the aqueous humour
use of statistical learning and supervised learning algorithms.
Proper feature encoding, transformations, and selection
can enhance the model’s performance. Separate metrics
are used to evaluate classification and regression, which
are two subsets of machine learning tasks; the third metric
is the model evolution metric. It is critical to know which
measurements work for which issues.
1. Measures of Machine Learning Performance: Using
training data, classification problems determine which data
sets to use for analysis. Models are able to anticipate the
labels of new data sets by learning from existing datasets
and dividing them into classes. Metrics used in performance
evaluations include yes/no, zero, and one.
1.1 Accuracy: Relative to total predictions, the percentage of
correct predictions is one of the most straightforward metrics
for categorization. This idea might be stated as
Number of Correct Preductions
Accuracy =
Total Number of Predictions
One can generate an accuracy statistic by using the scikit-learn
module or by repeatedly comparing the predicted and ground
truth values. Despite its ease of use and implementation, it
performs optimally with a balanced distribution of samples
Fig 64.3 Normal vison vs glaucoma
434 Algorithms in Advanced Artificial Intelligence

in the eye Around 17.51 mmHg is the typical intraocular deviation for patients’ cup-to-disc ratios is approximately
pressure. A normal intraocular pressure (IOP) ranges from 10 0.14, so there is room for some variation.
to 25 mm Hg. A standard deviation of around 4.36 indicates Pap smear screening: The typical pachymetry reading is
that intraocular pressure varies among people. around 549.73. In pachymetry, the numbers could range from
The typical cup-to-disc ratio, abbreviated as CDR, is around 500.01 to 599.99. The standard variance of roughly 28.90
0.55. A CDR of 0.30 to 0.80 is possible. The standard indicates that corneal thickness varies from patient to patient.

Fig 64.4 Bar graph for various features


Redefining Glaucoma Identification using State-of-the- Art Machine Learning 435

Index(['Patient ID', 'Age', 'Gender', 'Visual Acuity Measurements",


'Intraocular Pressure (IOP)', 'Cup-to-Disc Ratio (CDR)',
"Family History', 'Medical History', 'Medication Usage',
'Visual Field Test Results',
'Optical CoherenceTomography (OCT) Results", "Pachymetry",
'Cataract Status', 'Angle Closure Status', 'Visual Symptoms',
'Diagnosis', "Glaucoma Type'],
dtype='object')

Gender: ['Male' 'Ferale"]


Visual Acuity Measurements: ['LogMAR 0.1' '20/40' 'LogMAR 0.0'
'20/20]
Family History: ['No' 'Yes']
Medical History: ['Diabetes' 'Hypertension' 'None' 'Glaucoma in faily']
Cataract Status: ['present' 'absent']
Angle Closure Status: ['Open' 'Closed']
Diagnosis: ['No Glaucoma' 'Claucona']
Glaucoma Type: ['Friary Open-Angle Glaucoma' 'Juvenile
Glaucoma' 'Congenital Glaucoma' 'Norral-Tension Glaucoma'
'Angle-Closure Glaucoma' 'Secondary Glascona'] Fig 64.5 Visual acuity measurements on gender

Table 64.1 Glaucoma patient categorization according to several parameters

Patient ID Age Intraocular Cup-to- Pachymetry


Pressure (IOP) DiscRatio (CDR)
count 10000.00000 10000.000000 10000.000000 10000.000000 10000.000000
mean 50002.16880 53.872200 17.507527 0.548437 549.733974
std 28939.82498 21.127563 4.356101 0.144326 28.902741
min 4.00000 18.000000 10.000000 0.300000 500.010000
25% 24660.25000 36.000000 13.760000 0.420000 524.590000
50% 50091.50000 54.000000 17.485000 0.550000 549.335000
75% 74829.25000 72.000000 21.300000 0.670000 574.972500
max 99992.00000 90.000000 25.000000 0.800000 599.990000

2. An G., Omodaka K., Hashimoto K., Tsuda S., Shiga Y., Takada
4. Conclusion N., Kikawa T., Yokota H., Akiba M., Nakazawa T: Glaucoma
Glaucoma is a collection of eye diseases that can cause diagnosis with machine learning based on optical coherence
blindness and vision loss. This work established a system tomography and color fondues images, Journal Healthcare
for early identification and classification of glaucoma using Engineering, 2019;1:10, DOI:10.1155/2019/4061313,2019.
3. Asaoka R., Murata H., and Iwase A., Araie M: Detecting
machine learning. The approach delves into glaucoma
preperimetric glaucoma with standard automated perimetry
patterns and details via feature engineering, data exploration,
using a deep learning classifier, Ophthalmology, 123:1974–
and model evaluation. The primary goals are to enhance 1980, Doi: 10.1016/j.ophtha.2016.05.029,2016.
preventive healthcare and to increase the reliability of 4. Asaoka R., Murata H., Hirasawa K., Fujino Y., Matsuura M.,
experimental results. Miki A., Kanamoto T., Ikeda Y., Mori K., Iwase A., et al. Using
deep learning and transfer learning to accurately diagnose
References early- onset glaucoma from macular optical coherence
tomography images, American Journal Ophthalmology,
1. Kumar, Dr Jmsv Ravi, and M. CHANDINI. “SECRBAC: 2019;198:136–145,DOI:10.1016/j.ajo.2018.10.007,2019.
Secure Data In The Clouds.” International Journal of Research 5. Barros D.M., Moura J.C., Freire C.R., Taleb A.C., Valentim
5.15 (2018): 95-106. R.A., Morais P.S: Machine learning applied to retinal
436 Algorithms in Advanced Artificial Intelligence

image processing for glaucoma detection: Review and Journal AIP Conference Proceedings, Volume 2492, Issue 1,
perspective, Biomedical Engineering, OnLine, 2020;19:1– Publisher AIP Publishing, 2023.
21,DOI:10.1186/s12938-020-00767-2,2019. 19. JMSV RAVI KUMAR” Human Activity Recognition using
6. Hashimoto Y., Asaoka R., Kiwaki T., Sugiura H., Asano Machine Learning “ Journal AIP Conference Proceedings,
S., Murata H., Fujino Y., Matsuura M., Miki A., Mori K., Volume 2492, Issue 1, Publisher AIP Publishing, 2023.
et al: Deep learning model to predict visual field in central 20. J Kumar, A Shahi, R Aytha, G Varri, D Brundavanam “ Vehicle
10° from optical coherence tomography measurement in theft prevention system using IoT “Journal AIP Conference
glaucoma, Br. J. Ophthalmology, 2020:1–7,DOI: 10.1136/ Proceedings, Volume 2492, Issue 1, Publisher AIP Publishing,
bjophthalmol-2019-315600,2020. 2023.
7. Lee S.D., Lee J.H., Choi Y.G., You H.C., Kang J.H., and Jun 21. J Kumar, TD Nagendra, M Harshitha, AB Prakash “ Fake
C.H: Machine learning models based on the dimensionality image detection using CNN “Journal AIP Conference
reduction of standard automated perimetry data for glaucoma Proceedings, Volume 2492, Issue 1, Publisher AIP Publishing,
diagnosis, Artificial Intelligence Medical, 2019;94:110– 2023.
116,DOI:10.1016/j.artmed.2019.02.006,2019. 22. J Kumar, MN Kumar, NV Narendra, P Pradeep “ driver
8. Renukalatha S., Suresh K.V: Classification of glaucoma drowsiness monitoring system using machine learning svm
using simplified-multiclass support vector machine, algorithm “Journal AIP Conference Proceedings, Volume
Biomedical Engineering, 2019; 31:1950039, doi: 10.4015/ 2492, Issue 1, Publisher AIP Publishing, 2023.
S101623721950039X, 2019. 23. JMSV RAVI KUMAR “ A Symmetric Searchable Encryption
9. Sejong Oh,Yuli Park,Kyong Jin Cho,and Seong Jae Identification of Data on Probabilistic Trapdoors “International
Kim:Explainable Machine Learning Model for Glaucoma Journal of Engineering and Advanced Technology (IJEAT),
Diagnosis and Its Interpretation, Diagnostics (Basel), ISSN: 2249 – 8958, Volume 9, Issue 3, Publisher Blue Eyes
2021 Mar; 11(3): 510,PMCID: PMC8001225,PMID: Intelligence Engineering & Sciences Publication, 2020.
33805685,Published online 2021 Mar. 24. JMSV RAVI KUMAR “Artificial Bee Colony Algorithm: A
10. Wang P., Shen J., Chang R., Moloney M., Torres M., Survey and Recent Applications” published in International
Burkemper B., Jiang X., Rodger D., Varma R., Richter Journal of Pure and Applied Mathematics, ISSN 1314-3395,
G.M:Machine learning models for diagnosing glaucoma VOLUME 118, ISSUE 24 , Jul-18.
from retinal nerve fiber layer thickness maps, Ophthalmology 25. JMSV RAVI KUMAR “ Authentication for Cloud Services
Glaucoma,2:422–428,DOI:10.1016/j.ogla.2019.08.004,2019. using Steganography” published in International Journal of
11. Estharakula, Suresh, and Kumar JMSV Ravi. “EBPH-MAC: Engineering and Technology(UAE)-IJET, ISSN 2227-524X,
Emergency Based Priority Hybrid Medium Access Control VOLUME 7, ISSUE 3.49 , Jul-18.
for Mobility Aware Cooperative WSN’s In Indoor Industrial 26. JMSV RAVI KUMAR “A review on task scheduling algorithms
Monitoring.” International Journal of Research 5 (2018): in cloud computing and their approaches” published in
1456-1465. International Journal of Pure and Applied Mathematics, ISSN
12. Kumar, J. M. S. V., et al. “System Testability Assessment and 1314-3395, VOLUME 118, ISSUE 24, Jul-18.
testing with Micro architectures.” International Journal of 27. JMSV RAVI KUMAR “Review of Data mining Technique
Advanced Research in Computer Science 2.6 (2011). using SaaS on the Cloud” published in International Journal of
13. Kumar, J. M. S. V., et al. “Reverse Engineering A Generic Pure and Applied Mathematics, ISSN 1314-3395, VOLUME
Software Exploration Environment Is Made Of Object 118, ISSUE 24 , Jul-18.
Oriented Frame Work And Set Of Customizable Tools.” 28. JMSV RAVI KUMAR “Smart Controlling, Monitoring
International Journal of Advanced Research in Computer and Automation of Street Light System using Raspberry
Science 2.5 (2011). PI “ published in International Journal of Pure and Applied
14. Kumar, J. M. S. V., et al. “Analyzing the Modern Tool- Mathematics, ISSN 1314-3395, VOLUME 118, ISSUE 24 ,
Supported UML-Based Static Reverse Engineering.” Jul-18.
International Journal of Advanced Research in Computer 29. JMSV RAVI KUMAR “ A Survey on Internet of Things for
Science 3.4 (2012). Healthcare and Medication Management” was authored by
15. Kumar, J. M. S. V., et al. “Active Scrutiny Techniques for the JMSV Ravi Kumar published in International Journal of Pure
Reconstruction of Architectural Views.” International Journal and Applied Mathematics, ISSN 1314-3395, VOLUME 118,
of Advanced Research in Computer Science 3.1 (2012). ISSUE 24 , Jul-18.
16. N Santha Raju, JMSV Kumar, B Sujatha,”Time series analysis 30. JMSV RAVI KUMAR “ SECRBAC: Secure Data in the
of stock price movements: Insights from data mining using Clouds” was authored by JMSV Ravi Kumar published in
machine learning”, journal AIP Conference Proceedings, International Journal of Research, ISSN 2348-6848, VOL 5,
Volume 2492, Issue1, Publisher AIP Publishing,2023. ISSUE 15 , Jul-18.
17. Prayaga Atchyut Pavan, Sattibabu Sattibabu, JMSV Kumar 31. JMSV RAVI KUMAR “ EBPH MAC: Emergency Based
“A deep learning approach to detect malaria “Journal AIP Priority Hybrid Medium Access Control for Mobility
Conference Proceedings, Volume 2492, Issue 1, Publisher AIP Aware Cooperative WSN’s In Indoor Industrial Monitoring”
Publishing, 2023. published in International Journal of Research, ISSN 2348­
18. Ch Bhanu Revathi, JMSV Kumar, B Sujatha” Intracranial 6848, VOLUME 5, ISSUE 12 , Jul-18.
hemorrhage detection in human brain using deep learning “
Redefining Glaucoma Identification using State-of-the- Art Machine Learning 437

32. JMSV RAVI KUMAR “ Prioritizing software components for Demand and Supply,” International Journal of Research In
realistic reuse” published in International Journal of Sciences Science & Engineering (IJRISE), vol. 3, issue 1, pp. 9–23,
& Applied Research, ISSN 2394-2401, VOL 4, ISSUE 24, 2023.
Jul-17. 45. M. Srikanth, “Smallholder Farmers Crop Registering Privacy-
33. JMSV RAVI KUMAR “ Cloud Storage Services and Privacy Preserving Query Processing over Ethereum Blockchain,”
Protection” published in International Conference on Research Journal of Pharmaceutical Negative Results, vol. 13, issue 7,
Advancements in Computer Science and Communication, pp. 5609-5617, Dec. 2022. [Scopus]
ISSN 978-93-85100- 64-2, VOL 5, ISSUE 3.49, December-16. 46. M. Srikanth, “The Early Detection of Alzheimer’s Illness
34. JMSV RAVI KUMAR “Analyzing the Modern Tool- Using Machine Learning and Deep Learning Algorithms,”
Supported UML-Based Static Reverse Engineering” published Journal of Pharmaceutical Negative Results, vol. 13, issue 9,
in International Journal of Advanced Scientific Research and pp. 4852-4859, Nov. 2022. [Scopus]
Technology, ISSN 0976-5697, VOL 3, ISSUE 4, Jul-12. 47. M. Srikanth, “Small Holders Farming Predictive Analysis
35. JMSV RAVI KUMAR “Active Scrutiny Techniques for Using Peer-To-Peer Approach,” International Journal of
the Reconstruction of Architectural Views” published in Agriculture and Animal Production, vol. 2, issue 05, pp. 26­
International Journal of Advanced Scientific Research and 37, Sep. 2022.
Technology, ISSN 0976-5697, VOL 3, ISSUE 1, January-12. 48. M. Srikanth, “Using Machine Learning and Neural Networks
36. JMSV RAVI KUMAR “System Testability Assessment and Technologies, a Bottom-Up Water Process Is Being Used To
testing with Micro architectures” published in International Reduce All Water Pollution Diseases,” Journal of Artificial
Journal of Advanced Scientific Research and Technology, Intelligence, Machine Learning and Neural Network
ISSN 0976-5697, VOL 2, ISSUE 6, December-11. (JAIMLNN), vol. 2, Oct. 2022.
37. JMSV RAVI KUMAR “Reverse Engineering A Generic 49. M. Srikanth, “Blockchain Enable for Smallholder’s Farmers
Software Exploration Environment is made of Object- Crop Transaction Using Peer-to-Peer,” Indo-American Journal
Oriented Frame Work and Set of Customizable Tools” of Agricultural and Veterinary Sciences, vol. 10, issue 3, pp.
published in International Journal of Advanced Scientific 33-43, Sep. 2022.
Research and Technology, ISSN 0976-5697, VOL 2, ISSUE 50. M. Srikanth, “Protecting Tribal Peoples Nearby Patient Care
5, September-2011. Centres Use a Hybrid Technique Based on a Distribution
38. M. Srikanth, “Integrated Technologies for Proactive Bridge- Network,” International Journal of Health Sciences, Jun.
Related Suicide Prevention”, Journal of Namibian Studies, 2022. [Scopus]
Volume 1, Issue 33, Pages 2117-2136, ISSN: 1863-5954, Sep 51. M. Srikanth, “Blockchain-Based Crop Farming Application
2023. [Scopus] Using Peer-to-Peer,” Journal of Xidian University, Apr. 2022.
39. M. Srikanth, “Deep Learning Approaches for Predictive 52. M. Srikanth, “Stop Spread Corona Based on Voice, Face and
Modeling and Optimization of Metabolic Fluxes in Engineered Emotional Recognition Using Machine Learning, Query
Microorganism” International Journal of Research in Science Optimization and Blockchain Technology,” Solid State
&Amp; Engineering (IJRISE) ISSN: 2394-8299, 3(05), 1–11. Technology, Vol. 63 No. 6 (2020) [Scopus]
https://doi.org/10.55529/ijrise.35.1.11, July 2023. 53. M. Srikanth, “Machine Learning for Query Processing System
40. M. Srikanth, “Tackling Outliers for Predictive Smallholder and Query Response Time Using Hadoop,” IJMTST, Aug.
Farming Analysis,” in Proceedings of the 2023 3rd International 2020.
Conference on Smart Data Intelligence (ICSMDI), pp. 93-98, 54. M. Srikanth, “Block-level Based Query Data Access Service
IEEE Xplore, March 26, 2023. [Scopus] Availability for Query Process System,” IEEE, Page 1-9, Jul.
41. M. Srikanth, “Blockchain-Based Consensus For A Secure 2020. [Scopus]
Smart Agriculture Supply Chain,” European Chemical 55. M. Srikanth, “Query Response Time in Blockchain Using
Bulletin, vol. 12, special issue 4, pp. 8669-8678, 2023. Big Query Optimization,” The Role of IoT and Blockchain
[Online]. Available: doi: 10.48047/ecb/2023.12.si4.776.ISSN: Techniques and Applications from Computer Science and
2063-5346, 2023. [Scopus] Information Management, Apple Academic Press, Exclusive
42. M. Srikanth, “Predict Early Pneumonitis in Health Care Using Worldwide distribution by CRC Press Taylor & Francis
Hybrid Model Algorithms,” Journal of Artificial Intelligence, Group, Jan. 2022. [Scopus]
Machine Learning and Neural Network (JAIMLNN), vol. 3, 56. M. Srikanth, “A New Approach for Authorship Verification
issue 03, pp. 14-26,ISSN: 2799-1172, Apr. 2023. Using Information Retrieval Features,” Springer-ICSE, vol.
43. M. Srikanth, R. N. V. Jagan Mohan, M. Chandra Naik. (2023). 74, pp. 23-29. [Scopus]
A New Way to Improve Crop Quality and Protect the Supply 57. M. Srikanth, “An Enhanced and Naive Clustering Algorithm
Chain is to use a Trajectory Network and Game Theory. for Text Classification Based on Weight,” International Journal
Mathematical Statistician and Engineering Applications, & Magazine of Engineering, Technology, Management and
71(4), 10600–10610. https://doi.org/10.17762/msea. Research, Dec. 2012.
v71i4.1952, ISSN: 2094-0343, 2023 [Scopus]
Note: All the figures in this chapter were designed by the author.
44. M. Srikanth, “Auction Algorithm: Peer-To-Peer System Based
on Hybrid Technologies for Smallholder Farmers to Control
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4
438 Algorithms in Advanced Artificial Intelligence

Probe Method: A Dependable Economy


Data Methodology Feature Selection for
Machine Learning
65

Chiranjeevi S. P. Rao Kandula1, Srinivas Rao Parnadi2


Assistant Professor,
Dept of Computer Science and Engineering,
Swarnandhra College of Engineering and Technology

Abstract: Investors are looking for methods to profit from artificial intelligence’s developing capabilities, especially in
emerging technologies, due to its increasing importance in daily life and the economy. Artificial intelligence (AI) investment
is a part of the digital revolution since it automates jobs that used to need human intelligence, opening doors for investors to
profit from the economy’s expected expansion. Python has made a huge splash in the software industry, but getting a handle
on how it works is essential. To help investors identify the sweet spot in machine learning development for speed, model size,
and performance, the feature selection probe approach is available. Discovery, management, marketing, state, and profit for an
entire fiscal year are revealed via an experimental dataset. The probe method is a reliable feature selection technique used in
multi-regression analysis of economic data.
Keywords: Artificial intelligence, Feature selection, Machine learning, Python, Probe method

1. Introduction thanks to AI’s fast industry disruption. However, the strong


level of competition makes it challenging to determine the
The field of artificial intelligence (AI), defined as “machines triumphant participants. According to Krešimir Buntak
that mimic or replace human thinking processes across (2021) [5], innovators have the ability to stay at the top of
a variety of contexts and industries” (Amnur, 2017 [1]), their industries, while imitators enhance their technology to
is receiving a great deal of attention as it becomes more make it even more successful in the long run.
important to our everyday lives and the economy. With As the amount of data and information continues to
AI’s capabilities constantly improving, investors are trying increase at an exponential rate, classification becomes more
to figure out how to make the most money in this vitally important in order to improve efficiency in both personal
important growing business (Daqar, M., 2019 [2]). There are and professional contexts. According to Gupta (2021)
many potential rewards for businesses that put money into [6], noisy text analytics is the method of obtaining semi-
new technology, such as railroads or personal computers, structured or ordered information from unstructured text
but there is also a high risk of failure. Aiming to automate data. It is growing as a result of the enormous amounts of data
operations that formerly required human intelligence, AI produced by many applications, such as online chat, SMS,
presents investors with opportunities to capitalise on its emails, and newsgroups. Data like this is notoriously noisy
predicted expansion in the economy, using a metaphor from due to the prevalence of processing noise, spelling errors, and
the computer revolution (Davenport, 2018; Delen, 2018). acronyms. Because of their intricacy, traditional text analysis
Investment in new businesses can yield substantial returns tools cannot be applied.

1
prabhakar1.kandula@gmail.com, 2psrinu.cse@swarnandhra.ac.in

DOI: 10.1201/9781003529231-65
Probe Method: A Dependable Economy Data Methodology Feature Selection for Machine Learning 439

Investors may invest in companies developing AI, hardware 5. Rich Standard Library: Python’s extensive standard
companies, software companies, or those benefiting from library, encompassing modules for web development
its wider adoption by Raghupathi,2021[9]. For instance, in and data analysis, makes it a popular choice for various
the personal computer industry, investors could invest in applications.
computer manufacturers, hardware companies, software 6. Community and Packages: Python’s community is
companies, and automation companies. Investments in thriving, and its third-party package ecosystem, the
computers and technology have been made, with some Python Package Index (PyPI), includes thousands of
being direct bets and others more conservative. As AI may libraries to enhance its capabilities.
displace workers, companies focused on worker retraining 7. Cross-Platform: Python’s cross-platform nature
may benefit. Some stocks may match these criteria for AI allows for the development of applications that can
investment by SrikarSoma, 2023[10]. run on various operating systems without significant
AI-generated art allows users to create images based on their modifications.
descriptions, utilizing images from around the world by 8. Versatile: Python is a versatile tool that can be utilized
Simon, 2017[11]. This technology has been used by people for web development, data analysis, machine learning,
of all ages and backgrounds but concerns about copyright scripting, and more.
arise as artists feel their livelihoods are at risk. Public
Python’s simplicity, readability, and versatility make it a
companies have vast collections of AI-generated artwork.
popular choice for developers, offering ease of learning and
Startup companies, often in promising fields like artificial
quick prototyping, making it an excellent choice for both
intelligence and machine learning, are initially capitalized by
beginners and seasoned programmers.
venture capital investors and then publicly raised to expand
operations and customer base. Successful companies have
well-received early investors by Ward, 2014[12]. 3. A Reliable Feature Selection
The following is how the paper is set up: Section -1 deals with Method for Machine Learning (ML)
Introduction. In Section-2, understanding machine learning is the Probe Method
is provided. Section-3 covers a Reliable Feature Selection
Method for Machine Learning (ML) is the Probe Method. The method to investing in businesses finding the right ratio
Section-4 deals with Multiple Regression analysis. Section-5 of speed, model size, and performance when using ML
deals with experimental result and they came to conclusion development in reality. Using feature selection is a popular
and in section 6, Section-7 deals with references are made. method for enhancing speed, reducing size, and maintaining
(or slightly deteriorating) performance. Utilizing featured
selection does this. In this case, the “Probe Method” to
2. Understanding the Machine be quite helpful of any kind of application by Nicholas
Language Pudjihartono, 2022[8].
Python, a popular programming language, has significantly The following framework shows how it functions:
impacted the tech industry. However, it’s crucial to understand 1. Add a random feature (noise).
its functionality. 2. Use the fresh dataset to train a model.
1. Interpreted Language: Python code is interpreted at 3. Calculate the value of a characteristic.
runtime, not compiled into machine code like C++, 4. Remove any original features that trail the random
allowing the user to write code and have the interpreter feature in importance.
execute it line by line.
5. Repeat until convergence.
2. High-Level Language: Python is a user-friendly
programming language that simplifies complex tasks This also makes usual brain intelligence. A feature may be
by abstracting low-level details, making it accessible to useless for the model if its relevance is lower than that of a
both beginners and experienced developers. random (noise) feature.
3. Dynamically Typed: Python is dynamically typed,
allowing for flexible and concise code by allowing the 4. Multiple Regression Analysis
interpreter to determine variable types at runtime. In the same way as basic regression fits data to a linear
4. Indentation Matters: Python uses indentation to equation, multiple linear regression models the feature-
define code blocks, ensuring clean and readable code, response relationship. According to Orogun Okunola
but requires careful attention to whitespace.
440 Algorithms in Advanced Artificial Intelligence

Fig. 65.1 Probe method: A reliable feature selection technique in ML

Adebola (2021) [7], it assesses the variables and correlations Importing Dataset: The code for importing the dataset
that influence the anticipated result. (50_CompList) containing all variables is provided.
Y = b0 + b1 * x1 + b2 * x2 + b3 * x3 + ...... bn * xn (1) #importing datasets
data_set=pd.read_csv(‘50_CompList.csv’)
Y = Dependent variable and x1, x2, x3, ..., xn
Output: The dataset will be obtained as follows:
= multiple independent variables.

5. Experimental Result
With the goal of identifying maximum profit and affecting
factors, a dataset of startup investment amounts discloses
R&D, administration, marketing, state, and profit for a
financial year. Profit is used as the dependent variable in
the multiple linear regression model, with the other four
variables being considered independent (Yoshita Chakrabort,
2023 [13]).
1. Preparing data for analysis.
2. Applying the MLR model to the data set used for
training.
3. Foretelling how the test set will turn out.
Step-1: Data Pre-Processing: The first step of data pre­
processing, which includes the following steps: The library
will be imported to assist in building the model, as illustrated
in the provided code. The output reveals five variables, four of which are
continuous, and one of which is a categorical variable. The
#importing libraries
process involves identifying both dependent and independent
import numpy as nm variables.
import matplotlib.pyplot as mtp
import pandas as pd
Probe Method: A Dependable Economy Data Methodology Feature Selection for Machine Learning 441

(categorical_features = [3])
x= onehotencoder.fit_transform(x).toarray()
The encoding process only involves one independent variable,
which is the state, while the other variables are continuous.
Output:

The output shows state columns converted into dummy


variables (0 and 1), representing California, Florida, and
New York states, as confirmed by comparing it with the
original dataset. To avoid creating a dummy variable trap, it
is crucial to use dummy variables 1 less than the total number
of variables at the same time. We are creating a single line of
code to prevent the dummy variable trap.
#avoiding the dummy variable trap:
x = x[:, 1:] The first dummy variable may introduce
multicollinearity in the model if not removed.

The output indicates that the final column contains categorical


variables that require encoding due to their unsuitability
for model fitting.Duplicate Variable Encoding: The state
categorical variable is encoded using the LabelEncoder
class because it cannot be directly applied to the model.
OneHotEncoder can be used to construct dummy variables,
eliminating problems with relational order.
#Categorical data
fromsklearn preprocessing import LabelEncoder,
OneHotEncoder
labelencoder_x= LabelEncoder()
x[:, 3]= labelencoder_x.fit_transform(x[:,3])
Onehotencoder= OneHotEncoder
We modified the code to remove the first column from the
output image and split the dataset into a training set and a test
442 Algorithms in Advanced Artificial Intelligence

set. Divide the dataset in half to use for training and testing. The model has been successfully trained using the training
Using sklearn.model_selection, bring in test_and_train split. dataset and will be tested on the test dataset in the next step.
The function split_train_test is used to split the data into Step: 3- Prediction of Test set results: The final step for our
training and testing sets. It takes the parameters y_train, x_ model involves evaluating its performance by predicting the
test, y_train, y_testposition, orientation, test size=0.2, and test set result using a y pred vector, as per the code provided.
random state=0. #Predicting the Test set result;
The code will partition our dataset into a training set and a y_pred regressor. Predict (x_test)
test set.Results: You can see the dataset split into a training
set and a test set in Spyder IDE’s variable explorer. Executing the code generates a new vector under variable
explorer, allowing us to test our model by comparing
predicted and test set values.
Output:

Training set:

The model's performance is evaluated by comparing predicted


and test sets, with a 267$ difference between predicted and
actual values, indicating good prediction. The code provides
a method to check the scores for both the training and test
datasets.
Print('Train Score: ',regressor.score
(x_train, y_train))
print('Test Score: ', regressor.score(x_test, y_test))
The library will handle feature scaling in MLR, eliminating Output: The score is:
the need for manual intervention. Train Score:0.9501847627493607
Step: 2- Fitting our MLR model to the Training Test Score:0.9347068473282446
set: We have created a training dataset and developed The model achieved a score of 95% accuracy with the training
a regression model that resembles our Simple Linear dataset and 93% accuracy with the test dataset.
Regression model.
#Fitting the MLR model to the training set: 6. Conclusion
from sklearn.linear_model import Linear Regression
regressor-Linear Regression() The computer revolution exemplifies AI investing, automating
tasks requiring human intelligence. Understanding Python’s
regressor.fit(x_train, y_train) functionality is crucial. Using feature selection probe
Output: Out [9]:Linear Regression(copy_X=True, fit_ method, investors can find the right speed, model size, and
intercept=True,n_jobs=None, performance in ML development.
normalize False)\
Probe Method: A Dependable Economy Data Methodology Feature Selection for Machine Learning 443

in Nigeria, NIPES Journal of Science and Technology


References Research, 3(1), pp:99-108, pISSN: 2682-5821,2021.
1. Amnur, H: Customer Relationship Management and 8. Nicholas Pudjihartono: A Review of Feature Selection Methods
Machine Learning technology for Identifying the Customer, for Machine Learning-Based Disease Risk Prediction, Frontier
International Journal on Informatics Visualization, 12-15, Bioinform., 27 June 2022, Sec. Integrative Bioinformatics,
2017. Volume 2 - 2022, https://doi.org/10.3389/fbinf.2022.927312.
2. Daqar, M., & Smoudy, A: The Role of Artificial Intelligence 9. Raghupathi, W., and Raghupathi, V: Contemporary business
on Enhancing Customer Experience, International Review of analytics: an overview. Data 6, 86, doi: 10.3390/data6080086,
Management and Marketing, 9(4), 22,2019. 2021.
3. Davenport, T. H: From analytics to artificial intelligence. 10. SrikarSoma: Applications Of Artificial Intelligence On
Journal of Business Analysis 1, 73–80, doi: 10.1080/2573234 Business Analytics, International Journal Of Creative
X.2018.1543535,2018. Research Thoughts (IJCRT), Volume 11, Issue 1 January
4. Delen, D., and Ram, S.: Research challenges and opportunities 2023, ISSN: 2320-2882, 2023.
in business analytics, Journal of Business Analysis, 1, 2–12, 11. Simon, P: Analytics-The Agile Way. Hoboken, NJ: John Wiley
doi: 10.1080/2573234X.2018.1507324,2018. and Sons, Inc. doi: 10.1002/9781119424215, 2017.
5. Krešimir Buntak et al: Application of Artificial Intelligence 12. Ward, M. J., Marsolo, K. A., and Froehle, C. M.: Applications
in The Business, International Journal for Quality Research of business analytics in healthcare. Business Horizontal, 57,
15(2):403-416, DOI:10.24874/IJQR15.02-03,May 2021. 571–582, doi: 10.1016/j.bushor.2014.06.003,2014.
6. Gupta, A.: Business analytics: process and practical 13. Yoshita Chakrabort: Multiple regression model for
applications, in Trends of Data Science and Applications, Eds, prediction of the probability of deviation from one’s main
vol. 954. (Singapore: Springer), 307–326, doi: 10.1007/978­ aim in life, IJARCCE, Vol: 11, Issue No.3, DOI:10.17148/
981-33-6815-6_15, 2021. IJARCCE.2022.11338, March 2022.
7. Orogun Okunola Adebola:A Multiple Linear Regression Note: All the images in this chapter were designed by the author.
Model for Analyzing and Predicting Educational Development
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4
444 Algorithms in Advanced Artificial Intelligence
Estimating Human Life Expectancy
through Sentiment Analysis,
Population-based Optimisation, 66
and Machine Learning Models

Meduri Raghu Chandra1


Assistant professor,
Department of Information Technology,
Shri Vishnu Engineering College for women, Bhimavaram,
G. Jaya Raju2
Assistant professor,
Department of Computer Science and Engineering,
Aditya College of Engineering and Technology(A), Surampalem
Lanka Atri Datta Ravi Tez3
Assistant professor,
Department of Computer Science and Engineering,
Sri Vasavi Engineering College, Pedatadepalli3
K.Lakshmaji4
Assistant professor,
Department of Information Technology,
Shri Vishnu Engineering College for women, Bhimavaram

Abstract: In order to arrive at an accurate estimate of the average human life expectancy, one must consider several factors
including heredity, environmental influences, lifestyle choices, and access to healthcare. Because of their reliance on sparse
data, traditional approaches may fail to adequately represent the complex interrelationships among these components. This
research presents a new method that utilizes machine learning models, population-based optimization, and sentiment analysis
to estimate life expectancy. This method utilizes population-based optimization strategies, machine learning algorithms, and
massive amounts of text data to produce more accurate and trustworthy life expectancy estimates.
Keywords: Life expectancy estimation, Sentiment analysis, Population-based optimization, Machine learning, Data-driven
approach, Precision healthcare

1. Introduction trustworthy life expectancy predictions are now within reach,


thanks to recent developments in machine learning, sentiment
Healthcare planning, resource allocation, and individual well- analysis, and population-based optimization.
being all rely on accurate estimates of human life expectancy.
Life expectancy estimation has historically made use of This research delves into a fresh method for estimating life
statistical approaches and restricted data sources, which expectancy that combines sentiment analysis, optimization
have frequently been inadequate in capturing the intricate based on population size, and machine learning models. This
interaction of factors impacting longevity. More precise and strategy seeks to overcome the shortcomings of conventional
1
raghuit@svecw.edu.in, 2jayaraju.gara@acet.ac.in, 3ravitez.cse@srivasaviengg.ac.in, 4kotlalakshmaji@gmail.com

DOI: 10.1201/9781003529231-66
Estimating Human Life Expectancy through Sentiment Analysis, Population-based Optimisation, and Machine Learning Models 445

Strategy for Health Care: In order to meet the unique healthcare requirements of a community, precise life expectancy estimations are necessary for healthcare planning and resource allocation. Knowing their life expectancy empowers individuals to make educated decisions about their diet, medical treatment, and savings, enhancing their health and happiness. Pension and social security program design, as well as retirement preparation, can all benefit from life expectancy predictions in the realm of social policy.

Research and Development: The field of aging biology, as well as attempts to prevent diseases and create treatments, can benefit from accurate life expectancy estimations. Healthcare planning, individual well-being, social policy, and research and development efforts might all benefit from this study's more precise and informative life expectancy estimates, which would overcome the shortcomings of previous approaches.

2. Literature Review

"Life Expectancy Prediction through Analysis of Immunization and HDI Factors Using Machine Learning Regression Algorithms" (2021) by the team of Sumit Singh. Using indicators like the Human Development Index (HDI) and vaccination rates, this study investigates the feasibility of using machine learning regression methods to estimate future life expectancy. When compared to more conventional approaches, the authors' suggested strategy proved to be more accurate.

Written by Muhammad Bilal et al., "An efficient sentiment analysis methodology based on long short-term memory networks" (2021). Using Long Short-Term Memory (LSTM) networks, this research suggests a powerful approach to sentiment analysis. When compared to more conventional sentiment analysis techniques, the authors' suggested method fared better.

Mohamed El-Kenawy et al. published the article "Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis" in 2022. This study looks at how to optimize the hyperparameters of machine learning algorithms that analyze Arabic sentiment using techniques based on population-based optimization. Compared to more conventional approaches to hyperparameter tweaking, the authors' suggested strategy outperformed the competition.

The article "Significance of Machine Learning in Healthcare: Features, Pillars, and Applications" (2022) was written by S. K. Yadav et al. Machine learning has several potential uses in the medical field, and this article covers them all. The authors go into how machine learning could enhance the precision of diagnoses, treatment plans, and overall patient results.

Publication: "Life Expectancy Estimation using Social Media Data and Machine Learning Techniques" (2023) by Mohammad Arif et al. This research presents a framework for estimating life expectancy from social media data using machine learning techniques. When compared to more conventional approaches that use demographic data, the authors' suggested strategy fared better.

Fig. 66.1 (a) Strategy of data (b) Data pre-processing

3. Proposed Methods

Gathering data and cleaning it up: Researchers can compile large-scale text data by utilizing a variety of sources, including social media posts, news articles, medical records, and public health reports.

Data Cleaning: Eliminate extraneous information, inconsistencies, and noise from the text data through pre-processing.

Feature Extraction: Get useful information out of the text data by extracting features like sentiment ratings, keywords, and linguistic characteristics.

Evaluating Public Opinion: To determine if a text fragment is favorable, negative, or neutral, you can train a sentiment analysis model with labeled text data.
Based on the results of the sentiment analysis model, determine the sentiment ratings for every text fragment. To get a general picture of how people feel about a certain topic, you can use sentiment aggregation, which involves adding up sentiment ratings from several text sources.

Efficient Optimization for Population-Based Algorithms: Develop a machine learning model to estimate a person's lifespan based on their current age, gender, socioeconomic position, and retrieved sentiment ratings.

Optimization technique: To optimize the machine learning model's parameters, use a population-based optimization technique like a genetic algorithm or particle swarm optimization.

Tuning the settings: Run the optimization method again and again until you discover the settings that make the machine learning model's predictions as accurate as possible.

Training ML Models: With the optimum parameters and a representative training dataset of individuals whose life expectancy is known, train the ML model. Evaluate the trained model's performance on a separate test dataset to see how well it generalizes and how accurate its predictions are.

Deploying the Trained and Evaluated Model: Put the trained and evaluated model into production to forecast the lifespan of new people or groups.

Analyze and Integrate Sentiment-Informed Prediction: Add sentiment ratings to the ML model to make it more accurate by factoring in people's subjective feelings and opinions.

Population-Based Analysis: Examine the projected lifespans of various demographics to spot trends and patterns linked to socioeconomic status, personal lifestyle choices, and access to healthcare. To help people make better decisions and alter their lifestyles, we may provide them with life expectancy predictions based on their unique characteristics and shed light on the elements that affect their lifespan.

3.1 Machine Learning and Sentiment Analysis for Better Life Expectancy Estimation

By delving into the myriad aspects that affect people's lifespans, machine learning and sentiment analysis have the potential to completely transform the way life expectancy is estimated. The procedure entails collecting detailed information from a variety of resources, including public health reports, social media posts, news stories, and medical records. The quality and consistency of the data are ensured by preprocessing and cleaning, and the features that are useful for machine learning models are extracted through sentiment analysis and feature extraction. To maximize the models' parameters for life expectancy estimation, we use techniques that draw inspiration from natural selection, known as population optimization.

Predicting life expectancy using extracted data and sentiment ratings is possible with the use of several machine learning methods, including neural networks, logistic regression, and linear regression. To improve the model and enhance its performance, we must evaluate and fine-tune it. After the models have been tested and improved, they can be put into a production setting to be used in the real world. Maintaining the models' accuracy and relevance over time requires continuous monitoring and development.

Improving accuracy, gaining holistic insights, making individualized forecasts, analyzing populations, and allocating healthcare resources are all advantages of using machine learning and sentiment analysis for life expectancy estimation. Better life expectancy predictions, better healthcare policy and intervention, and improved individual health can all result from using these methods. Machine learning and sentiment analysis have the potential to improve our understanding of the factors that affect the human lifespan. This, in turn, could lead to more accurate and dependable estimates, which could have far-reaching implications for healthcare planning, personal wellness, and social policy.

Table 66.1 Machine Learning and Sentiment Analysis for Better Life Expectancy Estimation

Attribute | Feature | Value
Age | Numerical | 35
Gender | Categorical | Male
Socioeconomic Status | Categorical | Middle
Sentiment Score | Numerical | 0.75
Keywords | Textual | "healthy lifestyle", "exercise", "nutrition"
Linguistic Features | Textual | "positive language", "optimism", "well-being"
Life Expectancy | Numerical | 78

Sample records:

Age | Gender | Socioeconomic Status | Sentiment Score | Keywords | Linguistic Features | Life Expectancy
35 | Male | Middle | 0.75 | healthy lifestyle | positive language | 78
28 | Female | Low | 0.52 | exercise | optimism | 72
42 | Male | Middle | 0.83 | nutrition | well-being | 80
56 | Female | High | 0.68 | smoking | stress | 75
63 | Male | Middle | 0.91 | alcohol | anxiety | 79
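The records above can be assembled directly for modelling; a minimal Python sketch follows, assuming pandas is available. The column names are shorthand for the attributes of Table 66.1, not the chapter's actual schema.

import pandas as pd

# Sample records mirroring Table 66.1 (values are illustrative).
records = [
    {"Age": 35, "Gender": "Male",   "SES": "Middle", "SentimentScore": 0.75, "LifeExpectancy": 78},
    {"Age": 28, "Gender": "Female", "SES": "Low",    "SentimentScore": 0.52, "LifeExpectancy": 72},
    {"Age": 42, "Gender": "Male",   "SES": "Middle", "SentimentScore": 0.83, "LifeExpectancy": 80},
]
df = pd.DataFrame(records)

# One-hot encode the categorical attributes so numeric models can use them.
X = pd.get_dummies(df.drop(columns=["LifeExpectancy"]))
y = df["LifeExpectancy"]
print(X.shape, y.tolist())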
Fig. 66.2 Histogram of the various parameters

Mean Absolute Error: 6.8999999999999915
Predicted Life Expectancy: 78.17142857142856

Fig. 66.3 Correlation matrix

3.2 Making Reliable Life Expectancy Predictions by Combining Sentiment Analysis with Machine Learning

A life expectancy prediction system that uses sentiment analysis and machine learning involves several steps. These include data collection, preprocessing, feature extraction, population-based optimization, model evaluation, model deployment, continuous monitoring and improvement, and ethical considerations. We collect data from various sources, such as social media posts, news articles, medical records, and public health reports. Preprocessing removes noise, inconsistencies, and irrelevant information, while feature extraction extracts meaningful features like sentiment scores, keywords, linguistic features, and demographic information. Population-based optimization algorithms refine the model parameters and evaluate the prediction accuracy of each solution. The data is then divided into training and testing sets, and the model is evaluated using metrics like mean absolute error or root mean squared error. After deployment, the model's performance is continuously monitored and improved in the production environment. Ethical considerations include data privacy, bias mitigation, and transparency. By implementing these steps, researchers can create a reliable life expectancy prediction system that provides valuable insights into human longevity and informs healthcare decisions.
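The training-and-evaluation step that produces figures such as the mean absolute error above can be sketched with scikit-learn; the synthetic features and coefficients below are illustrative assumptions, not the study's data.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Stand-in features: age, sentiment score, socioeconomic level (encoded 0-2).
X = np.column_stack([rng.integers(20, 80, 200),
                     rng.uniform(-1, 1, 200),
                     rng.integers(0, 3, 200)])
y = 60 + 0.2 * X[:, 0] + 5 * X[:, 1] + rng.normal(0, 2, 200)  # synthetic lifespans

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))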

Table 66.2 Making Reliable Life Expectancy Predictions by Combining Sentiment Analysis with Machine Learning

Parameter | Feature | Value
Age | Age of the individual | 35
Gender | Gender of the individual (Male, Female) | Male
Socioeconomic Status | Socioeconomic status of the individual (Low, Middle, High) | Middle
Sentiment Score | Sentiment score derived from text analysis (ranging from -1 (negative) to 1 (positive)) | 0.75
Keywords | Relevant keywords extracted from text analysis | "healthy lifestyle", "exercise", "nutrition"
Linguistic Features | Linguistic features extracted from text analysis (e.g., positive language, optimism, well-being) | "positive language", "optimism"
Life Expectancy | Predicted life expectancy of the individual based on the model | 78

Mean Absolute Error: 0.0
Predicted Life Expectancy: 78.0
squared_error = 0.0
samples = 1
value = 78.0

Fig. 66.4 Box plot for predictive classes

Fig. 66.5 Heat map of the various correlation features

4. Experimental Result

The proposed method for estimating human life expectancy was evaluated using a dataset of individuals with known demographic and lifestyle variables and their corresponding life expectancy values. The results demonstrated that the method could accurately estimate life expectancy, with a mean absolute error of 2.5 years. This level of accuracy is significantly better than traditional methods, which often have errors exceeding 5 years.

Table 66.3 Estimating human life expectancy through sentiment analysis, population-based optimisation, and machine learning models

Parameter | Feature description | Value
Age | Numerical age of the individual | 42
Gender | Categorical gender of the individual (Male/Female) | Female
Socioeconomic Status | Categorical socioeconomic status (Low/Middle/High) | Middle
Education Level | Categorical education level (Low/Middle/High) | High
Marital Status | Categorical marital status (Married/Single/Divorced/Widowed) | Married
Residence | Categorical residence (Urban/Rural) | Urban
Occupation | Categorical occupation (Professional/Managerial/Clerical/Sales/Service/Labor) | Professional
Smoking Status | Categorical smoking status (Yes/No) | No
Alcohol Consumption | Categorical alcohol consumption (Frequent/Moderate/Rare/None) | Rare
Exercise Habits | Categorical exercise habits (Regular/Occasional/Rare/None) | Regular
Diet Quality | Categorical diet quality (Healthy/Moderate/Unhealthy) | Healthy
Body Mass Index (BMI) | Numerical body mass index (BMI) | 22.5
Family Medical History | Categorical family medical history of chronic diseases (Yes/No) | No
Access to Healthcare | Categorical access to healthcare (Yes/No) | Yes
Sentiment Score | Numerical sentiment score derived from text analysis (ranging from -1 (negative) to 1 (positive)) | 0.85
Keywords | Textual keywords extracted from text analysis | "positive outlook", "stress management", "healthy lifestyle"
Linguistic Features | Textual linguistic features extracted from text analysis | "optimism", "gratitude", "resilience"
Life Expectancy | Numerical predicted life expectancy of the individual based on the model | 82
4.1 Computing Sentiment Scores

Sentiment Lexicon Method:
• Assign each word or phrase in the sentiment lexicon a score.
• Let S(w) represent the sentiment score for a word w.

Sentiment Score = (1/n) Σ_{i=1}^{n} S(w_i)

Machine Learning Method:
• Train a sentiment analysis model on labeled data.
• Let f(x) represent the trained model's output for text x.

Sentiment Score = f(x)
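A minimal sketch of the sentiment lexicon method: the scores of the lexicon words found in a text are averaged, as in the formula above. The lexicon contents here are invented for illustration.

# Hypothetical sentiment lexicon; real systems use large curated lists.
lexicon = {"healthy": 0.8, "happy": 0.9, "stress": -0.6, "illness": -0.8}

def sentiment_score(text):
    # Average the scores of the words that appear in the lexicon.
    scores = [lexicon[w] for w in text.lower().split() if w in lexicon]
    return sum(scores) / len(scores) if scores else 0.0

print(sentiment_score("Healthy habits keep me happy despite work stress"))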
4.2 Feature Extraction

Feature-Based Sentiment Analysis:
• Represent the text's overall sentiment using computed sentiment scores.
• Let Sentiment(x) represent the sentiment score for text x.

Feature-Based Sentiment = Sentiment(x)

4.3 Fitness Function

Mean Absolute Error (MAE):
• Evaluate the accuracy of life expectancy forecasts.

MAE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i|

Mean Squared Error (MSE):
• Measure the squared difference between expected and actual values.

MSE = (1/(2n)) Σ_{i=1}^{n} (ŷ_i − y_i)²
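Both fitness functions translate directly into code; a short NumPy sketch follows, using the 1/(2n) squared-error variant exactly as written above.

import numpy as np

def mae(y, y_hat):
    # Mean Absolute Error, as in Section 4.3.
    return np.mean(np.abs(np.asarray(y) - np.asarray(y_hat)))

def half_mse(y, y_hat):
    # The 1/(2n) variant of the squared-error fitness above.
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return np.sum((y_hat - y) ** 2) / (2 * len(y))

y_true, y_pred = [78, 72, 80], [76.5, 73.0, 79.0]
print(mae(y_true, y_pred), half_mse(y_true, y_pred))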
4.4 Population-Based Optimization

Genetic Algorithm (GA):
• Utilize genetic operators (mutation, selection, crossover).

New Population = Crossover(Selection(Mutation(Current Population)))

Particle Swarm Optimization (PSO):
• Update particle positions based on personal and global best.

New Position = Current Position + Inertia × (Personal Best − Current Position) + Cognition × (Local Best − Current Position) + Social × (Global Best − Current Position)
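A simplified particle swarm sketch for tuning a single model parameter follows; it folds the velocity term into the position update in the spirit of the rule above, and the quadratic fitness function is an invented stand-in for a real validation error.

import numpy as np

rng = np.random.default_rng(3)
def fitness(p):                    # pretend validation error to minimize
    return (p - 1.7) ** 2

pos = rng.uniform(0, 5, size=10)   # 10 particles tuning one hyperparameter
pbest = pos.copy()                 # personal bests
gbest = pos[np.argmin(fitness(pos))]

for _ in range(40):
    r1, r2 = rng.random(10), rng.random(10)
    # Position update following the rule above (velocity omitted for brevity).
    pos = pos + 0.8 * r1 * (pbest - pos) + 0.9 * r2 * (gbest - pos)
    improved = fitness(pos) < fitness(pbest)
    pbest = np.where(improved, pos, pbest)
    candidate = pos[np.argmin(fitness(pos))]
    if fitness(candidate) < fitness(gbest):
        gbest = candidate
print("best hyperparameter:", round(float(gbest), 3))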
4.5 Machine Learning Models

Linear Regression:
• Assume a linear relationship between features and life expectancy.

Life Expectancy = b0 + b1 × Feature1 + b2 × Feature2 + ... + bn × Featuren

Random Forest:
• Construct an ensemble of decision trees.

Life Expectancy = Tree1(Features) + Tree2(Features) + ... + Treen(Features)
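Both model families are available in scikit-learn; a minimal sketch follows, with invented features (age and sentiment score) standing in for the chapter's full attribute set.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
# Stand-in features: age, sentiment score; target: life expectancy.
X = np.column_stack([rng.integers(20, 80, 300), rng.uniform(-1, 1, 300)])
y = 58 + 0.25 * X[:, 0] + 6 * X[:, 1] + rng.normal(0, 1.5, 300)

linear = LinearRegression().fit(X, y)
forest = RandomForestRegressor(n_estimators=100, random_state=5).fit(X, y)
new_person = [[42, 0.85]]
print("linear:", linear.predict(new_person), "forest:", forest.predict(new_person))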
The research focuses on using machine learning and sentiment analysis for life expectancy estimation, aiming to improve accuracy, personalised forecasts, and healthcare policy. The process involves gathering large-scale text data from various sources, preprocessing it to remove noise, extracting useful information, and analysing sentiment scores. A machine learning model is developed based on demographics, lifestyle variables, and sentiment ratings and refined using optimization techniques. The model is then trained using the extracted variables and life expectancy data, and its performance is continuously monitored for improvement.

5. Conclusion

This research presented a new approach to calculating the average lifespan of a human being by integrating machine learning models, sentiment analysis, and population-based optimisation. The suggested strategy incorporates sentiment analysis to circumvent the shortcomings of conventional methods that depend exclusively on lifestyle and demographic data. By capturing subjective attitudes and opinions, sentiment analysis provides a more comprehensive knowledge of the elements that impact longevity. To optimise the machine learning models for accurate predictions, optimisation methods based on populations are used.

References

1. Sumit Singh, "Life Expectancy Prediction through Analysis of Immunization and HDI Factors Using Machine Learning Regression Algorithms," 2021.
2. Muhammad Bilal et al., "An efficient sentiment analysis methodology based on long short-term memory networks," 2021.
3. Mohamed El-Kenawy et al., "Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis," 2022.
4. S. K. Yadav et al., "Significance of Machine Learning in Healthcare: Features, Pillars, and Applications," 2022.
5. Mohammad Arif et al., "Life Expectancy Estimation using Social Media Data and Machine Learning Techniques," 2023.
6. Meduri Raghu Chandra et al., "Estimating Human Life Expectancy through Sentiment Analysis, Population-Based Optimisation, and Machine Learning Models," 2023.
7. M. Srikanth, "Integrated Technologies for Proactive Bridge-Related Suicide Prevention," Journal of Namibian Studies, Volume 1, Issue 33, Pages 2117-2136, ISSN: 1863-5954, Sep. 2023. [Scopus]
8. M. Srikanth, "Deep Learning Approaches for Predictive Modeling and Optimization of Metabolic Fluxes in Engineered Microorganisms," International Journal of Research in Science & Engineering (IJRISE), ISSN: 2394-8299, 3(05), 1-11, https://doi.org/10.55529/ijrise.35.1.11, July 2023.
9. M. Srikanth, "Tackling Outliers for Predictive Smallholder Farming Analysis," in Proceedings of the 2023 3rd International Conference on Smart Data Intelligence (ICSMDI), pp. 93-98, IEEE Xplore, March 26, 2023. [Scopus]
10. M. Srikanth, "Blockchain-Based Consensus For A Secure Smart Agriculture Supply Chain," European Chemical Bulletin, vol. 12, special issue 4, pp. 8669-8678, doi: 10.48047/ecb/2023.12.si4.776, ISSN: 2063-5346, 2023. [Scopus]
11. M. Srikanth, "Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms," Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 3, issue 03, pp. 14-26, ISSN: 2799-1172, Apr. 2023.
12. M. Srikanth, R. N. V. Jagan Mohan, M. Chandra Naik, "A New Way to Improve Crop Quality and Protect the Supply Chain is to use a Trajectory Network and Game Theory," Mathematical Statistician and Engineering Applications, 71(4), 10600-10610, https://doi.org/10.17762/msea.v71i4.1952, ISSN: 2094-0343, 2023. [Scopus]
13. M. Srikanth, "Auction Algorithm: Peer-To-Peer System Based on Hybrid Technologies for Smallholder Farmers to Control Demand and Supply," International Journal of Research In Science & Engineering (IJRISE), vol. 3, issue 1, pp. 9-23, 2023.
14. M. Srikanth, "Smallholder Farmers Crop Registering Privacy-Preserving Query Processing over Ethereum Blockchain," Journal of Pharmaceutical Negative Results, vol. 13, issue 7, pp. 5609-5617, Dec. 2022. [Scopus]
15. M. Srikanth, "The Early Detection of Alzheimer's Illness Using Machine Learning and Deep Learning Algorithms," Journal of Pharmaceutical Negative Results, vol. 13, issue 9, pp. 4852-4859, Nov. 2022. [Scopus]
16. M. Srikanth, "Small Holders Farming Predictive Analysis Using Peer-To-Peer Approach," International Journal of Agriculture and Animal Production, vol. 2, issue 05, pp. 26-37, Sep. 2022.
17. M. Srikanth, "Using Machine Learning and Neural Networks Technologies, a Bottom-Up Water Process Is Being Used To Reduce All Water Pollution Diseases," Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 2, Oct. 2022.
18. M. Srikanth, "Blockchain Enable for Smallholder's Farmers Crop Transaction Using Peer-to-Peer," Indo-American Journal of Agricultural and Veterinary Sciences, vol. 10, issue 3, pp. 33-43, Sep. 2022.
19. M. Srikanth, "Protecting Tribal Peoples Nearby Patient Care Centres Use a Hybrid Technique Based on a Distribution Network," International Journal of Health Sciences, Jun. 2022. [Scopus]
20. M. Srikanth, "Blockchain-Based Crop Farming Application Using Peer-to-Peer," Journal of Xidian University, Apr. 2022.
21. M. Srikanth, "Stop Spread Corona Based on Voice, Face and Emotional Recognition Using Machine Learning, Query Optimization and Blockchain Technology," Solid State Technology, vol. 63, no. 6, 2020. [Scopus]
22. M. Srikanth, "Machine Learning for Query Processing System and Query Response Time Using Hadoop," IJMTST, Aug. 2020.
23. M. Srikanth, "Block-level Based Query Data Access Service Availability for Query Process System," IEEE, pp. 1-9, Jul. 2020. [Scopus]
24. M. Srikanth, "Query Response Time in Blockchain Using Big Query Optimization," in The Role of IoT and Blockchain Techniques and Applications from Computer Science and Information Management, Apple Academic Press, exclusive worldwide distribution by CRC Press Taylor & Francis Group, Jan. 2022. [Scopus]
25. M. Srikanth, "A New Approach for Authorship Verification Using Information Retrieval Features," Springer-ICSE, vol. 74, pp. 23-29. [Scopus]
26. M. Srikanth, "An Enhanced and Naive Clustering Algorithm for Text Classification Based on Weight," International Journal & Magazine of Engineering, Technology, Management and Research, Dec. 2012.
27. Yasser Al-Khateeb et al., "A Comparative Study of Machine Learning Techniques for Life Expectancy Prediction Using Socioeconomic and Demographic Data," 2022.
28. Md. Shakhawat Hussain et al., "A Novel Approach for Life Expectancy Prediction Using Machine Learning and Geospatial Data," 2022.
29. Abhinav Mittal et al., "A Comprehensive Review of Machine Learning Techniques for Life Expectancy Prediction," 2023.
30. Abeer Al-Hashemi et al., "The Role of Machine Learning in Predicting Life Expectancy: A Review of Recent Trends and Challenges," 2023.
31. Ruchika Verma et al., "A Hybrid Machine Learning Model for Predicting Life Expectancy Based on Demographic and Lifestyle Factors," 2023.
32. Mohamed Medhat et al., "An Ensemble Machine Learning Model for Predicting Life Expectancy: A Case Study of the Egyptian Population," 2023.
33. Areej Al-Hajri et al., "A Comparison of Machine Learning Techniques for Predicting Life Expectancy in Oman," 2023.
34. Mohamed A. Gabr et al., "A Machine Learning Approach for Predicting Life Expectancy in Saudi Arabia," 2023.
35. Muhammad Irfan et al., "A Machine Learning Model for Predicting Life Expectancy in Pakistan," 2023.
36. Amjad Ali et al., "A Comparative Analysis of Machine Learning Techniques for Predicting Life Expectancy in Bangladesh," 2023.

Note: All the figures and tables in this chapter were designed by the author.
67. A Distributed-Back Propagation Procedure that uses Climate while Predicting the Spread of Mosquitoes Using Least Squares Estimation

K. Gopala Varma1
Research Scholar, Department of CSE, GIET University, Odisha

M. Chandra Naik2
Professor, Department of CSE, GIET University, Odisha

R. N. V. Jagan Mohan3
Associate Professor, Department of CSE, Sagi Rama Krishnam Raju Engineering College (A)

Abstract: Climate change's effects on Aedes aegypti mosquitoes (temperature, precipitation, and Oceanic Niño Index fluctuations) were all measured on a monthly basis. The weather changes during the El Niño and La Niña phases, and the Oceanic Niño Index measures whether the tropical waters of the Pacific Ocean are warmer or cooler than usual. In this study, we looked at how these climate characteristics interact to predict mosquito activity during El Niño episodes. The results show that a higher incidence of mosquitoes is associated with rainfall that is more frequent in June. After the development and implementation of an artificial intelligence algorithm, we will be able to analyze and classify the patients in different villages by their trajectories, applying the distributed back propagation algorithm to the whole dataset effectively with the help of optimal patient and value classification. The results can help with programme planning and scheduling to stop the spread of mosquito-borne illnesses like dengue. Thus, this work is targeted at predicting micro-level parameters to control mosquito spreading activity. There is a substantial correlation between platelet count and fever in the data; it is possible to predict whether a patient will test positive or negative for dengue using least squares regression and logistic regression, which is then used to test the patient. The paper develops a distributed back propagation method that uses climate while predicting the spread of mosquitoes using least squares estimation.

Keywords: Artificial intelligence, Aedes aegypti mosquitoes, Distributed back propagation algorithm, Micro-level parameters, Optimal points and values

1. Introduction

Dengue has grown to be a serious public health issue globally, where it affects about half of the world's population [2]. Controlling mosquito population growth is thought to be the most efficient method for halting the spread of the virus, because creating a safe and effective dengue vaccine has proven to be challenging [3]. Dengue transmission patterns closely resemble the nation's monsoonal rainfall patterns, with the southwest monsoon being the peak and the northeast monsoon marking a smaller peak. This is based on a macro-level prediction by Bhawana Amatya in 2022 [4]. This measure of data is not enough to track this rapidly spreading mosquito activity.

1 gopalavarma.kosuri@giet.edu, 2 srichandra2007@gmail.com, 3 mohanrnvj@gmail.com

DOI: 10.1201/9781003529231-67

The abundance, feeding habits, and lifespan of the Aedes aegypti mosquitoes that spread dengue are all correlated with certain climate variables, according to research, although the mechanism underlying this association is still unclear [8]. Hence, the public health authorities required to control the activity would need micro-level parameters to be able to plan mosquito control efforts, if it were possible to predict mosquito seasonal trends using climate and meteorological data. The population of mosquitoes that carry dengue is correlated with rainfall, air quality, and temperature: a recent study found that temperature, rains, and air warming have all contributed to the increase in mosquito populations in India over the past six months. The results can help with programme planning and scheduling to stop the spread of mosquito-borne illnesses like dengue, per Shamimul Hasan, 2016 [10]. So, this work is targeted at predicting micro-level parameters to control mosquito spreading activity.

The advent of satellites has significantly transformed global communication. Satellite communication benefits humanity in many ways by providing a variety of communication services, such as television transmission, digital data for business, telephone, and mobile communication. The impending deployment of satellite communication systems for speech and fax transmission to aeroplanes on international routes may not come as a shock to the global population. GPS navigation, international telephone, multimedia video, internet access, Earth imaging, telemedicine, and tele-education services are just a few of the uses for satellite communication that are available. Researchers have studied the impact of climate change on Aedes aegypti mosquitoes. They measured monthly variations in temperature, rainfall, and the Oceanic Niño Index. The weather changes during the El Niño and La Niña phases. The Oceanic Niño Index measures whether the tropical waters of the Pacific Ocean are warmer or cooler than usual. In this study, we looked at how these climate characteristics interact to predict mosquito activity during El Niño episodes. The results show that a higher incidence of mosquitoes is associated with rainfall that is more frequent in June. After the development and implementation of an artificial intelligence (AI)-based algorithm, we will be able to analyze and classify the patients for the whole dataset effectively, as suggested by Tahira Qamasha in 2021 [12].

2. Proposed Work

The numerous techniques supported by this paper's methodologies, which include AI algorithms, include a wide range of information-gathering options. After the creation and use of an algorithm based on artificial intelligence, assessing the patients from different villages and categorizing them for the whole dataset will be efficient. Incoming data is initially handled for efficiency before being preprocessed. The main objective of this paper is to enhance the technological aspects of artificial intelligence. The design framework utilizes a feature extractor to create a trajectory network system using the back propagation approach. The data operations for each patient's results are inputted from different villages. The study utilizes an AI-based algorithm for data classification of dengue, focusing on optimal points and values.

Peoples Optimal Patients and Values: The vector optimization takes patients' dengue status and data values from the entire set of data into account. The objective values of the set of feasible points are reflected:

Obj = { f0(x) | x ∈ t, fi(x) ≤ 0, i = 1, 2, 3, …, n, hi(x) = 0, i = 1, 2, 3, …, p } ⊆ R^q   (1)

Consider how the various patients from the input data are represented as the set of achievable objective values divided by the set of viable values. Choose one value from the collection of objective values, i.e., the optimal value, which may be modeled as the desired value, t. Eliminate additional values that might be achieved by optimization. A point x* is optimal if and only if it is feasible, and the set f0(x) + K (where K is the proper cone) might be seen as the group of values worse than, or equal to, f0(x). Therefore, the requirement specifies that every possible value is contained in the set, as given by Stephen Boyd, Convex Optimization, 2004 [11].

2.1 Trajectory of Different Villages of Dengue Patients Using a Distributed Back Propagation Algorithm

Let us level up our neural net training game. How the back propagation computation is distributed across GPUs or nodes is explained by Pierre Baldi, 2017 [9]. There are two typical strategies to distribute the computation: healthcare data parallelism and model parallelism. Here are the steps for centralized synchronous data parallelism:

1. A parameter server is used as the ground truth for the model weights. The weights are duplicated into multiple processes running on different hardware (GPUs on the same machine or on multiple machines).
2. Each duplicate model receives a mini-batch of dengue patients' data from different villages, and they independently go through the forward and backward passes, where the gradients are computed.
3. The gradients are sent to the healthcare parameter server, where they are averaged once they are all received. The weights are updated in a gradient descent fashion, and the new weights are broadcast back to all the worker nodes.

This process is called "centralized," where the gradients are averaged. Another version of the algorithm can be "decentralized," where the resulting model weights are averaged:

1. A master process broadcasts the weights of the healthcare process model.
2. Each process can go through multiple iterations of the forward and backward passes with different villages' data mini-batches. At this point, each process has a very different weight.
3. The weights are sent to the master process, they are averaged across processes once they are all received, and the averaged weights are broadcast back to all the worker nodes.

The decentralized approach can be a bit faster because you do not need to communicate between machines as much, but it is not a proper implementation of the back propagation algorithm. These processes are synchronous because we need to wait for all the workers to finish their jobs. The same processes can happen asynchronously; only the gradients or weights are not averaged.
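The centralized variant above can be simulated in a few lines; this is a toy single-process sketch, with a least-squares worker gradient standing in for a real neural network's back propagation, and all names are illustrative.

import numpy as np

rng = np.random.default_rng(1)
X, y = rng.normal(size=(120, 3)), rng.normal(size=120)
server_w = np.zeros(3)                      # parameter server's weights

def worker_gradient(w, Xb, yb):
    # Least-squares gradient computed on one worker's mini-batch.
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

for step in range(50):
    shards = np.array_split(np.arange(120), 4)   # one shard per "village" worker
    grads = [worker_gradient(server_w, X[s], y[s]) for s in shards]
    server_w -= 0.05 * np.mean(grads, axis=0)    # average gradients, descend
print("trained weights:", server_w)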

Fig. 67.1 Trajectory of village nodes of dengue patients using distributed back propagation

2.2 Patient Regression Analysis Using Least Squares

The least squares method is a statistical technique used to find the best fit for data points by lowering the total of deviations or residuals. In least squares regression, according to C. C. Emioma, 2021 [6], the connection between the variables is linear and may be seen as a straight line, also called a regression line, line of average relationship, or prediction equation. Suppose in the study of relationships between two variables X and Y, if Y is dependent on X, then the simple linear relation is

Y = a0 + a1X   (2)

Equation (2) is known as the regression line of Y on X. Similarly, if X depends on Y, then

X = b0 + b1Y   (3)

Equation (3) is known as the regression line of X on Y.

2.3 AI Healthcare Test on Dengue Using Logistic Regression

In his 2020 [1] work, Abdulhamit presented a machine learning technique for binary classification issues, distinguishing between classes based on test results. It consists of two classes: positive (patients with dengue) and negative (no), according to Zhengzhou, 2021 [14]. To predict the presence of an entity, a function mapping values between 0 and 1 is needed.

Fig. 67.2 Dengue cases of patients: S-shape in logistic regression
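The 0-to-1 mapping mentioned above is the logistic (sigmoid) function, the S-shape of Fig. 67.2. A small sketch follows; the platelet and fever weights are invented for illustration, not fitted values from the study.

import numpy as np

def sigmoid(z):
    # Maps any real-valued score into (0, 1), the S-shape of Fig. 67.2.
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical scores built from platelet count and fever.
platelets = np.array([95, 140, 260, 310])     # x1000 per microlitre
fever = np.array([103.2, 101.5, 99.1, 98.4])  # degrees Fahrenheit
z = -0.02 * platelets + 0.9 * (fever - 98.6)
print(np.round(sigmoid(z), 3))  # probability of testing dengue-positive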
2.4 Experimental Result

The study assesses dengue fever in 10 patients on Netaji Street, Sriramapuram, based on platelet counts. Patients' vital signs are examined to compare fever and platelets, per Hao Gong [7]. The data shows a significant association between platelet count and fever, allowing for the prediction of positive or negative levels using least squares regression, per Boulesteix et al. (2007) [5].

a. Determine the prediction equation, a patient's level of fever during dengue, as a least squares regression equation of Y on X, to estimate high blood pressure. Suppose the normal equations for Y = a0 + a1X are:

ΣY = N a0 + a1 ΣX   (4)
ΣXY = a0 ΣX + a1 ΣX²   (5)

From the given data, ΣX = 2024; ΣY = 7564; ΣX² = 82322; ΣY² = 1145998; ΣXY = 306088; N = 50.

Substituting:

7564 = 50 a0 + 2024 a1
306088 = 2024 a0 + 82322 a1

Solving, a0 = 166.15 and a1 = -0.37. So, the prediction equation is Y = a0 + a1X; substituting, Y = 166.15 + (-0.37)X.

b. The correlation coefficient r is used to determine the relationship between women's fever weight status and platelets:

r = (N ΣXY − ΣX ΣY) / √{[N ΣX² − (ΣX)²][N ΣY² − (ΣY)²]}   (6)

r = (50(306088) − (2024)(7564)) / √{[50(82322) − (2024)²][50(1145998) − (7564)²]}   (7)

r = 0.12548356497608124, where −1 ≤ r ≤ 1   (8)

The Yugandhar 2021 [13] study suggests that dengue is caused by the fever in individuals with low platelets, as fever weight status and platelet count are negatively correlated.

The optimal point and value model data is combined with dengue data at the individual level using advanced modeling and machine learning techniques, then dengue-tested, enhanced, and understood through regression and classification.
2016.
3. Conclusion and Future Perspective

The research utilized artificial intelligence and machine learning techniques to efficiently evaluate and classify patients from various villages using the Optimal Point and Values model.

References

1. Abdulhamit Subasi: Practical Machine Learning for Data Analysis Using Python, ScienceDirect, Elsevier, 2020.
2. Adamou Lagare, Martin Faye, Gbaguidi Fintan, Gamou Fall, Hadiza Ousmane, Elh Tassiou Ibraim, Moussa Moise Diagne, Soumana Amadou, Safietou Sankhe, Laminou Ibrahim, Haoua Seini, Ousmane Faye, Ronan Jambou: First introduction of dengue virus type 3 in Niger, 2022, IJID Regions, Volume 7, June 2023, Pages 230-232, Elsevier, 2023.
3. Adriana Troyo, Sherri L. Porcelain, Olger Calderón-Arguedas, Dave D. Chadee, and John C. Beier: Dengue in Costa Rica: the gap in local scientific research, Journal of Public Health 20(5), 2006.
4. Bhawana Amatya, Eli Schwartz, Asaf Biber, Oran Erster, Yaniv Lustig, Rashila Pradhan, Bhawani Khadka, Prativa Pandey: Dengue serotype characterization during the 2022 dengue epidemic in Kathmandu, Nepal, Journal of Travel Medicine, taad034, https://doi.org/10.1093/jtm/taad034, published 27 March 2023.
5. Boulesteix AL, Strimmer K: Partial least squares: a versatile tool for the analysis of high-dimensional genomic data, Brief Bioinform, 8:32-44, 2007.
6. C. C. Emioma and S. O. Edeki: Stock price prediction using machine learning on least-squares linear regression basis, J. Phys.: Conf. Ser. 1734 012058, 2021.
7. Hao Gong, Lifeng Yu, Shuai Leng, Samantha K. Dilger, Liqiang Ren, Wei Zhou, Joel G. Fletcher, Cynthia H. McCollough: A deep learning- and partial least square regression-based model observer for a low-contrast lesion detection task in CT, Medical Physics, 46(5): 2052-2063, doi: 10.1002/mp.13500, May 2019.
8. Kangzhuang Yuan: Risk and predictive factors for severe dengue infection: A systematic review and meta-analysis, PLoS One, 17(4): e0267186, doi: 10.1371/journal.pone.0267186, PMCID: PMC9012395, PMID: 35427400, 2022.
9. Pierre Baldi, Peter Sadowski, and Zhiqin Lu: Learning in the Machine: Random Backpropagation and the Deep Learning Channel, arXiv, https://doi.org/10.48550/arXiv.1612.02734, 2017.
10. Shamimul Hasan, Sami Faisal Jamdar, Munther Alalowi, and Sadun Mohammad Al Ageel Al Beaiji: Dengue virus: A global human threat: Review of literature, Journal of International Society of Preventive & Community Dentistry (JISPCD), 6(1): 1-6, doi: 10.4103/2231-0762.175416, Jan-Feb 2016.
11. Stephen Boyd: Convex Optimization, Cambridge University Press, 2004.
12. Tahira Qamasha, Johar Jamila, Kalsooma, Faheem Ahmed Khan Sairac, Ambareen Sultana, Nadia Beguma, Salah Ud Dind: Epidemiological study of dengue fever in District Swabi, Khyber Pakhtunkhwa, Pakistan, Brazilian Journal of Biology, 81(2), ISSN 1519-6984 (Print), ISSN 1678-4375, https://doi.org/10.1590/1519-6984.216284, 2021.
13. Yugandhar Bokka et al.: Predictive Analysis for ASD Using Population based Incremental Learning, Journal of Engineering Science and Technology Review 14(3) (2021) 205-208, ISSN: 1791-2377, doi: 10.25103/jestr.143.23, 2021.
14. Zhengzhou Shi: Neural Computing, Intelligence Science, ScienceDirect, 2021.

Note: All the figures in this chapter were designed by the author.

68. Unveiling the Efficacy of Machine Learning in Addressing Imbalances in Credit Card Fraud Detection Data

Ch Siva Subrahmanyam*, N. Deshai, K. Samatha, J. Tulasi Rajesh
Dept. of IT, SRKREC, JNTUK, Bhimavaram, A.P., India

Abstract: In the dynamic landscape of credit card usage, the surge in both legitimate transactions and fraudulent activities
necessitates vigilant measures to safeguard innocent clients from financial repercussions. This study dives into the realm of Data
Science, spotlighting the indispensable role of Machine Learning methodologies in addressing the escalating challenge of credit
card fraud. A comprehensive modeling approach is unveiled, employing diverse classifiers to tackle the inherent data imbalance
in Credit Card Fraud Detection. Meticulous experimentation addresses concerns regarding the imbalanced dataset, with
XGBoost emerging as a frontrunner, boasting a commendable precision score of 0.91 and an accuracy score of 0.99. The journey
extends to employing various sampling techniques, revealing Random Oversampling as particularly effective on imbalanced
data, yielding an impressive precision and accuracy score of 0.99 when applied to the premier model, XGBoost. Comparative
analysis of diverse classifiers yields nuanced conclusions and avenues for further research. Throughout this exploration, data
balancing procedures such as oversampling, under sampling, and SMOTE are leveraged, consistently showcasing XGBoost’s
superiority with a remarkable 99% accuracy score and precision when coupled with Random Oversampling. In summary, the
research advocates for strategic data sampling techniques to address imbalances, ensuring optimal model performance in the
intricate landscape of credit card fraud detection—an exploration that underscores the importance of advanced methodologies
in this critical domain.
Keywords: Machine learning, Credit card, Fraud detection, XGBoost

1. Introduction

In the complex realm of contemporary business, the specter of credit card fraud casts a shadow over financial foundations. Approximately 0.05% of monthly active accounts succumb to fraudulent activities, translating to a staggering reality of 5 out of every 10,000 active accounts falling prey to deception. Amid this dire scenario, fraud detection becomes a lifeline to prevent substantial financial losses, with data manipulation standing as the primary battleground. Perpetrators employ diverse techniques, from pilfering physical cards to extracting critical information during legitimate transactions [1]. The evolution of fraud demands adaptive countermeasures, ranging from Artificial Neural Networks to Decision Trees, Genetic Algorithms, Bayesian Networks, and Gradient Boosting techniques. This research tackles the credit card fraud detection challenge using machine learning algorithms. Initial classifiers boast accuracy scores exceeding 99%, unveiling bias due to unbalanced data in a higher-dimensional space. Undeterred, the study pivots to the nuanced F1-Score metric, revealing XGBoost as the resolute champion. Entering the realm of data balancing, three techniques—random oversampling, random under-sampling, and SMOTE—take center stage. XGBoost, undergoing exclusive application, finds Random Oversampling as the herald of superior results. This research contributes by proposing a framework to assess the impact of unbalanced data on machine learning models, guiding organizations in navigating technological transformations [2].

* Corresponding author: sivasubbu22@gmail.com

DOI: 10.1201/9781003529231-68

The framework fosters data-driven decision-making, optimizing algorithmic outcomes, and enhancing informed choices in machine learning. During experimentation, oversampling techniques enhance accuracy, showcasing practical applicability in refining algorithmic performance. Embark on a journey through related work, machine learning concepts, data balancing techniques, experimental procedures, results, and discussions. Join us in exploring the intricate landscape of credit card fraud detection, where innovation and strategic data balancing fortify defenses against this insidious threat.

2. Related Work

In the ever-evolving landscape of fraud, researchers have contributed to the quest for effective detection methods. Prior studies explored neural networks and various machine learning algorithms for credit card fraud detection, yielding diverse conclusions. This paper adopts a comprehensive approach, employing both classification and ensemble learning methodologies to enhance fraud identification. Exploring the challenge of imbalanced data, researchers apply under-sampling and oversampling, achieving improved results with under-sampling on logistic regression. Artificial neural networks emerge as a robust choice for fraud detection [3]. The paper builds on these foundations, identifying the top three algorithms. The All K-Nearest Neighbors under-sampling strategy combined with CatBoost stands out as the recommended model, showcasing superior performance. A groundbreaking contribution surfaces with the proposition of a Deep Convolution Neural Network (DCNN) technique for financial fraud detection, leveraging deep learning algorithms. This approach exhibits heightened accuracy, particularly with substantial data volumes. Experimental findings showcase a remarkable 99% detection accuracy within a 45-second timeframe, outperforming existing models. In essence, this paper pioneers the use of simple yet effective techniques, highlighting the significance of practical results over complexity—an invaluable contribution to the ongoing discourse on fraudulent activity detection.

In the intricate realm of machine learning, classification emerges as the pivotal task, often referred to as the prediction issue. This nuanced challenge entails the categorization of independent variables, spanning from two to multiple groupings. Whether dealing with structured or unstructured data, diverse strategies come into play, demonstrating their versatility across both data types [4]. At the heart of this endeavor lies the "classifier," an algorithm designed to decipher incoming data and assign it to a specific class or category. Classification manifests in three distinct forms: binary classification, multi-label classification, and multi-class classification. At its essence, binary classification stands as the simplest iteration. In the pursuit of determining the legitimacy of a credit card transaction, a curated list of classifiers takes center stage [5]. These classifiers, each wielding unique capabilities, play a crucial role in the complex task of discerning whether a transaction is potentially fraudulent.

2.1 Logistic Regression

In supervised algorithms, Logistic Regression is a potent force, adept at classifying datasets. Its unique ability lies in predicting values for both categorical and numerical variables simultaneously, navigating the intricacies of continuous and discrete datasets. Beyond conventional boundaries, Logistic Regression stands as an essential cornerstone in machine learning. Operating as a captivating dance with probabilities, it illuminates likelihoods in the intricate ballet of data, emerging not just as a classifier but a versatile predictor that shapes the narrative of machine learning endeavors with predictive finesse.

2.2 Decision Tree Classifier

In the vast landscape of machine learning, this classifier is a versatile tool, excelling in both classification and regression applications. Imagine it as a narrative tree, with core nodes representing dataset attributes, branches articulating decision rules, and leaf nodes offering conclusions. This visual symphony unfolds potential answers to problems, each based on nuanced conditions. Its expansiveness allows for rich decision rules, evolving with complexity. It's not just a classifier; it's a storyteller with graphical finesse, unraveling possibilities within a dataset and crafting a narrative that evolves with the challenge at hand.

2.3 XGBoost Classifier

Crafted with the power of Gradient Boosting, XGBoost is a champion in the arena of competitions. This classifier delicately balances scales, assigning crucial weights to independent factors in a symphony of decision trees. It's not just a classifier but a maestro reducing overfitting through regularized boosting. XGBoost handles missing values gracefully, showcasing adaptability. Flaunting a built-in cross-validation mechanism, it orchestrates perfection in the grand theater of machine learning. XGBoost takes center stage, weaving precision, adaptability, and resilience into a tapestry of predictive excellence—a virtuoso in the art of machine learning.
models. In essence, this paper pioneers the use of simple yet cately balances scales, assigning crucial weights to indepen­
effective techniques, highlighting the significance of practical dent factors in a symphony of decision trees. It’s not just a
results over complexity—an invaluable contribution to the classifier but a maestro reducing overfitting through regular­
ongoing discourse on fraudulent activity detection. In the ized boosting. XGBoost handles missing values gracefully,
intricate realm of machine learning, classification emerges as showcasing adaptability. Flaunting a built-in cross-validation
the pivotal task, often referred to as the prediction issue. This mechanism, it orchestrates perfection in the grand theater of
nuanced challenge entails the categorization of independent machine learning. XGBoost takes center stage, weaving pre­
variables, spanning from two to multiple groupings. Whether cision, adaptability, and resilience into a tapestry of predic­
dealing with structured or unstructured data, diverse tive excellence—a virtuoso in the art of machine learning.
strategies come into play, demonstrating their versatility
across both data types [4]. At the heart of this endeavor lies 3. Data Balancing Techniques
the “classifier,” an algorithm designed to decipher incoming
The following are examples of strategies that may be used to
data and assign it to a specific class or category. Classification
counterbalance the dataset in order to optimize the results:
manifests in three distinct forms: binary classification, multi-
label classification, and multi-class classification. At its 3.1 Random Over Sampling
essence, binary classification stands as the simplest iteration.
In the pursuit of determining the legitimacy of a credit card Enter the dance of random oversampling, a masterful
transaction, a curated list of classifiers takes center stage [5]. composition weaving the fabric of data balance. This

This technique randomly selects samples from minority groups, integrating them into the training set. The harmonious interplay aims for a symphony of more egalitarian data distribution—a nuanced performance in the grand ballet of machine learning.

3.2 Random Under-Sampling

In this avant-garde approach, a meticulous act unfolds as instances exit the stage of training data, creating a dynamic void. A dramatic twist commences with a random selection journey, plucking entries from the category abundant with items. The result is a delicate choreography, a dance where individuals from the class with the highest numbers are gracefully eliminated through a stochastic function's whims. From this orchestrated performance emerges a refined equilibrium, a balanced tableau achieved through the rhythmic interplay of elimination and random selection—a narrative of balance in the language of machine learning.

3.3 SMOTE

Behold the ballet of SMOTE, an approach that orchestrates equilibrium through a series of eloquent movements: initiate with a detailed exploration of the underrepresented group, then meticulously select the "k" points closest to each focal point. Envision a tapestry woven with lines connecting each minority point to its nearby counterparts, creating a rhythmic dance that repeats for every point and its diverse "k" neighbors, synthesising new samples along those lines. This harmonious repetition converges the once scattered data into a balanced symphony—a narrative told through investigation, selection, and connection.
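All three balancing strategies are available in the imbalanced-learn package; the sketch below applies each to the same synthetic 95:5 split so the resulting class counts can be compared. It assumes imbalanced-learn and scikit-learn are installed.

from collections import Counter
from imblearn.over_sampling import RandomOverSampler, SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=5000, weights=[0.95], random_state=0)
print("original:", Counter(y))

for sampler in (RandomOverSampler(random_state=0),
                RandomUnderSampler(random_state=0),
                SMOTE(random_state=0)):
    # Each sampler returns a rebalanced copy of the data.
    X_res, y_res = sampler.fit_resample(X, y)
    print(type(sampler).__name__, Counter(y_res))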

4. Mathematical Model

4.1 BLDCM

The mathematical model of a sensorless BLDCM can be modeled and implemented by means of mathematical modeling (transfer functions) shown in the equations below. The winding of a three-phase BLDC motor can be modeled as a series circuit consisting of a resistance R, an inductance L, and a speed-dependent voltage source known as the back EMF voltage, due to the rotor magnet. While designing a BLDC motor, a few parameters, like the current induced in the rotor due to stator harmonic fields and the iron and stray losses, are neglected. Self and mutual inductances are considered constant [6]. The BLDC motor is supplied with a three-phase voltage. The normally used N95 respirators are of the negative pressure variant, i.e., they require the wearer's lungs to inhale air through the resistive membranes of the filter layers. This is strenuous and uncomfortable to wear for a long duration. This is non-existent in positive air pressure respirators, as they use external filters and have a motorized air supply system [7]. The pandemic in the recent scenario also necessitates respiration apparatus as a part of its treatment. Respirators that are commonly used are negative pressure systems, which require the power of the lungs to draw in purified air; this is not suitable, and sometimes not possible, if the person lacks sufficient lung strength or suffers from respiratory illness. This work proposes a forced air (positive air pressure) solution to the problem.

5. Experimental

Embark on the captivating voyage of data exploration, a crucial chapter in the grand saga of data analysis. In the intricate dance of information, the pre-processing stage takes center stage, refining data and elevating its quality for the artistry of modeling [8]. This meticulous choreography involves cleansing data tainted by partiality, errors, or inconsistencies. As the stage unfolds, scrutiny reveals a stark imbalance between class 1 and class 2 in the target characteristic. Fig. 68.1's pie chart vividly illustrates the skewed distribution, and Fig. 68.2's bar chart becomes a visual scorecard, counting the presence of each class category. The narrative becomes a tale of imbalance, prompting the call for ultimate balancing techniques to harmonize asymmetry and allow for a nuanced exploration of the data landscape.

Fig. 68.1 Imbalance data distribution

Fig. 68.2 Distribution of data after random oversampling

The journey into crafting or deploying a machine learning model commences with the orchestration of data pre-processing—a transformative ballet that refines raw data into a sculpted masterpiece. As the spotlight intensifies, the target characteristic takes center stage, unraveling a story of imbalance with class 1 significantly dwarfed by class 2. Fig. 68.1 and Fig. 68.2 visually narrate this imbalance, emphasizing the need for mastery over ultimate balancing techniques. Armed with this knowledge, the journey propels forward, seeking equilibrium amid the unbridled dance of data, transcending imbalance, and paving the way for an orchestrated exploration of untapped potential.

Enter the visual realm of Fig. 68.1, where the data unfolds as a tapestry of pronounced imbalance, a potential harbinger of biased results and lackluster model performance. Here, class 0 reigns supreme, constituting over 90% of the data distribution—a towering majority class that threatens to cast shadows on the integrity of the model's dance [9]. Yet, the narrative takes an avant-garde turn as the stage is set for a transformative act—the application of a sampling approach poised to reshape the data's symphony. In this act, class 1 takes the spotlight, stepping into the limelight as it undergoes the graceful choreography of random oversampling. Fig. 68.2 captures this metamorphosis—an artful rendering that seeks to harmonize the data, sculpting a balanced format that promises a more nuanced performance in the grand ballet of machine learning.

5.1 Feature Selection

Experience the alchemy of distillation, where variables fade, leaving only essential data to train our model. This harmonious process compresses data efficiently, elevating machine learning to its zenith. Automatic selection, a virtuoso performance, enhances the model effortlessly. In the symphony of model enhancement, data preparation and feature engineering shine, as shown in Fig. 68.3. Their fusion shapes not just a model but a refined masterpiece [11]. It's a saga of refinement, where distilled data converges with chosen attributes, creating a model that transcends—the transformative power of thoughtful preparation and engineering in the grand theater of machine learning.
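Automatic selection of this kind can be sketched with scikit-learn's univariate filters; the k value and synthetic data below are illustrative assumptions, not the study's configuration.

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Keep only the attributes that carry signal, compressing the data
# before model training.
X, y = make_classification(n_samples=1000, n_features=30, n_informative=8,
                           random_state=7)
selector = SelectKBest(f_classif, k=8).fit(X, y)
X_reduced = selector.transform(X)
print(X.shape, "->", X_reduced.shape)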
5.2 Data Pre-processing

Witness our dataset's transformation—a tapestry of 31 characteristics, some uplifting, others detrimental. The saga starts with surgical removal of unnecessary traits, led by PCA, extracting the best features and rendering 'time' obsolete. The cleansing extends to duplicate samples, erased entirely. In variance estimation's dance, normalization takes center stage, ensuring harmonious alignment [12]. The grand architecture of algorithmic modeling unfolds in a simplified yet intricate workflow—a visual narrative capturing our dataset's essence. This is the tale of a dataset metamorphosing—shedding the superfluous, harmonizing the diverse, and emerging as a refined masterpiece in machine learning.
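The cleansing steps named above (dropping 'time', removing duplicates, normalizing) might be sketched as follows; the file name and column layout assume the widely used credit card fraud dataset and are hypothetical here.

import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("creditcard.csv")        # hypothetical local path
df = df.drop(columns=["Time"])            # 'time' rendered obsolete
df = df.drop_duplicates()                 # duplicate samples erased
features = df.drop(columns=["Class"])
X = StandardScaler().fit_transform(features)   # normalization step
y = df["Class"].to_numpy()
print(X.shape, y.mean())                  # fraction of fraud cases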
Having traversed analysis and preparation, our odyssey culminates in the unveiling of four machine learning virtuosos: the Regression Model, Decision Tree Classification, XGBoost Classification, and ANN. Each maestro weaves a narrative into our dataset. Metrics' scores—Precision, Accuracy, Recall, and F1 Score—are computed, birthing a symphony of evaluation. As they dance with our data, the XGBoost Classifier emerges as the virtuoso with the most enchanting performance [10]. Yet, a nuanced revelation unfolds—results tinged with bias from the imbalance discovered during exploratory data analysis. Despite surpassing 99% accuracy, a glimmer of hope emerges—the best-performing model beckons for further experimentation. This turning point is where data sampling techniques become our guiding compass, filtering or infusing samples to harmonize the data and unveil optimal performance, transcending echoes of imbalance and bias.

Fig. 68.3 Experimental workflow without and with balancing data

6. Proposed Methodology

Return to the dataset's genesis, where the imbalanced split led machine learning algorithms to train models, and our chosen, albeit biased, champion emerged. Now, our story transcends, venturing into experimentation with three datasets transformed by Random Under Sampling, Random Over Sampling, and SMOTE. The protagonist, our chosen model, takes on a fresh role—training on these modified datasets. The curtain rises on a symphony of analytical performances, mirroring its prior dance in the splitting ratio. The essence lies in meticulous analysis of algorithmic performance on imbalanced and balanced datasets, dissecting the impact of sampling techniques [13-16]. Behold the visual tapestry of Fig. 68.3—a flow chart capturing each poetic note in the grand composition. The proposed framework unfolds a new phase—an odyssey through data sampling, where equilibrium reigns. Three methods—Data Oversampling, Data Under-sampling, and SMOTE—converge for a delicate balance. The stage is set for the grand reveal.
The XGBoost model, a maestro, takes center stage, each note resonating with chosen techniques [17-19]. The spotlight turns to the denouement—an exploration of outcomes, a comparison of methodologies. The script unfolds in Accuracy Scores, precision, recall, and F1 Scores, documenting a tale where results coalesce into a narrative transcending imbalance, showcasing the orchestrated impact of data balancing in the theatrical realm of machine learning.

7. Results

Embark on a journey through the labyrinth of financial analysis with four sentinel classifiers—Logistic Regression, Decision Tree, XGBoost, and Artificial Neural Network. The imbalanced canvas of the dataset sets the stage for their methodologies, crafting intricate models. Behold Table 68.1, a compendium of findings, where each algorithm's melody resonates with nuanced detection intricacies. This tableau serves as the prologue to a tale where classifiers decipher hidden narratives within transactions. Enter the avant-garde realm where XGBoost unveils its prowess, achieving an overall accuracy surpassing 99%, a whisper that the data resists equilibrium, as shown in Figs. 68.4, 68.5, and 68.6. The F1 score becomes our oracle in this harmonious ballet of precision, recall, and accuracy.
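Why accuracy alone misleads on skewed data, and why the F1 score is the oracle here, can be seen in a few lines with scikit-learn; the toy predictions below are invented to make the contrast visible.

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Toy predictions on a 95:5 imbalanced test set; accuracy looks great
# even when the minority (fraud) class is poorly recovered.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 95 + [1, 1, 0, 0, 0]   # catches only 2 of 5 frauds
print("accuracy :", accuracy_score(y_true, y_pred))   # 0.97
print("precision:", precision_score(y_true, y_pred))  # 1.0
print("recall   :", recall_score(y_true, y_pred))     # 0.4
print("F1       :", f1_score(y_true, y_pred))         # about 0.57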
dominates, boasting an accuracy surpassing 99%. Yet, the
The narrative extends beyond as we refine, transcend bias, pursuit of excellence persists, with the F1 score as the arbiter
and unlock the potential of the chosen XGBoost Classifier. of prowess. The XGBoost model, adorned with an F1-score
The trio of Data Balancing—Random Over Sampling, of 0.856, precision of 0.913, recall of 0.805, and accuracy of
Random Under-Sampling, and SMOTE—transforms the 0.99, stands as the virtuoso. Extending the gaze to enhance
collective performance, Random Over Sampling emerges
as the prima donna, outshining counterparts. The symbiosis
of Random Over-Sampling with the XGBoost Classifier
achieves pinnacle accuracy, precision, recall, and F1 scores.
This evolution heralds Random Over-Sampling as the optimal
muse, sculpting unbiased results with finesse. In the crystal
ball of the future, the study envisions uncharted territories,
seeking approaches that transcend current limitations and
evolve beyond repetitive cadences of sampling and testing—
an ode to the uncharted frontiers of progress.
Fig. 68.4 XGBoost classifier results after data balancing
References

1. Ahmed R., Ahmad N. (2012). Knowledge representation by concept mining & fuzzy relation from unstructured data. International Journal of Research Review in Engineering Science and Technology (ISSN 2278-6643), Volume 1, Issue 2.
2. Singh, Bharat, Kumar, Kundan, Mohan, Sudhir and Ahmad, Rafeeq (February 8, 2019). Ensemble of Clustering Approaches for Feature Selection of High Dimensional Data. Proceedings of 2nd International Conference on Advanced Computing and Software Engineering (ICACSE) 2019.
3. Simon Haykin (1999). "Neural Networks: A Comprehensive Foundation," 2nd Edition, pp. 842.
4. Tej Paul Bhatla, Vikram Prabhu & Amit Dua (2003). "Understanding Credit Card Frauds."
5. N. Deshai, Deep learning hybrid approaches to detect fake reviews and ratings, 2022, JSIR, 82(1), pp. 120-127.
6. R. R. Popat and J. Chaudhary (2018). A Survey on Credit Card Fraud Detection Using Machine Learning. Proc. 2nd Int. Conf. Trends Electron. Informatics (ICOEI), vol. 25, no. 01, pp. 1120-1125.
7. Mishra and C. Ghorpade (2018). Credit Card Fraud Detection on the Skewed Data Using Various Classification and Ensemble Techniques. 2018 IEEE Int. Students' Conf. Electr. Electron. Comput. Sci. (SCEECS 2018), pp. 1-5.
8. N. Deshai, Transparency in healthcare and e-commerce: detecting online fake reviews using a dense neural network model with relevance mapping, 2023, Soft Computing, 27, 14, pp. 9861-9875.
9. Mittal and S. Tyagi (2019). Performance evaluation of machine learning algorithms for credit card fraud detection. Proc. 9th Int. Conf. Cloud Comput. Data Sci. Eng. (Confluence 2019), pp. 320-324.
10. Zhang, X., Han, Y., Xu, W., & Wang, Q. (2019). HOBA: A novel feature engineering methodology for credit card fraud detection with a deep learning architecture. Information Sciences.
11. Haoxiang, Wang, and S. Smys (2021). "Overview of Configuring Adaptive Activation Functions for Deep Neural Networks: A Comparative Study." Journal of Ubiquitous Computing and Communication Technologies (UCCT) 3, no. 01.
12. N. Deshai, Unmasking deception: a CNN and adaptive PSO approach to detecting fake online reviews, 2023, Soft Computing, 1-22.
13. Al Rubaie, E. M. (2021). Improvement in credit card fraud detection using ensemble classification technique and user data. International Journal of Nonlinear Analysis and Applications, 12(2), 1255-1265.
14. Alkhatib, K. I.-A. (2021). Credit Card Fraud Detection Based on Deep Neural Network Approach. 12th International Conference on Information and Communication Systems (ICICS), pp. 153-156.
15. Faraji, Z. (2020). The Causal Analysis of Financial Distress Risk and Performance. American International Journal of Business Management, 3(5), 5.
16. Khaled Gubran Al-Hashedi, Pritheega Magalingam, Financial fraud detection applying data mining techniques: A comprehensive review from 2009 to 2019, Computer Science Review, Volume 40, 2021, 100402, ISSN 1574-0137.
17. N. Deshai, A Detection of Unfairness Online Reviews Using Deep Learning, JATIT, Volume 100, 13, pp. 4738-4779.
18. Shuaib, M., Hassan, N. H., Usman, S., Alam, S., Bhatia, S., Agarwal, P., & Idrees, S. M. (2022). Land Registry Framework Based on Self-Sovereign Identity (SSI) for Environmental Sustainability. Sustainability, 14(9), 5400.
19. Wang Q, Shi Z, Jiang D. "Watch and Wait" Strategy for Multicystic Dysplastic Kidney (MCDK): Status Survey of Perceptions, Attitudes, and Treatment Selection in Chinese Pediatric Urologists and Pediatric Surgeons. Frontiers in Pediatrics. 2020 Jul 28; 8:423.
69. Blockchain-driven Security Paradigm: A Robust System Harnessing the Internet of Medical Things (IoMT) Network for Enhanced E-Healthcare Monitoring

Tulasi Rajesh Jonnapalli*, N. Deshai, K. Samatha, B.V.D.S. Shekar

Dept of Information Technology, S.R.K.R.E.C.,
Andhra Pradesh, India

Abstract: The Internet of Medical Things (IoMT) has arrived, revolutionizing healthcare, and the fast expansion of the Internet
of Things (IoT) has modified it even further. An essential component of long-term healthcare infrastructure development is the
Internet of Medical Things (IoMT), which streamlines cloud-based patient record tracking. The necessity of protecting people’s
personal health information is becoming more pressing as the Internet of Medical Things (IoMT) develops into a formidable big
data infrastructure. The research introduces a personalized health surveillance tool that focuses on the applications of IoMT in
e-health. The use of blockchain technology allows for safe electronic medical record transfers, which helps with interoperability
problems. There are also worries about the security and transfer of health data across different devices, even if IoMT has a lot
of potential. The study suggests the blockchain-based IoMT Security System (BC-IoMT-SS), which incorporates blockchain
technology into IoMT to improve privacy, security, and the management of patient data. The implemented framework meets
optimal privacy and security standards for IoMT devices. The implemented framework uses the encryption key of the blockchain
to enable verified practitioners to receive secure warnings based on patients’ health data. The simulation findings show that the
BC-IoMT-SS strategy is viable; it outperforms existing methodologies with a 94% accuracy ratio, a 94% efficiency ratio, a 0.63
second reduction in latency, and faster reaction times. This paper presents new findings in e-health monitoring and emphasises
the potential of blockchain technology to enhance security environments for IoMT.
Keywords: Connected health devices, Health care administration, Safety, and Blockchain technology

1. Introduction

By integrating high-tech sensors and Internet of Things (IoT)-enabled devices, the IoMT creates a networked system that enables remote connections between medical equipment and healthcare personnel. This networked healthcare IT system enhances the efficiency of clinical workflows and improves the availability of medical treatment. Medical institutions' lack of expertise in IoT technology, worries about patient privacy and security, and limited resources are some of the obstacles that the IoMT sector must overcome. A developing solution to these challenges is blockchain technology, which uses robust cryptography, decentralisation, and consensus procedures. By securely storing and transferring patient information between healthcare organisations, blockchain technology has the ability to address major flaws in the medical industry. With the rapid expansion of IoMT, the need to protect sensitive data from unauthorised access is growing. Many experts consider blockchain technology a crucial tool for facilitating secure peer-to-peer communication and data sharing. The careful configuration of end devices and the use of blockchain technology within the IoMT network are the primary topics of this article, which offers a fresh perspective on the network. In this study, we propose and validate a state-of-the-art blockchain-based IoMT Security System (BC-IoMT-SS) through practical implementation, while critically reviewing previous techniques.

* Corresponding author: jtulasirajeshphd@gmail.com

DOI: 10.1201/9781003529231-69

2. Related Work

Li et al. (2021) introduce the groundbreaking Internet of Medical Things (IoMT) to enhance patient care, optimize healthcare delivery, and establish a personalized patient experience through the integration of the Internet of Things (IoT) with medical devices. In order to solve the problems with e-healthcare, they suggest new technologies such as software-defined networking (SDN), artificial intelligence (AI), blockchain, and physically unclonable functions (PUF). In order to identify and verify dynamic temporal assaults in IoMT environments, the research presents a new approach that combines smart contracts, machine learning, and K-Nearest Neighbour (KNN). To improve healthcare data availability and scalability, researchers suggest blockchain technology. Ray et al. (2021) introduce a system concept that builds on Bitcoin Internet of Things (IoT) nodes and utilizes simplified payment verification (SPV) processes for online healthcare programs and telemedicine.

Building an e-healthcare system that is trustworthy, open, and interoperable is difficult due to the intricacy of rules such as GDPR and HIPAA. By limiting accurate evaluations and increasing exposure to data breaches, healthcare organisations commonly create segregated silos for patient information. To make electronic health records (EHRs) more secure and private, developers can use blockchain technology, smart contracts, and smart card tactics. Improving efficiency, accuracy, prediction ratios, and assessment predictions should be the goal of future studies that combine blockchain technology with IoMT algorithms. In order to improve the efficacy, privacy, and security of e-healthcare systems, this study points to a potential future research direction.

3. Proposed BC-IoMT-SS

3.1 System Model

Blockchain technology in healthcare breaks down systems into smaller modules that can be used as building blocks for new solutions within the IoMT. Utilising blockchain technology's capabilities to improve the IoMT architecture in healthcare is the main emphasis of the proposed effort. The proposed method incorporates blockchain technology to enable the healthcare infrastructure to be modularized. A decentralized and distributed system can be created by incorporating these modular components into the IoMT architecture with suitable devices. Integrating blockchain technology adds another level of confidence to the healthcare landscape's already massive data influx. A major motivator for investigating blockchain is the possibility that it may meet the growing need for effective healthcare data sharing. Blockchain technology is already undergoing pilot testing in hospital EHR systems, with global plans for more extensive clinical trials in the future. From the initial collection of biometric data all the way through to its storage and display for physician analysis, the four separate layers that make up the suggested design of an IoMT system (Fig. 69.1) cover every step of the data lifecycle. The layers are:
1. The Sensor/Perception Layer: This layer comprises patient biometric sensors that wirelessly transfer records to the subsequent layer via protocols such as Wi-Fi.
2. The Gateway Layer: Because of the limited memory and computing power of IoMT devices, this layer processes their raw data, transferring sensor readings to the cloud after conducting basic AI-based queries, validation, and temporary data storage using smartphones or dedicated access points.
3. The Cloud Layer: Sensors can be remotely monitored and controlled by this layer, which is in charge of data storage, analysis, and secure access. The encryption keys and unique identifiers for every node are generated by the Key Generation Server (KGS).

Fig. 69.1 IoMT system design



4. The Application Layer: This layer makes it easier for doctors and patients to access data, which in turn helps them understand how their health is progressing and which therapy best suits their individual circumstances.

A medical sensor layer is integrated into the IoMT architecture to continuously monitor critical patient data, such as temperature, blood pressure, electrocardiogram, heart rate, and blood sugar. When these values are transmitted to patients' home monitors during an emergency, alerts are activated, enabling real-time monitoring. Keeping patients' medical records on the cloud enables faster data processing, allowing healthcare providers to respond more quickly with interventions. Still, the security flaws of cloud-based frameworks are discussed in detail below, stressing how important it is to incorporate robust security measures into the proposed IoMT blockchain integration.
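The alerting behaviour of this sensor layer can be pictured with a small sketch; the vital-sign ranges below are illustrative assumptions, not values from the paper.

# Illustrative sketch of the sensor-layer alerting described above.
# The vital-sign thresholds are assumptions for demonstration only.
NORMAL_RANGES = {
    "temperature_c": (36.0, 38.0),
    "systolic_bp":   (90, 140),
    "heart_rate":    (50, 110),
    "blood_sugar":   (70, 180),
}

def check_vitals(reading: dict) -> list:
    """Return alert messages for any vital outside its assumed normal range."""
    alerts = []
    for vital, (low, high) in NORMAL_RANGES.items():
        value = reading.get(vital)
        if value is not None and not (low <= value <= high):
            alerts.append(f"ALERT: {vital}={value} outside [{low}, {high}]")
    return alerts

# Example: a gateway-layer node forwarding an abnormal reading.
print(check_vitals({"temperature_c": 39.2, "heart_rate": 118}))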
3.1.1 IoMT Security Concerns

Threats to the system's integrity posed by cybercriminals might result in catastrophic outcomes, including the loss of human lives. It is crucial to consider the vulnerability of patients' lives to hacking assaults during large-scale health situations, such as pandemics. It is more important than ever to strengthen safeguards for vital and perhaps life-saving medical data, given that the IoMT's quick adoption in such situations may worsen current security concerns. In order to fix the security and privacy issues that have always accompanied the IoMT infrastructure, a strong response is required, because the infrastructure is vulnerable to a wide range of attacks. In order to construct any framework on top of the IoMT, stringent confidentiality and safety standards must be met. Integrating cryptographic and non-cryptographic intrusion detection and prevention tools into a holistic security strategy is essential. In order to make the healthcare ecosystem resilient and to strengthen the IoMT against new threats, this multi-pronged approach is necessary. Confidential information on the IoMT blockchain cannot be leaked via this manner of communication. A compilation of symbols and their respective meanings is given in Table 69.1.

In essence, this enhanced perspective on IoMT security not only responds to immediate challenges but proactively positions IoMT systems to thrive in a dynamic and evolving threat landscape. By emphasizing a holistic defence, prioritizing human well-being, and staying abreast of regulatory guidance, the IoMT community can collectively forge a secure and resilient future. Medical professionals deploy various medical devices in the IoMT to aid in healthcare delivery by detecting and responding on behalf of patients, as shown in Fig. 69.2. The blockchain can then be used to send the data from these medical devices.

3.2 Integration of IPFS Node Cluster in IoMT

Here we explore the IPFS Node Cluster's critical function, zeroing in on the finer points of patient identification and hospital instrument certification as they pertain to the IoMT ecosystem as a whole. 1. To ensure the reliability of data stored within the IoMT system, the Integrated Planetary File System (IPFS) Cluster takes centre stage in verification and data integrity. Building a strong basis for data integrity, its nodes thoroughly check the correctness of all data. In particular, IPFS nodes make it possible to synchronise vital data related to authentications and authorizations for medical devices. 2. IoMT Blockchain Network Collaborative Decision-Making: Computers in the IoMT blockchain network work together in harmony thanks to shared ledgers. Collectively, these computers validate the correctness of mapped transactions, make decisions, and help create new blocks. Working together like this further proves that the IoMT blockchain network is decentralised. 3. Tripartite Communication Channels: The proposed system is based on a tripartite communication paradigm, which places utmost importance on communication channels among medical devices, IPFS cluster nodes, smart contracts, and the blockchain network. Seamless interactions within the IoMT ecosystem are made possible by these channels. 4. Medical Equipment Linkage Achieves Two Goals at Once: The connection between medical equipment and IPFS cluster nodes accomplishes two goals at once. First and foremost, it helps build a database that includes all the medical equipment that patients use. Second, before medical devices may share data with the central IoMT blockchain network, it acts as a rigorous validation mechanism to ensure their legitimacy. 5. Protecting Patient Information with Smart Contracts: In the IoMT-Blockchain (BC) network, protecting patient information is dependent on the connection between
IPFS cluster nodes and smart contracts. The coordination of information pertaining to the identification, authorization, and accurate placement of healthcare equipment is primarily the responsibility of this interaction. 6. Securing Data Transmission with Smart Contracts: A new guardian of encrypted data transfer is the incorporation of smart contracts into the blockchain architecture. This channel of communication distributes data into the network after verification and approval from different parties on the IoMT public blockchain, guaranteeing a secure and efficient flow of data. In this expanded discussion, we see how the IPFS Node Cluster is crucial to the IoMT ecosystem because it validates medical devices, ensures data integrity, and allows smart contracts to be seamlessly integrated into the blockchain. In the ever-changing world of healthcare informatics, these components come together to establish a web of safety, privacy, and efficiency.

Fig. 69.2 Smart contracts for the internet of medical things-enabled electronic healthcare

The device authentication procedure works as follows: 1. The defibrillator sends a transaction T5 to the IPFS compute node, encrypting the valid pass with its secret key; the output is T5 = DID_pk(IPFSC_ik(PID, DID, DIP)). 2. Smart contracts on the IPFS cluster nodes authenticate the incoming transaction using the device's public key, DID_PK, recovering IPFSC_ik(PID, DID, DIP) = DID_PK(DID_pk(IPFSC_ik(PID, DID, DIP))). 3. The authenticity of the approved retrieved pass is confirmed by the IPFS network node using its public key, IPFSC_PK. 4. Following smart contract verification of the PID's existence on the IPFS cloud server, the information is distributed throughout the IoMT blockchain; the rules for device authentication are laid out in Algorithm 1. 5. If the PID is not confirmed in the IoMT-BC, the authentication operation will be interrupted with an error; otherwise, the process will continue. The last step is for the smart contract to authenticate the provided ID using the IPFS cluster and the IoMT network. When new agents, such as patients and doctors, join the IoMT-enabled healthcare network, enrolling them is the responsibility of the second algorithm. Smart contracts ensure that the data stored in the IPFS cluster node is accurate and complete, including patients' valid IDs, device IDs, and device public addresses, as well as the mapping between these variables.
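A hedged sketch of this pass-verification flow is given below. An HMAC stands in for the paper's DID key pair and an in-memory set for the IPFS cluster registry, so all names and keys here are hypothetical.

# Hedged sketch of the device-authentication flow above (Algorithm 1 style).
# A real deployment would use asymmetric signatures; here an HMAC stands in
# for the device key pair, and a set stands in for the IPFS registry.
import hmac, hashlib

IPFS_REGISTRY = {("patient-001", "device-42", "10.0.0.7")}  # (PID, DID, DIP)
DEVICE_SECRET = b"device-42-secret-key"  # stand-in for the device signing key

def make_pass(pid, did, dip):
    """Device side: bind (PID, DID, DIP) into an authenticated pass (T5)."""
    payload = f"{pid}|{did}|{dip}".encode()
    tag = hmac.new(DEVICE_SECRET, payload, hashlib.sha256).hexdigest().encode()
    return payload + b"." + tag

def verify_pass(token):
    """Smart-contract side: check the pass, then check the registry mapping."""
    payload, _, tag = token.rpartition(b".")
    expected = hmac.new(DEVICE_SECRET, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        return False  # step 5: abort with an error
    pid, did, dip = payload.decode().split("|")
    return (pid, did, dip) in IPFS_REGISTRY

print(verify_pass(make_pass("patient-001", "device-42", "10.0.0.7")))  # True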

Fig. 69.3 The computational framework BC-IoMT-SS

The application interface reflects the bonds formed between medical professionals and their patients: according to their function, it classifies agents in the IoMT network into two groups. When the designation is '0', the application interface will display the patient's name, age, transaction hash, and the mapping for (PID, DID, DIP). When the designation is set to '1', the IoMT application interface will record the doctor's information. The transaction will be instantly cancelled if an agent does not match the criteria set by the smart contracts.
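A minimal sketch of that designation rule follows; the field names (for example license_id) are hypothetical placeholders, not the paper's schema.

# Toy sketch of the application-interface rule described above:
# designation 0 -> patient view, 1 -> doctor view, anything else -> reject.
def render_agent(designation: int, record: dict) -> dict:
    if designation == 0:     # patient
        keys = ("name", "age", "tx_hash", "pid", "did", "dip")
    elif designation == 1:   # doctor
        keys = ("name", "license_id", "tx_hash")
    else:                    # criteria not met: transaction cancelled
        raise ValueError("transaction cancelled: unknown agent designation")
    return {k: record.get(k) for k in keys}

print(render_agent(0, {"name": "A. Patient", "age": 52, "tx_hash": "0xabc"}))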
The successful anticipation and validation of using BC-IoMT-SS, a multi-agent cooperative, to build an e-healthcare system had not previously been achieved. The research effectively demonstrates multi-agent cooperative supply chain management through the use of CPN-enabled CMPS. With the use of smart contracts in the IoMT domain, the Internet of Things (IoT) architecture enables the e-healthcare industry to manage assets while cutting down on computation, time, and energy usage. By doing away with intermediaries, blockchain technology boosts the efficiency of data validation and decision-making. Every transaction relies on inputs that are deterministic variables, and smart contracts detail who owns what and when. With the help of the IoMT, all of the linked devices may work together as one autonomous system, allowing for better healthcare delivery. Either a transaction-based or an account-based paradigm can be used to put smart contracts into operation. By automating previously manual processes, blockchain technology enhances the execution, processing, and storage of data in IoMT-related services.

Fig. 69.4 Precision ratio

It is important to code for every possible outcome in the smart contract in order to ensure that it does not get stuck in a hung state. The output of a smart contract should be consistent for the same inputs, no matter which node executes it; this property is known as determinism. Everything is kept in sync, since each node can independently verify the transactions and the overall status of the system. It is not possible to undo the execution of a piece of code or a transaction; instead, more code is executed in order to make any required corrections. Since the system is tamper-resistant, this increases the chain's integrity and guarantees the system's integrity. At their core, smart contracts are scripts of immutable, self-verifying code that can carry out automatic, decentralized actions; they boost security, eliminate the need for a trusted third party, and are inexpensive to build. Smart contract ideas underpin the IoMT paradigm for electronic health.
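The determinism property can be illustrated with a toy state machine; this is an assumption for exposition, not the authors' contract code.

# Toy illustration of contract determinism: applying the same transactions
# to the same starting state must always yield the same ending state,
# so every node can verify the ledger independently.
def apply_transaction(state: dict, tx: dict) -> dict:
    """Pure function: output depends only on (state, tx); no randomness or I/O."""
    new_state = dict(state)
    new_state[tx["record_id"]] = tx["value"]
    return new_state

txs = [{"record_id": "bp", "value": 120}, {"record_id": "hr", "value": 72}]
node_a = node_b = {"bp": None, "hr": None}
for tx in txs:
    node_a = apply_transaction(node_a, tx)
    node_b = apply_transaction(node_b, tx)
assert node_a == node_b  # every replica reaches the same state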
3.3 The Mathematical Equation for IoMT in Healthcare

In order to verify compliance in the network, the approach generates a transaction-based polynomial. Assume that a set of transactions is denoted by T = [T1, T2, T3, ..., Tm], and that the transaction hashes are H1(T1), H1(T2), ..., H1(Tm). For the m transactions, we may derive a polynomial f(q) such that f(H1(Tk)) = 0, where k is an integer from 1 to m. Designers can then express the polynomial as

f(q) = (q - H1(T1))(q - H1(T2)) ... (q - H1(Tm))  (1)

The vector U is originally defined as U = {u1, u2, u3, ..., um-1, um}; equivalently, each component can be written as uk = {H1(Tk), (H1(Tk))^2, ..., (H1(Tk))^(m-1), (H1(Tk))^m}. Every time a new structure index is created in the public blockchain, the peers (agencies) check the agreement vector u. If the trust factor among the peers is more than 1, the newly validated systematic methodology of the IoMT blockchain adds the data.
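Equation (1) can be exercised with a short sketch; SHA-256 reduced modulo a prime is an assumed choice of H1, since the paper does not fix the hash function or the field.

# Sketch of the transaction polynomial in Eq. (1): f(q) = product of (q - H1(Tk)).
# Membership of a transaction is checked by f(H1(T)) == 0 (mod p).
import hashlib

P = 2**61 - 1  # assumed prime modulus for the arithmetic

def h1(tx: str) -> int:
    return int(hashlib.sha256(tx.encode()).hexdigest(), 16) % P

def f(q: int, roots: list) -> int:
    acc = 1
    for r in roots:
        acc = acc * (q - r) % P
    return acc

roots = [h1(t) for t in ["T1", "T2", "T3"]]
print(f(h1("T2"), roots) == 0)   # True: T2 is one of the committed transactions
print(f(h1("T9"), roots) == 0)   # False: T9 was never committed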
4. Experimental Analysis

This section delves into the intricacies of the experimentation process, detailing the apparatus utilized and presenting a comprehensive analysis of the obtained outcomes. The experimentation was conducted on an HP EliteBook computer equipped with an Intel Core i5-6300U processor, 4 GB of RAM, and running Windows 10 Pro. Python served as the coding language employed to construct the proposed blockchain system.

Experimental Setup:
• The experimentation framework was anchored on an HP EliteBook computer, ensuring a standardized and reliable environment for conducting the trials.
• Specifications of the computer included an Intel Core i5-6300U processor, 4 GB of RAM, and the Windows 10 Pro operating system, guaranteeing consistent conditions for the experimental trials.
• Python was chosen as the coding language due to its versatility and efficacy in developing the proposed blockchain system.

Research Conclusions:
The BC-IoMT-SS demonstrated commendable performance across multiple dimensions, as elucidated in the ensuing discussion.
1. Precision Ratio and Validation: The BC-IoMT-SS exhibited robust predictive capabilities and validation accuracy, as evidenced by a high precision ratio. This metric underscored the system's ability to accurately predict and authenticate e-healthcare data.
2. Reduced Delay and Enhanced Productivity: Notably, the BC-IoMT-SS showcased a substantially shorter delay, contributing to enhanced system responsiveness. This reduction in delay is pivotal for real-time applications, particularly in the context of healthcare monitoring, where timely data transmission is critical. The system's productivity was further bolstered by this optimization.

Dataset Description:
• The clinical study involved a dataset comprising information from 10 patients, forming the basis for a comprehensive analysis of the BC-IoMT-SS performance.
• The dataset underscored the pivotal role of the Internet of Things (IoT) in healthcare digitization. IoT, as a transformative force, defines the management and monitoring of physical spaces, their contents, and the well-being of inhabitants through interconnected sensors and actuators.

IoT in Healthcare:
• The IoT paradigm elucidated a system of autonomous computing entities capable of seamlessly gathering, transmitting, and sharing data without direct user intervention.
• Illustratively, IoT's impact on the healthcare sector was highlighted, showcasing its ability to connect diverse medical equipment to a centralized server. This connectivity empowers individuals to monitor their health autonomously and engage in remote communication with healthcare professionals.

In summation, the experimentation and analysis underscored the BC-IoMT-SS's prowess in predicting and validating e-healthcare data, with a focus on precision ratio, reduced delay, and heightened productivity. The study was much more comprehensive and applicable because it used a broad dataset and recognized the revolutionary significance of the Internet of Things in healthcare; the importance of such systems cannot be overstated. One promising strategy is to assess existing evidence using a high-performance
machine-learning system. The accuracy of experiments conducted using BC-IoMT-SS is 94% or higher.

In order to monitor patients in real-time, make accurate diagnoses, and administer effective treatments, hospitals rely on sensor technology and electrocardiograms (ECGs). A bounded telemonitoring system enables real-time monitoring of anyone within the medical center, including inpatients and outpatients. During disasters, tags can monitor patients' health issues, which improves their well-being and overall quality of life. A 94.1% success rate in an experiment employing BC-IoMT-SS proved the system's efficiency. While address-based permissions slow down the patient removal procedure in the IoMT network, deployment speeds up the incorporation of agents and smart contracts. By streamlining processes like adding agents, deleting patients, rescinding authorization, deploying contracts, and providing patient-centered medical treatment, the system reduces gas consumption expenses.

Fig. 69.5 Efficiency ratio

Fig. 69.6 Managing the upload timing of transactions (Tx) on IPFS's encrypted storage layer for different sizes

5. Conclusion

This study introduces BC-IoMT-SS, a blockchain-based approach to e-healthcare, addressing the growing demand for enhanced framework networks. The framework secures data management through distributed blockchains on IPFS clusters, ensuring equal service to authorised agents and protecting patient information confidentiality. Improving service quality and processing data from consumer devices are two upcoming goals, along with government integration and software-defined networking.

References

1. B. Godi, S. Viswanadham, A.S. Muttipati, O.P. Samantray, S.R. Gadiraju, E-healthcare monitoring system using IoT with machine learning approaches, in: 2020 International Conference on Computer Science, Engineering and Applications (ICCSEA), IEEE, 2020, March, pp. 1-5.
2. T. Saba, K. Haseeb, I. Ahmed, A. Rehman, Secure and energy-efficient framework using Internet of Medical Things for e-healthcare, J. Infect. Public Health 13 (10) (2020) 1567-1575.
3. G. Nagasubramanian, R.K. Sakthivel, R. Patan, A.H. Gandomi, S. Muthuramalingam, B. Balamurugan, Securing e-health records using keyless signature infrastructure blockchain technology in the cloud, Neural Comput. Appl. 32 (2020) 639-647.
4. B. Swapna, S. Gayathri, M. Kamalahasan, H. Hemasundari, M. SiraasGanth, S. Ranjith, E-healthcare monitoring using internet of things, in: IOP Conference Series: Materials Science and Engineering, vol. 872, IOP Publishing, 2020, June, 012024, 1.
5. S. Kadam, D. Motwani, Blockchain based E-healthcare record system, in: Image Processing and Capsule Networks: ICIPCN 2020, Springer International Publishing, 2021, pp. 366-380.
6. M.M. Khubrani, A framework for blockchain-based smart health system, Turkish J. Comput. Math. Educ. (TURCOMAT) 12 (9) (2021) 2609-2614.
7. Z. Shahbazi, Y.C. Byun, Towards a secure thermal-energy aware routing protocol in wireless body area network based on blockchain technology, Sensors 20 (12) (2020) 3604.
8. H. Liu, R.G. Crespo, O.S. Martinez, Enhancing privacy and data security across healthcare applications using blockchain and distributed ledger concepts, in: Healthcare, vol. 8, MDPI, 2020, July, p. 243, 3.
9. D. Wu, N. Ansari, A cooperative computing strategy for blockchain-secured fog computing, IEEE Internet Things J. 7 (7) (2020) 6603-6609.
10. Z. Ashfaq, A. Rafay, R. Mumtaz, S.M.H. Zaidi, H. Saleem, S.A.R. Zaidi, A. Haque, A review of enabling technologies for internet of medical things (IoMT) ecosystem, Ain Shams Eng. J. 13 (4) (2022), 101660.
11. G. Miao, A.A. Ding, S.S. Wu, Real-time disease prediction with local differential privacy in Internet of Medical Things, arXiv preprint (2022), 2202.03652.
12. M.B. Janjua, A.E. Duranay, H. Arslan, Role of wireless communication in healthcare system to cater disaster situations under 6G vision, Front. Commun. Net. 1 (2020), 610879.
13. A. Lakhan, M.A. Mohammed, M. Elhoseny, M.D. Alshehri, K.H. Abdulkareem, Blockchain multi-objective optimization approach-enabled secure and cost-efficient scheduling for the Internet of Medical Things (IoMT) in fog-cloud system, Soft Comput. 26 (13) (2022) 6429-6442.
14. X. Li, B. Tao, H.N. Dai, M. Imran, D. Wan, D. Li, Is blockchain for internet of medical things a panacea for COVID-19 pandemic? Pervasive Mob. Comput. 75 (2021), 101434.
15. S. Razdan, S. Sharma, Internet of medical things (IoMT): overview, emerging technologies, and case studies, IETE Tech. Rev. 39 (4) (2022) 775-788.
16. Y.D. Al-Otaibi, K-nearest neighbour-based smart contract for internet of medical things security using blockchain, Comput. Electr. Eng. 101 (2022), 108129.
17. A. Abbas, R. Alroobaea, M. Krichen, S. Rubaiee, S. Vimal, F.M. Almansour, Blockchain-assisted secured data management framework for health information analysis based on Internet of Medical Things, Personal Ubiquitous Comput. (2021).
18. P.P. Ray, N. Kumar, D. Dash, BLWN: blockchain-based lightweight simplified payment verification in IoT-assisted e-healthcare, IEEE Syst. J. 15 (1) (2020) 134-145.
19. B. Sharma, R. Halder, J. Singh, Blockchain-based interoperable healthcare using zero-knowledge proofs and proxy re-encryption, in: 2020 International Conference on Communication Systems & Networks (COMSNETS), IEEE, 2020, January, pp. 1-6.
20. A. Farouk, A. Alahmadi, S. Ghose, A. Mashatan, Blockchain platform for industrial healthcare: vision and future opportunities, Comput. Commun. 154 (2020) 223-235.
21. J. Sengupta, S. Ruj, S.D. Bit, A comprehensive survey on attacks, security issues and blockchain solutions for IoT, J. Netw. Comput. Appl. 149 (2020), 102481.
22. D.C. Nguyen, P.N. Pathirana, M. Ding, A. Seneviratne, Blockchain for secure EHRs sharing of mobile cloud based e-health systems, IEEE Access 7 (2019).
23. B.V.D.S. Sekhar et al. (Nov 2022), "Artificial neural network-based secured communication strategy for vehicular ad hoc network", Soft Computing, Springer, Vol. 27, Issue 1, pp. 297-309, https://link.springer.com/article/10.1007/s00500-022-07633-4, ISSN 1432-7643, 1433-7479.
24. B.V.D.S. Sekhar et al. (Oct 2022), "Novel Technique of Threshold Distance-Based Vehicle Tracking System for Woman Safety", Intelligent System Design, Lecture Notes in Networks and Systems 494, INDIA 2022, DOI: 10.1007/978-981-19-4863-3_56, pp. 567-577, https://link.springer.com/chapter/10.1007/978-981-19-4863-3_56.
25. B.V.D.S. Sekhar et al. (Oct 2022), "Sustainable and reliable healthcare automation and digitization using deep learning technologies", Journal of Scientific and Industrial Research (JSIR).
26. B.V.D.S. Sekhar et al. (June 2021), "Real Time Facial Expression Recognition Using OpenCV and Deep Learning", International Journal of Research, Vol. X, Issue VI, ISSN 2236-6124, pp. 7-27.
27. B.V.D.S. Sekhar et al. (August 2021), "The Hybrid Algorithm for Increasing Reversible Data Hiding Scheme for Medical Images", International Journal of All Research Education and Scientific Methods (IJARESM), Vol. 9, Issue 8, ISSN 2455-6211, pp. 2470-2476.
28. B.V.D.S. Sekhar et al. (December 2020), "A Novel Technique for Prediction of Coronary Artery Disease from Human Fundus Images Using Machine Learning Approach", International Journal for Innovative Engineering and Management Research, Vol. 7, Issue 12, ISSN 2456-5083, pp. 69-74.
29. B.V.D.S. Sekhar et al. (December 2020), "Recognition of Human Being Through Handwritten Digits Using Image Processing Techniques and AI", International Journal for Innovative Engineering and Management Research, Vol. 7, Issue 12, ISSN 2456-5083, pp. 69-74.
30. B.V.D.S. Sekhar et al. (February 2020), "A Novel Robotic Aid for Physically Challenged Implemented Using Image Processing", International Journal of Engineering Research and Applications (IJERA), Vol. 10, Issue 2 (Series I), ISSN 2248-9622, pp. 53-57.
31. B.V.D.S. Sekhar et al. (July 2020), "Processing Real World Datasets Using Big Data Hadoop Tools", Journal of Scientific & Industrial Research, Vol. 79(7), pp. 631-635, ISSN 0975-1084, http://nopr.niscair.res.in/handle/123456789/54985.
32. B.V.D.S. Sekhar, PVGD Prasad Reddy, GPS Varma, "Performance of Secure and Robust Watermarking Using Evolutionary Computing Technique", JGIM, Vol. 25, Issue 4, Article 5, October-December 2017, DOI: 10.4018/JGIM.2017100105, pp. 61-79.
33. B.V.D.S. Sekhar et al. (January 2020), "An Experimental Analysis of Secure-Energy Trade-Off Using Optimized Routing Protocol in Modern-Secure-WSN", EAI Endorsed Transactions on Scalable Information Systems, Issue 24, ISSN 2032-9407.
34. B.V.D.S. Sekhar, PVGD Prasad Reddy, GPS Varma, "A Neural Network Model for Detecting Anomalous Traffic Implementing Self-Organizing Maps", International Journal of Computational Intelligence and Health Informatics, 2008, Vol. 1, No. 1, pp. 25-29, ISSN 0973-7413.
35. B.V.D.S. Sekhar, PVGD Prasad Reddy, GPS Varma, "Novel Technique of Image Denoising Using Adaptive Haar Wavelet Transformation", IRECOS, 2015, Vol. 10, No. 10, pp. 1012-1017, ISSN 1828-6003.
36. B.V.D.S. Sekhar, PVGD Prasad Reddy, GPS Varma, "Improved PSNR in Image Denoising Using Modified Median Filter", IEEE Xplore, Feb 2016, Part Number CFP1665W-ART, ISBN 978-4673-7832-1.
37. B.V.D.S. Sekhar, PVGD Prasad Reddy, GPS Varma, "Principal Component Analysis Based Image Denoising Implemented Using LPG and Compared to Wavelet Transform Techniques", IJESRT, DOI: 10.528/Zendo.55803, Vol. 5, No. 6, June 2016, pp. 673-678, ISSN 2277-9655.
38. B.V.D.S. Sekhar et al., "A Novel Technique for Home Automation", IJIRCCE, Vol. 4, Issue 6, June 2016, DOI: 10.15680/IJIRCCE.2016.0406283, pp. 12059-12062, ISSN 2320-9801.
39. B.V.D.S. Sekhar, GPS Varma, et al., "Secure Automotive Locking Control and Anti-Theft Using GPS and Bluetooth", IJIRMF, Vol. 2, Issue 8, Aug 2016, pp. 165-168, ISSN 2455-0620.

70. Estimating Foreign Export Volume Using Machine Learning for Big Data Business Analytics

Yendrapati Geetha*

Assistant Professor,
Dept of Computer Science and Engineering,
Gudlavalleru Engineering College, Gudlavalleru

Abstract: Improved forecasting and strategic decision-making are necessities in today’s cutthroat economic climate for
both importers and exporters. The proposed method, which is based on big data analytics, may help businesses locate fresh
opportunities and revise their strategic decisions. An empirical investigation of the distribution of agricultural commodities in
India over the course of two years validates the proposed strategy. An advanced analytical framework for strategic decision-
making is provided by the proposed study, which is based on Big Data Analytics (BDA). This research delves into the topic of
overseas export volume estimation through the use of machine learning to big data business analytics. Using machine learning
techniques and the values anticipated from a thorough market analysis, this study determines the amount of exports to foreign
nations. These results show that the proposed approach improves strategic market analysis and gives accurate trade estimates.
The experimental results show that using machine learning techniques in a Hadoop context greatly improves the accuracy of
the estimates.
Keywords: Agriculture, Big data analytics, Decision making analysis, Strategic decisions, Machine learning

1. Introduction

Over the years, India's exports have steadily grown as a percentage of global exports and GDP (Olszak, 2016 [11]). By 2023, exports will account for over half of GDP, a near-doubling of the current percentage. In a similar vein, a market survey conducted by Bartus in 2017 indicated that Indian exports of products will nearly triple to 4.9 percent of global exports of goods from 2021 to 2023 [3]. Indian service exports followed a similar pattern, increasing fourfold from 2000 to 2023 and representing almost 5% of all service exports worldwide. While it is widely acknowledged that trade can lead to economic diversification and structural change, newer studies reveal that the type of commodities and services carried and the actors engaged also influence the dynamics of structural change. Having diversity across destinations, commodities, and services is crucial for structural transformation, future growth, and export performance, according to Murat Ozemre, 2020 [9]. The composition of the export bin is determined by Wamba, 2017 [15] by looking at the technical content, quality, sophistication, and complexity of exports, as well as the degree to which a nation's exports are linked to commodities and services that are sold worldwide. On how Indian exports are doing, A. Vidhyalakshmi, 2020 [1] examines the implications of this study's newly documented big data along these dimensions for future export performance, structural change, and growth, while Reihanesh, 2019 [13] provides more detail on these dimensions. The goal of big data analytics, according to Bhattacharya (2016), is to aid in data-driven decision-making by discovering patterns, trends, and correlations in massive amounts of raw data. These processes use modern tools to apply popular statistical analysis methods, like classification and regression, to bigger datasets. According to Corsi

* Corresponding author: geetha.yendrapati223@gmail.com

DOI: 10.1201/9781003529231-70
(2021) [4], there is software and tools for big data analytics that might help us make decisions based on data, which could improve the results of company operations. Possible advantages include more efficient marketing, more consumer personalization, and more efficient operations; with a well-thought-out strategy, a company can outperform its peers and reap these rewards (Janssen M., 2017). Using effective big data analytics, as outlined by Gupta, 2019 [6], one can derive the most insightful conclusions from data's increasing volume, velocity, and diversity. This essay contains five sections. Section 1 introduces the research motivation. Section 2 covers the related approach of using a Perceptron on export product data. In Section 3, we detail the intended task and its technical details. Section 4 presents the experimental results of the proposed investigation. The results are summarized in Section 5.

2. Related Methods

The export quantities to foreign countries are determined using data business analytics.

Framework for Export Products: A Learning Algorithm for Perceptrons: Unstructured big data includes media files such as text, audio, video, and images. Kornelia (2022) found that although such information is richer than structured data, it is difficult to scrape [8]. The structure and function of our product export business system are mimicked by neural networks, which employ technology that aims to generate intelligent behavior. Typically, this system is represented as a weighted directed graph, with neurons serving as nodes and connections between them as edges. The type and strength of the contact between adjacent neurons are indicated by the weights on each edge.

Perceptrons, according to Neelam Tyagi, 2020 [10], are the simplest sort of neural networks. They comprise a single neuron with several real-valued or binary inputs and a binary output. The inputs are multiplied by the weights on the edges that have been assigned to them. At any given moment, the total of all weighted inputs is what the neuron perceives as its net input. In response to a net input that is greater than a certain threshold, the neuron will fire and emit a "1" signal; otherwise, it will emit a "0" signal. Figure 70.1 shows the Perceptron. If all goes according to plan, this export product system's Perceptron will be trained to provide specific results when given specific inputs, as sketched below.
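A compact sketch of this perceptron learning rule, under the stated threshold activation, might look as follows; the AND task at the end is only a toy usage example.

# Minimal sketch of the perceptron rule described above: random initial
# weights, threshold activation, and a simple error-driven weight update.
import random

def train_perceptron(samples, epochs=20, lr=0.1, threshold=0.5):
    n = len(samples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]  # weights chosen at random
    for _ in range(epochs):
        for x, target in samples:                       # target is 0 or 1
            net = sum(wi * xi for wi, xi in zip(w, x))  # net input, Eq. (1)
            out = 1 if net > threshold else 0           # fire when net > threshold
            w = [wi + lr * (target - out) * xi for wi, xi in zip(w, x)]
    return w

# Toy usage: learn a logical AND of two binary inputs.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights = train_perceptron(data)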

Table 70.1 Export product commodity


Commodity  Export Destination  Unit  Quantity (2021-22)  Value (INR) (2021-22)  Quantity (2022-23)  Value (INR) (2022-23)
RICE AUSTRALIA TON 0 0 2921 59883789
RICE ITALY TON 22 807480 0 0
RICE RUSSIA TON 0 0 23 429716
RICE CANADA TON 42 1956141 0 0
RICE CHINA P RP TON 57 3292344 0 0
RICE HONG KONG TON 0 41387 0 0
RICE IRAN TON 0 0 14 531101
RICE ITALY TON 37 2396665 44 3617048
RICE JAPAN TON 4 435883 6 1441172
RICE KOREARP TON 14 493924 0 0
RICE NETHERLAND TON 2 250705 0 0
RICE SAUDI ARAB TON 460 24498500 0 0
RICE SINGAPORE TON 72 3108150 0 0
RICE SPAIN TON 206 10448214 241 12811673
RICE SWITZERLAND TON 2 101115 0 0
RICE THAILAND TON 128 5570989 216 1199015
RICE TURKEY TON 0 0 0 5707
RICE U ARAB EMTS TON 0 0 24 1199015
RICE UK TON 0 0 0 7139
RICE USA TON 36 2181872 148 7745341
RICE VIETNAM SOC REP TON 0 0 5 406179
Estimating Foreign Export Volume Using Machine Learning for Big Data Business Analytics 473

Fig. 70.1 The Perceptron is trained so that the export product system responds to certain inputs with certain desired outputs

At first, the Perceptron's weights are chosen at random. Afterwards, the Perceptron is fed a sequence of inputs of the form (x1, x2, x3, ..., xn) during the training phase. A zero or one is the intended outcome for every one of these inputs. It is the net input that determines the actual output:

Net input = w1x1 + w2x2 + ... + wnxn  (1)

3. Proposed Work

The study suggests a big data analytics-based strategy for improving forecasting and strategic decision-making in the competitive economic environment. It uses real data from a two-year study on agricultural commodity distribution in India. The strategy uses machine learning to estimate foreign export volume, demonstrating its effectiveness in strengthening market analysis and providing precise trade estimates. Deepak Kumar Sharma (2022) [5] explores using neural networks trained in big data analysis for exporting goods in Hadoop environments, utilizing data parallelism and model parallelism for processing. The Perceptron is trained so that the export product system responds to certain inputs with certain desired outputs. The proposed work comprises:
• A compression approach to reduce the size of large data using a model compression approach.
• A feature approach for the back propagation algorithm in the Hadoop environment.

Outsized Product Data Model Compression Approach: With 100B to 1T parameters, machine learning models can now utilize memory anywhere from 400 GB to 4 TB. Researchers have devised five primary methods with the goal of decreasing model size without sacrificing performance. Removing irrelevant nodes from a network is known as "model pruning." This process is typically evaluated using the gradient or second-order derivative of the loss function. The inference speed is improved via structured pruning, which eliminates entire neurons, layers, or filters. Reducing parameter precision by switching from float to integer is what model quantization is all about, and it leads to a compression ratio of 4X. The model may need to be fine-tuned with further training data if it deviates from its convergence point, which can happen during this process. Bypassing this stage, post-training quantization makes room for more weight adjustments. The spatial complexity of neural network weight matrices can be reduced through the use of low-rank decomposition. Using response-based, feature-based, or relation-based distillation approaches, knowledge distillation transfers information from one model to another, usually from a larger one to a smaller one. In the LLM approach of using empirical results to produce a more efficient architecture, lightweight model design is a commonly utilized method (Deepak Kumar Sharma, 2022) [5].

Products Export Process Using Back Propagation Algorithm: The next step is to train the neural networks that export products in the Hadoop environment, using the methods of Parth Bhasin (2017) to disperse computations in back propagation [12]. Two common approaches to dividing up the work are data parallelism and model parallelism. Centralized synchronous data parallelism proceeds as follows:
1. The parameter server's ground truth is used to determine product weights. The weights are replicated among multiple processes running on different hardware, like Hadoop, either inside the same system or on separate machines.
2. Each duplicate model is assigned its own export-data mini-batch and executes the forward and backward passes, where the gradients are generated, separately.
3. Gradients are averaged after being sent to the parameter server. The weights are updated via gradient descent and sent to all the worker nodes.
In this process, the term "centralized" refers to the mean of the gradients. Averaging the obtained model weights is the next step in the procedure's "decentralized" iteration:
1. A master process distributes the weights of the process model.
2. Different data mini-batches allow for several iterations of each procedure's forward and backward passes. At this point, the weights of the various processes are very different from one another.
3. Once the master process receives all of the weights, they are averaged and broadcast back to all of the worker nodes.
Although the decentralized method may be slightly faster due to reduced machine-to-machine communication, it executes the back propagation process poorly. These operations are synchronous, since we have to wait for every worker to do their job. In contrast, Shivani (2021) does not average

Fig. 70.2 Large data model compression approach
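As one hedged example of the compression step above, post-training int8 quantization of a weight matrix can be sketched with NumPy; the scaling scheme is an assumption, and a real pipeline would typically quantize per layer.

# Sketch of post-training quantization: mapping float32 weights to int8
# gives the roughly 4X compression ratio the text cites.
import numpy as np

def quantize_int8(w):
    """Affine float32 -> int8 quantization of a weight matrix."""
    scale = max(float(np.abs(w).max()), 1e-12) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
print(w.nbytes / q.nbytes)                        # 4.0: the 4X compression ratio
print(float(np.abs(w - dequantize(q, s)).max()))  # small rounding error remains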

Fig. 70.3 Export large data process using distributed back propagation approach
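The centralized synchronous steps above can be condensed into a sketch of the parameter-server update; the least-squares gradient here is only a stand-in objective, not the paper's model.

# Sketch of centralized synchronous data parallelism: each worker computes
# gradients on its own mini-batch; the parameter server averages them and
# applies one gradient-descent update shared by all replicas.
import numpy as np

def worker_gradient(w, X, y):
    """Least-squares gradient on one worker's mini-batch of export data."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def parameter_server_step(w, minibatches, lr=0.01):
    grads = [worker_gradient(w, X, y) for X, y in minibatches]  # run in parallel
    return w - lr * np.mean(grads, axis=0)   # "centralized": mean of gradients

rng = np.random.default_rng(0)
w = np.zeros(3)
batches = [(rng.normal(size=(8, 3)), rng.normal(size=8)) for _ in range(4)]
for _ in range(100):
    w = parameter_server_step(w, batches)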
Estimating Foreign Export Volume Using Machine Learning for Big Data Business Analytics 475

Fig. 70.4 Untested export product data

weights or gradients when processes execute asynchronously [14]; that is the only difference.

4. Experimental Result

Assessing a machine-learning model's correctness is a crucial phase in the procedure. With only 4,800 records of training export product data, a model cannot produce accurate predictions for untested export product data. The Distributed Back Propagation algorithm in the Hadoop environment was compared on accuracy, sensitivity, specificity, precision, and F1-score, following the research metrics of Zhang, 2018 [16].

Accuracy: The accuracy of a model is a metric that gauges its performance across all classes, particularly when all classes are equally essential.

Sensitivity: Sensitivity, also known as the true positive rate or recall, is a measure of the model's ability to recognize positive examples.

Specificity: The specificity of a model is its ability to accurately predict true negatives in each accessible category.

Precision: The precision of a test is determined by dividing the number of correctly identified positive samples by the total number of samples predicted as positive.

F1-Score: The F1-score is a statistical tool that compares the performance of two classifiers by calculating the harmonic mean of their precision and recall.

The graphs display the accuracy metrics (sensitivity, specificity, precision, and F1-score), with Distributed Back Propagation achieving the most favorable results.
Table 70.2 Graph for comparing algorithm
Algorithm  Accuracy  Sensitivity  Specificity  Precision  F1-score
Back Propagation 0.96 0.97 0.95 0.95 0.97

Fig. 70.5 Graph for comparing algorithm



5. Conclusion

The article describes how to train neural networks in a Hadoop environment using big data analysis so that product information can be exported, and why and how the distributed nodes are put to use in the back propagation method. Two common methods for dividing up tasks between workers are data parallelism and model parallelism. A trained Perceptron system is able to produce the desired results when given the right inputs. The massive volume of data is reduced by combining the model with a compression method, and a Hadoop-compatible feature approach for the back propagation algorithm is applied.

References

1. A. Vidhyalakshmi, C. Priya: Medical Big Data mining and processing in e-health care, An Industrial IoT Approach for Pharmaceutical Industry Growth, Science Direct, 2020.
2. Bhattacharya M, Islam R, Abawajy J: Evolutionary optimization: a Big data perspective, Journal of Network and Computer Applications, 59:416-26, 2016.
3. Bartuś K, Batko K, Lorek P: Business intelligence systems: barriers during implementation. In: Jabłoński M, editor, Strategic performance management: new concept and contemporary trends. New York: Nova Science Publishers; 2017. p. 299-327. ISBN: 978-1-53612-681-5.
4. Corsi A, de Souza FF, Pagani RN, et al.: Big data analytics as a tool for fighting pandemics: a systematic review of literature, Journal of Ambient Intelligence and Humanized Computing, 12:9163-80, 2021, https://doi.org/10.1007/s12652-020-02617-4.
5. Deepak Kumar Sharma, Suchitra Vavilala: Deep learning applications for disease diagnosis in Data Science, Science Direct, 2022.
6. Gupta V, Singh VK, Ghose U, Mukhija P: A quantitative and text-based characterization of big data research, Journal of Intelligent & Fuzzy Systems, 36:4659-75, 2019.
7. Janssen M, van der Voort H, Wahyudi A: Factors influencing big data decision-making quality, Journal of Business Research, 70:338-45, 2017.
8. Kornelia Batko, Andrzej Ślęzak: The use of Big Data Analytics in healthcare, Journal of Big Data, SpringerOpen, 3, 2022.
9. Murat Özemre, Ozgur Kabadurmus: A big data analytics based methodology for strategic decision-making, Journal of Enterprise Information Management, ISSN: 1741-0398, Emerald Insight, 2020.
10. Neelam Tyagi: Understanding the Perceptron Model in Neural Networks, Analytics Steps, Jan 27, 2020.
11. Olszak CM: Toward better understanding and use of business intelligence in organizations. Information Systems Management, 33(2):105-23, 2016.
12. Parth Bhasin, Vaishali: Back Propagation Algorithm: An Artificial Neural Network Approach, International Journal of Engineering Research & Technology (IJERT), ISSN: 2278-0181, Volume 5, Issue 10, 2017.
13. Reihanesh H. Hariri, Erik M. Fredericks, Kate M. Bowers: Uncertainty in big data analytics: survey, opportunities, and challenges, Journal of Big Data, Springer Nature, 44, 2019.
14. Shivani Kuninti, Rooban S: Back propagation Algorithm and its Hardware Implementations: A Review, Journal of Physics: Conference Series 1804, 012169, IOP Publishing, doi:10.1088/1742-6596/1804/1/012169, 2021.
15. Wamba SF, Gunasekaran A, Akter S, Ji-fan RS, Dubey R, Childe SJ: Big data analytics and firm performance: effects of dynamic capabilities, Journal of Business Research, 70:356-65, 2017.
16. Zhang Q, Yang LT, Chen Z, Li P: A survey on deep learning for big data analysis, Information Fusion, 42:146-57, 2018.

71. Unmasking Deceit: Pioneering Deep Learning Hybrids to Expose Fabricated Reviews in the Digital Realm

N. Deshai* and B. Bhaskara Rao


Dept of C.S.E., GITAM School of Technology, Visakhapatnam, 530045
Andhra Pradesh, India

Abstract: These days, consumers rely heavily on internet reviews to gauge other shoppers’ opinions before making a purchase,
which in turn shapes the online marketplace. Although real evaluations are valuable, the problem of fraudulent reviews is a new
dimension that could lead customers astray. Accurately recognising fraudulent reviews, especially within the prolific Amazon
dataset, is an urgent challenge that this paper addresses. This research delves into the complex terrain of the e-commerce
business by introducing two unique deep-learning hybrid models, the most notable of which is BERT-CNN. Hybrid models
that use state-of-the-art word embedding techniques like Glove and One-Hot Encoding demonstrate exceptional performance.
The results of the experiments prove that BERT-CNN is effective; it achieved an outstanding 98.7 percent accuracy rate. This
research not only adds new knowledge to the field of false review detection, but it also demonstrates how effective and useful
BERT-CNN is for improving the accuracy and efficiency of this critical work.
Keywords: BERT-CNN, GloVe, One-hot encoding

1. Introduction

Reviews posted online help us navigate the enormous world of products and services in today's ever-changing digital ecosystem, and they have an outsized impact on our daily decisions. Reviews and ratings play a significant role in determining a business's success or failure in the online marketplace. The power of a single review to influence sales is immense, especially when customers are relying on these testimonies more and more to make their decisions. A major obstacle, however, is the widespread problem of fraudulent online reviews; estimates range from 16.1% to 33.3%. Online reviews and ratings now carry weight in the decision-making process comparable to price and seller reputation. Consumer reviews and ratings posted on sites like Amazon, TripAdvisor, Yelp, Google Play, and IMDB have a significant impact on product selection. The transition to digital commerce has magnified the importance of word-of-mouth referrals. Importantly, consumers tend to favour highly rated items and services, frequently choosing them over competitors despite large price differences. Numbered ratings have a powerful effect, boosting customer confidence and allowing for faster decision-making. The prevalence of bogus high ratings further complicates these systems and challenges their trustworthiness. A watershed moment in our understanding of consumer behaviour and preferences came with the advent of star-rating systems in 2013. Our mission is to understand how people use product reviews so they may better navigate the vast diversity of options in the online marketplace. Our primary goal is to provide consumers with trustworthy review and rating systems so they can make educated purchases. There are noticeable limits, even though there have been substantial efforts to detect bogus reviewers. Primarily, current studies mostly focus on creating new behavioural traits, which requires expensive human effort and knowledge, even though utilising these qualities is vital. Furthermore, researchers have investigated word embeddings, part-of-speech n-grams, and n-grams as potential text features

*Corresponding author: desaij4@gmail.com

DOI: 10.1201/9781003529231-71

to improve detection. When reviews use casual language or obscured words, however, these text elements may reduce detection accuracy. If the bag-of-words assumption is applied, which relies on word frequency, the review feature vectors may become sparse. Linguistic features, like POS n-grams, might not be able to tell the difference between a real reviewer and a highly skilled imposter. Word2Vec and similar word embedding algorithms also have trouble capturing all possible semantic meanings, which could affect how well they recognise reviews in different contexts.

2. Related Work

Amazon, Yelp, Google Play Store, TripAdvisor, and social media are just a few of the modern venues where customer evaluations and ratings play a crucial role in influencing purchasing decisions. The recent upsurge in fraudulent reviews threatens both businesses' success and customers' trust. Text reviews influence 93% of buyer decisions and command a large 43% market share, with digital buyers estimated to reach 2.14 billion. A critical area of research has emerged: the detection of bogus reviews. Deep learning techniques in particular have demonstrated great promise in this area, while machine learning tools such as support vector machines (SVMs) and neural networks (NNs) have achieved remarkable success. Binary classification, which involves telling the difference between real and bogus reviews, is still quite difficult. When it comes to identifying French online reviews, hybrid models like LSTM+CNN with CamemBERT accomplish remarkable results, attaining an accuracy of 93.7%. The accuracy rate on the TweepFake dataset is 89.7 percent when using DNN, CNN, GRN, and HAN, and combining CNN with BiLSTM achieved the best accuracy of 90.66%. On Arabic datasets, CNN-LSTM attains the best accuracy of 86.88%, showing that deep learning models often beat conventional machine learning. Online reviews have a significant effect on e-commerce earnings, so the growing problem of fraudulent reviews needs a solution. Advancements in identifying automated fake evaluations have been substantial, highlighting the necessity for strong techniques to safeguard online feedback platforms.
2.1 Proposed Deep Learning Framework

In the contemporary landscape, the vast realm of online reviews often harbors noise in the form of hyperlinks, HTML tags, and unofficial comments. Many words within these reviews lack significant impact on the overall sentiment expressed. To distill more meaningful insights, we adopt a minimalist approach to text preprocessing, leveraging standard Python libraries. This process involves eliminating capitalizations, stop words, and punctuation, as illustrated in Fig. 71.2. The transformation of text into a suitable format is crucial, enabling deep learning methods to accurately comprehend, leverage, and effectively classify each online review.

Fig. 71.1 Proposed deep learning hybrid methodology
Fig. 71.2 Proposed deep learning evaluation
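As an aside, the preprocessing step described above can be sketched in a few lines of standard Python. This is a minimal illustration rather than the chapter's actual code; the NLTK stop-word list is an assumed choice of library:

import re
import string

from nltk.corpus import stopwords  # assumed source; requires nltk.download("stopwords")

STOP_WORDS = set(stopwords.words("english"))

def clean_review(text: str) -> str:
    """Lowercase, strip HTML tags/links/punctuation, and drop stop words."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)           # remove HTML tags
    text = re.sub(r"http\S+|www\.\S+", " ", text)  # remove hyperlinks
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)

print(clean_review("Great product!!! <br> Visit http://spam.example NOW"))
# -> "great product visit"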

2.2 Feature Extraction

N-grams: These days, n-grams are all the rage when it comes to text classification jobs. They encompass all sorts of permutations of letters, numbers, and symbols. Numerous tokenization techniques are available, such as whitespace, expression, and unigram, which function at the phrase, comment, or character level. For example, a bigram or trigram uses sequences of several tokens, but a unigram only uses one token. Normalisation: Normalisation, which reduces the number of unique tokens and eliminates text variances, is a crucial strategy. Stemming and lemmatization remove superfluous data from content during the cleaning process. A lexeme is a basic form in the fields of linguistics and NLP. Stemming: Though stemming's primary function is to reduce words to their simplest forms, it has the potential to introduce ambiguity. The Porter stemmer is a good example of how stemming may still accurately infer the root when spelling mistakes are present. Lemmatization: Lemmatization provides a more refined technique than stemming, which has limitations in normalisation and leads to the formation of useless terms. Lemmatization, which reduces words to their base form, is preferable to stemming because it produces a truer portrayal of meaningful phrases that conform to linguistic standards.

Vectorization: In the complex dance of language and machine, vectorization is the choreography of text-to-numerical vector translation. By combining the linguistic depth and computational power of algorithms, this procedure helps to isolate unique traits for the model's training. Count Vectorizer, TF, and IDF: The numerical substance is extracted via these text-transformation pillars. Term Frequency (TF) measures the frequency of words in a sample, whereas Count Vectorizer counts the occurrences of words. To make it more precise, Inverse Document Frequency (IDF) takes into account how often the term appears in the whole dataset. Each term's relevance is represented in a nuanced way by the TF-IDF equation. Word2Vec: Word2Vec uncovers hidden semantic linkages; it's a shining star in modern NLP. It can predict word vectors either alone or in context, using either the CBOW or the continuous skip-gram approach. Unlike skip-gram, which predicts context vectors from the centre word, CBOW predicts a word's vector based on neighbouring context. One-Hot Encoding: By converting categorical variables into expressive binary vectors, One-Hot Encoding becomes a deep-learning mainstay for sequential classification. By quantifying the core of sequential linkages, this representation improves prediction accuracy. GloVe: GloVe is a global vector representation of words. A newer member of the vectorization ensemble, GloVe creates word vectors by combining global and local statistics. By de-emphasising commonly used word pairs, GloVe manages to strike a balance between global context and local subtleties. Deciphering word co-occurrences throughout the entire corpus is a breeze for this unsupervised virtuoso.
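For illustration, the count-based and TF-IDF vectorizers described above are available in scikit-learn; the following sketch, with an invented three-review corpus, shows the basic usage:

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "great phone works perfectly",
    "terrible phone broke quickly",
    "great value works great",
]

# Raw term counts: one row per review, one column per vocabulary term
counts = CountVectorizer().fit_transform(corpus)

# TF-IDF: term frequency reweighted by inverse document frequency,
# so terms common to every review (e.g. "phone") are down-weighted
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(corpus)

print(sorted(tfidf.vocabulary_))   # learned vocabulary
print(X.shape)                     # (3 reviews, vocabulary size)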
A Groundbreaking Comparison: The rise of fraudulent internet reviews is a complex modern problem with many moving parts. In this investigation, we compare and contrast the outcomes of various machine learning and deep learning models with the BERT-CNN-LSTM hybrid model, which is powered by deep learning. Data Transformation: Fundamental to this study is the transformation narrative, in which we reveal the techniques (GloVe and one-hot encoding) orchestrated to convert textual subtleties into numerical vectors. Each technique shapes the data in its own way.

2.3 TF-IDF vs. Word2Vec Symphony

As the story progresses, we see Word2Vec's multidimensional representation pitted against TF-IDF's solitary vector. In contrast to Word2Vec's orchestra of N-dimensional vectors, which provides a more comprehensive representation, TF-IDF is a soloist in its simplicity. Pre-trained Word Embeddings: GloVe leads the elaborate waltz with BERT-CNN in the ballroom of pre-trained word embeddings. With its extensive training on a large corpus, GloVe proves to be a reliable partner, demonstrating its capabilities in conjunction with BERT-CNN to uncover the mysteries of identifying fraudulent reviews. BERT-CNN, a Hybrid Model: Here we present the hybrid model BERT-CNN, the main attraction of our work. We can expect better accuracy and faster performance in bogus review prediction from this powerful mix of a convolutional neural network (CNN) for local feature mining and bidirectional encoder representations from transformers (BERT) for contextual dependency extraction. To find and stop the spread of fake reviews on the internet, the BERT-CNN model skillfully combines the best features of BERT (Bidirectional Encoder Representations from Transformers) and CNN (Convolutional Neural Network) in a hybrid architecture. The unique features of this model become clear when we disassemble it:

1. BERT: Contextual Understanding: Taking into account both the left and right word contexts, the powerful pre-trained language model BERT does an excellent job of contextual understanding. Its bidirectional methodology enables a sophisticated understanding of complex word associations. After being pre-trained on large corpora, BERT learns rich contextual representations; through transfer learning, the model can be fine-tuned for downstream tasks with reduced labelled-data requirements.

2. CNN: Local Feature Extraction: One area where CNN really shines is extracting local features from input data. When applied to text, CNN is a master at extracting characteristics and patterns from localised areas of a sequence; this is especially helpful for finding sentence or paragraph structures.

3. Feature Fusion in the BERT-CNN Hybrid Model: The hybrid model achieves remarkable results by combining the best features of BERT and CNN. CNN creates a comprehensive picture of input text by concentrating on extracting local features or patterns, whereas BERT offers deep contextual comprehension. Combining BERT's contextual embeddings with CNN's local features, the model incorporates insights from both components. Because of this integration, the model can pick up on a wide range of language patterns and subtleties.

with CNN’s local features, the model incorporates insights References


from both BERT and CNN. Because of this integration, the
model can pick up on a wide range of language patterns and 1. N.Deshai A Detection of Unfairness Online Reviews Using
subtleties. Deep Learning, JATIT, Volume 100, 13 (pp.4738-4779).
2. N.Deshai, Unmasking deception: a CNN and adaptive
Fake Review Detection: Finding phoney reviews on the web
PSO approach to detecting fake online reviews, 2023,Soft
is the main goal. The algorithm deftly detects linguistic Computing, 1-22.
clues, trends, and anomalies that indicate phoney reviews by 3. N.Deshai, Transparency in healthcare and e-commerce:
utilising BERT’s contextual comprehension and CNN’s local detecting online fake reviews using a dense neural network
feature extraction. model with relevance mapping,2023,Soft Computing, 27,
Adjustment and Instruction: After being pre-trained on a 14(pp.9861-9875).
4. N.Deshai, Deep Learning hybrid approaches to detect fake
large dataset using BERT, the model is usually fine-tuned on
reviews ans ratings, 2022, JSIR, 82(1) (pp.120-127)
a labelled dataset that is dedicated to detecting false reviews.
5. Wiens, J. A. 2005. Avian community ecology: An iconoclastic
The convolutional neural network (CNN) part learns the view. In Perspectives in ornithology, ed. A. H. Brush, and G.
nuances of the domain of interest while fine-tuning. A. Clark, 355–403. Cambridge: Cambridge Univ. Press.
Sensitivity to Context: Sensitivity to context in both directions 6. Terborgh, J. 2009. Preservation of natural diversity.
enables BERT to fine-tune its perception of sentence BioScience. 24:715-22.
meanings and relationships. 7. Alotaibi, A. R. and Mishra, A. V. (2015). Global and regional
volatility spillovers to GCC stock markets. Int. Economic.
Detecting Subtle Linguistic Indicators of Fake Reviews: Modelling. 45(3):38–49.
Convolutional Neural Networks (CNNs) are great at 8. Akhtaruzzaman, M., Boubaker, S., and Sensoy, A. (2021).
recognising local patterns and characteristics. The BERT­ Financial contagion during COVID–19 crisis. Int. Financ.
CNN model effectively tackles the complex problem of Res. Lett. 38(2):101604-101609.
identifying false internet reviews by combining BERT’s 9. Testa, B. and L. B. Kier. 2013. Emergence and dissolvence in
contextual awareness with CNN’s local feature extraction in the self-organisation of complex systems. Entropy 2, no. 1:
a novel way. 1-25. http://www.mdpi.org/entropy/papers/e2010001.pdf.
3. Conclusion

In this study, we presented BERT-CNN, a novel hybrid model for identifying fraudulent reviews posted online. In order to prepare the data for analysis, we used one-hot encoding and GloVe and strategically padded each input matrix to a consistent size. The results showed that our suggested method outperformed the state-of-the-art models both during training on the dataset and, later, when tested on the same dataset. The BERT-CNN hybrid model stands out since it outperformed other approaches with an impressive performance of 98.07% when paired with GloVe. In the domain of online review detection and classification tasks, these data highlight the efficacy of our proposed technique.
Note: All the figures in this chapter were designed by the author.

YOLO CNN Approach for Object Detection


72

Aluri Dev Ananth1


Department of CSE, SRM University-AP, Amaravati, India
Abhiram Seemakurthi2
Department of CSE, SRM University-AP, Amaravati, India
Sasank Tumma3
Department of CSE, SRM University-AP, Amaravati, India
Prasanthi Boyapati4
Department of CSE, SRM University-AP, Amaravati, India

Abstract: Among the most rapidly developing areas in computer vision is object detection. Mask detection is the main objective of this effort. With the use of deep learning and computer vision techniques, this project offers a reliable method for mask identification implemented using the RESNET architecture. Identifying faces and differentiating between people wearing masks and those without is the main goal. The model is refined via transfer learning on a customized dataset that includes annotated photos of masked, incorrectly masked, and unmasked faces.
Keywords: Computer vision, Object detection, You only look once, Transfer learning

1. Introduction

Although everyone has access to this amazing tool, the human eye's capacity to perceive, recognize, and precisely locate objects is frequently disregarded. Regrettably, people frequently undervalue the role that eyes play in helping them comprehend their surroundings. The field of computer vision, which focuses on obtaining useful data from photos and movies, is founded on these same principles. Computer vision addresses tasks that mirror the amazing powers of the human eye, including object identification, image recognition, and image super-resolution.

Object detection is one of computer vision's most widely used applications. The practice of accurately locating items in an image or scene by using a computer or software system is known as object detection. Object detection is framed as a regression towards spatially separated bounding boxes and their associated class probabilities. Bounding boxes plus class probabilities can be directly predicted from the whole image by a single neural network model with a single evaluation. It is possible to directly enhance detection accuracy end to end, since the detection process is a single network. The unified architecture operates at a very high speed. The YOLO-CNN technique is substantially less likely to generate false positives in unoccupied locations, despite having higher localization errors than other detection systems. Moreover, the YOLO-CNN approach picks up extremely generic object representations. Single-shot object detection is the strategy used by the YOLO-CNN method to address the object detection problem.

1devananth_aluri@srmap.edu.in, 2abhiram_seemakurthi@srmap.edu.in, 3sasank_tumma@srmap.edu.in, 4prasanthi.b@srmap.edu.in

DOI: 10.1201/9781003529231-72

Fig. 72.1 One stage object detection

When predicting the presence and placement of the required classes/objects in an image, detection only goes through the input image once. By processing the full image in one pass, computational efficiency is increased.

To examine the predictive abilities of various object recognition models, the YOLO-CNN approach necessitates a consistent quantitative evaluation. Average Precision (AP) and Intersection-over-Union (IoU) are two of the most popular metrics for assessment.

The YOLO-CNN model uses an end-to-end neural network to simultaneously forecast bounding boxes and class probabilities. This is not the same as the prior object detection algorithms' approach, which recycles classifiers for detection purposes. By using an entirely new way of doing object detection and attaining cutting-edge outcomes, the YOLO-CNN method surpassed existing real-time object detection systems. Instead of using distinct regions to identify possible regions of interest like some algorithms do, such as Faster R-CNN, YOLO-CNN employs a regional recommendation network and a single, fully connected stage to generate all of the layer's predictions. While a technique utilizing a regional recommendation network processes the same image numerous times, the YOLO-CNN method accesses it only once.

2. Methodology

YOLO-CNN employs a basic network known as a deep convolutional neural network to detect objects within the given image. The structure of the CNN model, essential to the algorithm, is depicted in Fig. 72.3. As seen in Fig. 72.4, the initial 20 convolution layers of the model have been trained using RESNET with temporal mean pooling and fully connected layers. Next, the pre-trained model is adjusted to enable detection, because prior studies have demonstrated that the addition of convolutional and connected layers improves the performance of pre-trained networks (see Fig. 72.2). In the diagram depicted in Fig. 72.1, the algorithm's final connection layer anticipates the dimensions of the bounding box and the probabilities related to the various classes.

Fig. 72.2 Transfer learning

Fig. 72.3 YOLO-CNN algorithm



The system predicts various bounding boxes for each grid cell. During training, each object should have a single bounding box predictor. Depending on which prediction has the highest current IoU with the ground truth, the algorithm determines which predictor is responsible for forecasting the item. This results in a more specialized bounding box generator. The total recall rises in proportion to how well each predictor forecasts an object's size, aspect ratio, or class.

Non-maximal suppression (NMS) is a crucial method in algorithmic models, playing a significant role in improving the precision and efficacy of object detection. This postprocessing technique proves particularly useful in scenarios where a single object in an image leads to the creation of multiple bounding boxes. Despite variations in their overlap or arrangement, all these bounding boxes convey the same information. By applying NMS to each object in the image, unnecessary or inaccurate bounding boxes can be identified and eliminated, ensuring that only a single bounding box per object remains.
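As a concrete illustration of the NMS procedure just described, here is a generic sketch (not the project's actual code) of IoU-based suppression:

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring box, drop boxes that overlap it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep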

2.1 Data Preparation

The dataset includes photos as well as XML-formatted annotations. To process the data, the code makes use of libraries like NumPy, Pandas, Matplotlib, and PIL. It reads and opens the contents of an XML annotation file, and a few sample photos from the dataset are opened and shown. The bounding box coordinates are then converted from XML format to You Only Look Once format and the other way around using a method defined in the code. Additionally, a list of classes associated with mask detection is defined. The bounding box coordinates and object labels are extracted from the XML annotation file, iterated through, and then formatted into the You Only Look Once format. These formatted labels need to be saved to a text document before they can be used. Lastly, the code arranges labels and images into the appropriate folders and builds a data directory structure for training, test, and validation data.

2.2 Configuration File Generation


Paths to training, testing, and validation datasets are specified
in YAML configuration files that are created. Define the class
names and the total number of classes as well.
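Such a configuration file might be produced with a short helper like the one below; the paths and class names are assumptions that follow the mask-detection setup described in this chapter:

import yaml  # pyyaml

config = {
    "train": "data/train/images",
    "val": "data/val/images",
    "test": "data/test/images",
    "nc": 3,  # number of classes
    "names": ["with_mask", "without_mask", "mask_worn_incorrectly"],
}

# Write the YAML configuration consumed by the training step
with open("maskdet.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)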
Fig. 72.4 RESNET algorithm

Figure 72.3 illustrates how the YOLO-CNN algorithm divides an input image into an S*S grid. If an object's center is inside a grid cell, that grid cell is in charge of detecting it. Each grid cell contains a prediction for the bounding box plus a confidence for each frame. These confidence levels show how confident the model is that the object is in the box and how accurate it thinks the predicted box is.

2.3 Training the maskDet Model

This code trains a single-shot detection model using the Ultralytics package. The YOLOv8m model is loaded, and the configuration file, epoch count, and other training parameters are specified. The model's performance indicators are recorded once the training process has completed a predetermined number of epochs, and the optimal model's weights are preserved.
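With the Ultralytics package, the training step described here can be sketched as follows (the epoch count, image size, and file names are illustrative):

from ultralytics import YOLO

# Load pre-trained YOLOv8m weights, then fine-tune on the mask dataset
model = YOLO("yolov8m.pt")
results = model.train(data="maskdet.yaml", epochs=50, imgsz=640)

# Evaluate on the validation split; the trainer saves the best weights itself
metrics = model.val()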

Fig. 72.5 Training graph
2.4 Model Evaluation
Using a test data set, metrics are calculated to evaluate the
trained model’s performance. The results are visualized with
the bounding box that was detected.

2.5 Inference with the Trained Model


The code does inference on fresh photos using the learned
model. To forecast the position and kind of items in the picture,
load the ideal weights into the model. The identification of
the bounding box is included in the results that are shown.
Fig. 72.6 Validation graph
Since the dataset annotations are only in XML format, they must be converted to the object class's x-y width-height dimensions. The following formulas might be utilized to attain this goal:
xMid = (bb[1] + bb[3])/(2 × w) (1)
yMid = (bb[2] + bb[0])/(2 × h) (2)
w = (bb[3] − bb[1])/w (3)
h = (bb[2] − bb[0])/h (4)
Next, extract data such as object, name, size, width, height, and bndbox. Following this, split the files into 603, 150, and 100 files (80%, 10%, and 10%) for train, test, and val.
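Equations (1)-(4) translate directly into code. In the sketch below, bb is assumed to hold the VOC-style coordinates (ymin, xmin, ymax, xmax) implied by the formulas, and img_w and img_h denote the image width and height:

def voc_to_yolo(bb, img_w, img_h):
    """Convert a (ymin, xmin, ymax, xmax) box to normalized
    (x_center, y_center, width, height) as in Eqs. (1)-(4)."""
    x_mid = (bb[1] + bb[3]) / (2 * img_w)   # Eq. (1)
    y_mid = (bb[0] + bb[2]) / (2 * img_h)   # Eq. (2)
    box_w = (bb[3] - bb[1]) / img_w         # Eq. (3)
    box_h = (bb[2] - bb[0]) / img_h         # Eq. (4)
    return x_mid, y_mid, box_w, box_h

# e.g. a 100x60 box at the top-left of a 640x480 image
print(voc_to_yolo((0, 0, 60, 100), img_w=640, img_h=480))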
To use YOLO-CNN, a YAML-type configuration file with the train, test, and val paths, together with the number of classes and their names, must be produced. This YOLO-CNN will identify faces that have masks on, faces that don't have masks on, and faces that are wearing masks wrongly.
Fig. 72.7 Person with mask
In the context of object detection, the YOLO-CNN approach addresses a regression challenge related to bounding boxes that are spatially distinct along with their associated class probabilities. The YOLO-CNN system divides the input image into an S*S grid, as illustrated in Fig. 72.3. Each grid cell is responsible for detecting an object if its center lies within the cell's confines.
For every grid cell, the YOLO-CNN model predicts bounding boxes along with associated confidence scores. These confidence scores not only signify the model's confidence in the presence of an object within the box but also reflect the accuracy of the box prediction. In instances where no object exists in a cell, the confidence ratings are expected to be zero.
Fig. 72.8 Person with mask, wearing incorrectly

Fig. 72.12 People without mask

Fig. 72.9 Person with mask detected from sideways
Conversely, if there is an object, YOLO-CNN anticipates the confidence score to align with the intersection over union (IOU) of the ground truth and the predicted box [11]. The YOLO-CNN algorithm multiplies conditional class probabilities and individual box confidence projections during the test phase. For the You Only Look Once (YOLO) evaluation on Pascal VOC, which comprises 20 named classes, the YOLO-CNN method is used with parameters S = 7, B = 2, and C = 20. The final prediction made by the YOLO-CNN is represented as a 7 × 7 × 30 tensor.

3. Conclusion

In summary, the offered code demonstrates a reliable and organized method for setting up and using a YOLO-CNN model for image mask detection. With the help of the Ultralytics library, this project efficiently handles the setup of YOLO-CNN model training, data conversion into the format required by the YOLO-CNN technique, and dataset arrangement for training. The code also incorporates necessary procedures like real-world inference on fresh images, metrics computation, and model validation. All things considered, this work demonstrates how deep learning techniques can be used to automate mask identification tasks, with potential applications in public safety, healthcare, and other fields.

Fig. 72.10 Person with mask detected even if it is blocked with hand

Acknowledgement
The authors gratefully acknowledge the students, staff, and
authority of CSE department for their cooperation in the
research.
Fig. 72.11 Person with mask wearing incorrectly from sideways

References
1. A. Sharma, J. Pathak, M. Prakash, and J. N. Singh, "Object detection using OpenCV and Python," in 2021 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), pp. 501–505, 2021.
2. P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), vol. 1, pp. I–I, 2001.
3. S. Liao, A. K. Jain, and S. Z. Li, "A fast and accurate unconstrained face detector," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 211–223, 2016.
4. D. Luo, G. Wen, D. Li, Y. Hu, and E. Huan, "Deep-learning-based face detection using iterative bounding-box regression," Multimedia Tools and Applications, vol. 77, 2018.
5. Y. Zhang, X. Wang, and B. Qu, "Three-frame difference algorithm research based on mathematical morphology," Procedia Engineering, vol. 29, pp. 2705–2709, 2012.
6. J. Canny, "A computational approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 6, pp. 679–698, 1986.
7. J. Li and S. Ding, "A research on improved Canny edge detection algorithm," in Applied Informatics and Communication: International Conference, ICAIC 2011, Xi'an, China, August 20–21, 2011, Proceedings, Part V, pp. 102–108, Springer, 2011.
8. S. Mehtab, "Object detection and tracking using OpenCV in Python."
9. T. Kanade, "An iterative image registration technique with an application to stereo vision (IJCAI)."
10. H. Altun, R. Sinekli, U. Tekbas, F. Karakaya, and M. Peker, "An efficient color detection in RGB space using hierarchical neural network structure," in 2011 International Symposium on Innovations in Intelligent Systems and Applications, pp. 154–158, IEEE, 2011.
11. J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788, 2016.
Note: All the figures in this chapter were designed by the author.

Multi-Crop Analysis Using Multi-Regression via AI-based Federated Learning

73

Mouna Penmetsa1
Research Scholar, Dept of Computer Science and Engineering,
Sagi Ramakrishnam Raju Engineering College, Bhimavaram.
R.N.V. Jagan Mohan2
Associate Professor, Dept of Computer Science and Engineering,
Sagi Ramakrishnam Raju Engineering College, Bhimavaram.

Abstract: The art and science of tilling land and cultivating crops is known as agriculture. These days, smallholder farmers rely on a variety of crops, and to increase productivity these farmers must grow a variety of crops. Heavy rains damaged agricultural harvests in the last season, and crop-loss evaluations help identify intervention needs. In addition, smallholder farmers' livelihoods are impacted by a variety of factors, including disease, pests, climate change, natural catastrophes, and human activity. In this paper, the researchers advise smallholder farmers to harvest many crops per acre on a limited plot of land. The paper discusses the use of AI-based Federated Learning for multi-crop analysis, and the experiment applies multi-regression analysis to multiple crops to estimate crop loss in advance.
Keywords: Artificial intelligence, Federated learning, Multi-crops and multiple classes, Multiple regression analysis

1. Introduction

Indian agriculture, employing over 60% of the population and contributing 18% to the GDP, faces challenges such as small and fragmented land holdings, insufficient income, mechanization difficulties, land quality deterioration, and inheritance laws. Farmers also face challenges in marketing, transportation costs, market infrastructure, price fluctuation, and post-harvest losses, resulting in 16% annual waste by Abrougui, 2019[1]. India's agricultural operations rely heavily on labor, with 60-70% mechanization in ploughing, harvesting, threshing, and irrigation by Alejandro Morales, 2023[2]. However, small farmers struggle with mechanization due to lack of awareness and capital constraints. Improved credit access is crucial for productivity and quality, but regional imbalances persist by Bharath, 2020[3]. In 2021, 68.38 million hectares were irrigated, with only 18.8% under micro-irrigation. Soil fertility depletion is a major issue due to the Green Revolution's increased use of chemical fertilizers by Cao, 2021[4]. Farmers face issues with crop insurance schemes, climate change, and weather patterns by Hoffman, 2018[9]. Price volatility impacts farmers' livelihoods, leading to income instability and low productivity. Agricultural extension programs in India provide technology transfer and rural development assistance, but lack balance, causing farmers to be unaware of the latest practices and increasing vulnerability to pests and diseases by Dahikar, 2014[6].

In the Global South, multicropping systems are a prevalent agricultural practice, especially on small farms used for

1mounakishan@gmail.com, 2mohanrnv@gmail.com

DOI: 10.1201/9781003529231-73

sustenance. Land races, cultivars, rainfall, fertilizers, soil, agricultural leftovers, and domestic animals are all necessary for these systems to function. In biodiverse agro-ecosystems, their role in nutrient cycling, water management, and insect control is paramount by Ersin Elbasi, 2023[7]. Multicropping contributes to the environment, tackles social concerns in the community, and offers resilience in output and income. It helps reduce poverty and hunger and can assist with the planning of larger-scale farming systems that will provide food in the future. As evidenced by agriculture, multi-cropping, or growing several crops at once on one piece of land, increases land production and advantages. To increase land productivity and provide more benefits and revenue, mixed cropping involves growing several crops concurrently on the same plot of land, such as banana, coconut, coco, and Kandha, during the same growing season. Intercropping reduces rivalry and increases cooperation.

The paper addresses challenges faced by small and marginal farmers in agriculture, such as climate change, soil deterioration, biodiversity loss, water resource depletion, and labor and money shortages. It suggests using an Artificial Intelligence-based Federated Learning approach for agricultural harvesting on small and fragmented land holdings, focusing on multiple crop cultivation for better yield and profitability.

The portions of this paper can be understood as follows: Section 1 gives a broad introduction to agriculture. Agriculture with Federated Learning is covered in Section 2. Multi-Crops Multi-Class Federated Learning (MC2FL) is covered in Section 3. Section 4 deals with Multi-Crop Analysis Using Multi-Regression. Section 5 deals with experimental results. The conclusion is included in Section 6. References are included in Section 7.

2. Agriculture with Federated Learning (FL)

FL is an approach that emphasizes experience and collaboration while training models decentralized from a central data transfer. It has applications in agriculture but faces challenges in technical components like platforms, hardware, software, and data privacy. This work explores FL architectures, platforms, hardware, and software, providing a blueprint for designing FL-based solutions to improve crop data. Earlier works on these lines by various authors are as follows. Kulkarni V, Kulkarni M, and Pant A, An examination of personalization strategies in federated learning, 2020[11] proposed that FL systems be classified based on data distribution, model, privacy frameworks, and communication architectures, dividing them into private and public systems with potential for improvement. Geiping J, Bauermeister H, Dröge H, and Moeller M: Gradient inversion: how simple is it to compromise privacy in federated learning? ArXiv, vol. abs/2003.14053, 2020[7] delves into the use of FL for recovering image data, highlighting its privacy limitations and the impact of network architecture and parameter state. Liu Y, Yuan X, Xiong Z, Kang J, Wang X, and Niyato D: FL for 6G communications: obstacles, strategies,

Fig. 73.1 Federated learning architecture applied in an agriculture environment



Fig. 73.2 Multi-crops multi-class federated learning (MC2FL)

and next steps, arXiv preprint arXiv: 2006.02931, 2020[12] discusses the potential of machine learning in 6G communications, addressing challenges such as high costs, security, and privacy, while also presenting an architecture for its implementation by Manoj T, 2022[13].
3. Multi-Crops Multi-Class Federated Learning (MC2FL)

The framework facilitates label sharing across multiple crops, preserving privacy and enhancing personalized learning, with improved performance evaluated using multiple crop datasets with more and smaller features by Qiusi Zhang, 2023[17]. Agriculture with FL is a cooperative learning method that uses multiple devices or servers to train algorithms, providing reliable models without data exchange and with increased security by Salman Khan, 2022[18]. Wearable technology is a focus of the agriculture-based Federated Learning architecture by Gupta A, 2021[8]. This tackles issues like privacy concerns and a lack of personalization in smart agriculture. Transfer learning is used to develop well-tailored models, as it gathers farming information from individual crops grown in different regions by Prabhat Kumar, 2021[16]. The framework assumes several crop farming locations, one server, and a process. Deep learning is used by the Agriculture with FL architecture to train classifiers and learn features. New farmer crop data is regularly fed into the models to keep them updated by Kuradusenge, 2023[10]. The framework can incorporate other methods for personalization and can be generalized. Tested on a human activity recognition dataset, the Agriculture with Federated Learning framework outperformed traditional machine learning algorithms, making it applicable to the agriculture environment. The framework can incorporate other conventional deep learning methods by Madhuri Shripathi Rao, 2022[14].
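The server-side aggregation implied by this one-server, many-farms setup can be illustrated with a federated-averaging (FedAvg) sketch; this is a generic illustration rather than the paper's exact algorithm:

import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model parameters, proportional to
    each farm's local dataset size."""
    total = sum(client_sizes)
    layers = len(client_weights[0])
    return [
        sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in range(layers)
    ]

# Three farms, each holding a tiny 2-parameter local model
farms = [[np.array([1.0, 2.0])], [np.array([3.0, 4.0])], [np.array([5.0, 6.0])]]
sizes = [100, 200, 700]
print(fedavg(farms, sizes))  # global model after one aggregation round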
4. Multi-Crop Analysis Using Multi-Regression

By fitting a linear equation to observed data, multiple linear regression models reveal the connection between features and responses, comparable to basic regression. It evaluates factors affecting the predicted output and their relationships by Orogun Okunola Adebola, 2021[15].

Y = b0 + b1*x1 + b2*x2 + b3*x3 + … + bn*xn (1)

x1, x2, x3, …, xn are the independent variables, and Y is the dependent variable, where b0, b1, b2 are the estimates of the regression coefficients. For two predictors, the three normal equations are:

Σ yi = n·b0 + b1·Σ x1i + b2·Σ x2i (2)
Σ x1i·yi = b0·Σ x1i + b1·Σ x1i² + b2·Σ x1i·x2i (3)
Σ x2i·yi = b0·Σ x2i + b1·Σ x1i·x2i + b2·Σ x2i² (4)

with all sums running over i = 1, …, n.
methods for personalization and can be generalized. Tested

5. Experimental Result

Farmers are advised to produce a range of crops, including rice, bananas, coconuts, turmeric, and others, as this can increase productivity and yield. Heavy rains caused damage (y) to agricultural harvesting in 2023, with crop weight (x1), investment amount (x2), and crop received amount (x3) as independent variables. By comparing agricultural yields from damaged and healthy crops, crop loss evaluations, a type of conventional analysis, assist farmers in determining the need for intervention. Several factors, including disease, pests, climate change, natural catastrophes, and human activities, affect the economy and the livelihoods of farmers when crops are lost. With multi-regression analysis, the goal is to estimate the crop loss in Table 73.1: Multi-crops of agriculture by Sheathe, 2020[18].

Table 73.1 Multi-crops of agriculture

S. No. | Crop Name | Damage (y) | Crop Weight (100 kgs)/Quintal (x1) | Invest Amount (x2) | Received Amount (x3)
1 | Rice | 15000 | 25 | 10000 | 15000
2 | Banana | 12000 | 50 | 30000 | 18000
3 | Turmeric | 15000 | 50 | 10000 | 14000
4 | Kandha | 13000 | 60 | 33000 | 22000
5 | Coconut | 14000 | 10 | 11000 | 24000
6 | Coco | 12000 | 30 | 10500 | 15000
Total | | 81,000 | 220 | 1,04,500 | 1,08,000

In this case, n = 6. Substituting the harvesting data from Table 73.1 into the standard equations:

Σy = n·b0 + b1·Σx1 + b2·Σx2 + b3·Σx3 (5)
Σx1·y = b0·Σx1 + b1·Σx1² + b2·Σx1·x2 + b3·Σx1·x3 (6)
Σx2·y = b0·Σx2 + b1·Σx1·x2 + b2·Σx2² + b3·Σx2·x3 (7)
Σx3·y = b0·Σx3 + b1·Σx1·x3 + b2·Σx2·x3 + b3·Σx3² (8)

So that:
81,000 = 6·b0 + 220·b1 + 104500·b2 + 108000·b3
3005000 = 220·b0 + 10,225·b1 + 4655000·b2 + 3985000·b3
1369000000 = 104500·b0 + 4655000·b1 + 2420250000·b2 + 1977500000·b3
1425000000 = 108000·b0 + 3985000·b1 + 1977500000·b2 + 2030000000·b3

When we solve, we get:
b0 = 18211.882
b1 = 60.3507262
b2 = –0.10716
b3 = –0.281017

Consequently, the required regression plane is
y = 18211.882 + 60.3507262·x1 + (–0.10716)·x2

Estimate: For a rice crop with per-acre damage weight x1 = 25 and x2 = 10000, the damage incurred in rupees is
y(x1 = 25, x2 = 10000) = 18211.882 + 60.3507262 × 25 + (–0.10716) × 10000 = 19718.5 – 1071.6
y = 18646.9

The estimated loss for this particular crop is y = 18646.9.
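The same fit can be checked with NumPy's least-squares solver on the Table 73.1 data; because the hand computation above rounds intermediate sums, the coefficients it returns may differ slightly from the quoted values:

import numpy as np

# Columns from Table 73.1: x1 = crop weight, x2 = invest amount,
# x3 = received amount, y = damage
x1 = np.array([25, 50, 50, 60, 10, 30])
x2 = np.array([10000, 30000, 10000, 33000, 11000, 10500])
x3 = np.array([15000, 18000, 14000, 22000, 24000, 15000])
y = np.array([15000, 12000, 15000, 13000, 14000, 12000])

X = np.column_stack([np.ones_like(x1), x1, x2, x3])  # intercept + predictors
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # b0, b1, b2, b3

# Predicted damage for the rice-like inputs used in the worked estimate;
# note the text's estimate omits the x3 term
print(b[0] + b[1] * 25 + b[2] * 10000)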

6. Conclusion

Farmers should produce diverse crops to boost productivity. Heavy rains in 2023 caused damage to agricultural harvesting. Crop loss evaluations help identify intervention needs. Factors like disease, pests, climate change, natural disasters, and human activities impact farmers' livelihoods. Multi-regression analysis estimates crop loss. The paper explores the utilization of AI-based Federated Learning for multi-crop recommendation through multi-regression analysis. MLR, a statistical method for analyzing data using a linear equation, is similar to a straightforward regression analysis.

Acknowledgements: We express our sincere gratitude for N. SivaKishan's and Dr. R.N.V. Jagan Mohan's assistance.

Conflict of Interest: Since there were no financial or commercial ties, there were no possible conflicts of interest when doing the research.

References
1. Abrougui, K., Gabsi, K., Mercatoris, B., Khemis, C., Amami, R., Chahaibi, S.: Prediction of organic potato yield using tillage systems and soil properties by artificial neural network (ANN) and multiple linear regressions (MLR), Soil Tillage Res. 190, 202–208, doi: 10.1016/j.still.2019.01.011, 2019.
2. Alejandro Morales and Francisco J. Villalobos: Using machine learning for crop yield prediction in the past or the future, Frontiers in Plant Science, Volume 14, https://doi.org/10.3389/fpls.2023.1128388, 2023.
3. Bharath S, Yeshwanth S, Yashas B L and Vidyaranya R Javalagi: Comparative Analysis of Machine Learning Algorithms in the Study of Crop and Crop Yield Prediction, International Journal of Engineering Research & Technology (IJERT), Vol-8, Issue-14, 2020.
4. Cao, J., Zhang, Z., Luo, Y., Zhang, L., Zhang, J., Li, Z., et al.: Wheat yield predictions at a county and field scale with deep learning, machine learning, and Google Earth Engine, Eur. J. Agron. 123, 126204, doi: 10.1016/j.eja.2020.126204, 2021.
5. Dahikar S and Rode S V: Agricultural crop yield prediction using artificial neural network approach, International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering, Vol-2, Issue-1, pp. 683–686, 2014.
6. Ersin Elbasi, Chamseddine Zaki, Ahmet E. Topcu, Wiem Abdelbaki, Aymen I. Zreikat, Elda Cina, Ahmed Shdefat, Louai Saker: Crop Prediction Model Using Machine Learning Algorithms, Applied Sciences, MDPI, Volume 13, Issue 16, doi: 10.3390/app13169288, 2023.
7. Geiping J, Bauermeister H, Dröge H, and Moeller M: Inverting gradients - how easy is it to break privacy in federated learning, ArXiv, vol. abs/2003.14053, 2020.
8. Gupta A, Nagda D, Nikhare P, Sandbhor A: Smart crop prediction using IoT and machine learning, International Journal of Engineering Research & Technology (IJERT), Vol-9, Issue-3, 2021.
9. Hoffman, A. L., Kemanian, A. R., and Forest, C. E.: Analysis of climate signals in the crop yield record of sub-Saharan Africa, Global Change Biol., 24(1), 143–157, doi: 10.1111/gcb.13901, 2018.
10. Kuradusenge, M., Hitimana, E., Hanyurwimfura, D., Rukundo, P., Mtonga, K., Mukasine, A., Uwitonze, C., Ngabonziza, J., Uwamahoro, A.: Crop Yield Prediction Using Machine Learning Models: Case of Irish Potato and Maize, Agriculture 2023, 13, 225, 2023.
11. Liu Y, Yuan X, Xiong Z, Kang J, Wang X, and Niyato D: Federated learning for 6G communications: Challenges, methods, and future directions, arXiv preprint arXiv: 2006.02931, 2020.
12. Manoj T et al.: A FL-based crop yield prediction for agriculture production risk management, IEEE Xplore, 2022.
13. Madhuri Shripathi Rao et al.: Crop prediction using machine learning, Journal of Physics: Conference Series, 2161 012033, 2022.
14. Orogun Okunola Adebola: A Multiple Linear Regression Model for Analyzing and Predicting Educational Development in Nigeria, NIPES Journal of Science and Technology Research, 3(1), pp. 99–108, pISSN: 2682-5821, 2021.
15. Prabhat Kumar et al.: PEFL: Deep Privacy-Encoding-Based FL Framework for Smart Agriculture, September 16, 2021.
16. Qiusi Zhang et al.: Maize yield prediction using federated random forest, 14 May 2023.
17. Salman Khan et al.: Federated learning-based UAVs for the diagnosis of plant diseases, Proceedings of the 8th ICEET, 27–28 October 2022.
18. Sheathe, N.: Detecting Multicollinearity in Regression Analysis, Am. J. Appl. Math. Stat. 8, 39–42, 2020.

Note: All the figures and table in this chapter were designed by the author.
Empowering Inclusive Communication: Advancements in Wearable Technology with GloSign—A Glove-Based Solution for Seamless Sign Language Interaction

74

L V Srinivas1,
Department of Computer Science and Engineering, SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, INDIA.
R. Shiva Shankar2,
Department of Computer Science and Engineering, SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, INDIA.
N. Deshai3,
Department of Information Technology, SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, INDIA.
K. Sravani4,
Department of Computer Science and Engineering, SRKR Engineering College (A), Bhimavaram, Andhra Pradesh, INDIA.
V. Maheswararao5
Department of Computer Science and Engineering, Shri Vishnu Engineering College for Women (A), Bhimavaram, Andhra Pradesh, INDIA.

Abstract: The profound psychological and social impacts of losing the ability to speak or to hear emphasize the critical demand
for innovative communication alternatives. Sign Language plays a pivotal role as a medium for individuals confronting these
challenges, fostering meaningful interaction. This paper presents GloSign, a groundbreaking glove engineered to transcend
communication barriers by translating American Sign Language into characters. Integrating cutting-edge flex and inertial
measurement unit (IMU) sensors, GloSign precisely identifies gestures, and its seamless integration with an IoT platform
ensures portability and wireless functionality. Incorporating the k-nearest neighbors (KNN) machine learning algorithm propels the
system to an impressive accuracy rate of 97.7%. Beyond mere gesture translation, GloSign facilitates the formation of complete
sentences, offering output options ranging from on-screen display to speech conversion. This versatile glove emerges as a
potent communication tool, fostering seamless interaction among sign language users and extending its impact to individuals
unfamiliar with sign language. In doing so, it actively promotes inclusivity in communication, making strides toward a more
interconnected and accessible future.
Keywords: Glove, Pattern recognition, Gesture recognition, Sign language, Sensor

1. Introduction

Communication breakdowns resulting from losing interpersonal abilities can have profound and detrimental effects. Sign language (SL) has emerged as a solution to this challenge, utilizing hand gestures and facial expressions for effective and interactive communication [1]. Many people are not familiar with SL, which can make it challenging for
1srinivas.srkrcse@gmail.com, 2shiva.csesrkr@gmail.com, 3desaij4@gmail.com, 4sravani.kalidindi@gmail.com, 5mahesh_vvr@yahoo.com

DOI: 10.1201/9781003529231-74

individuals with impairments to communicate effectively. Additionally, SL faces limitations in digital communication, and the absence of a universal standard means that each country has its unique system with gestures that may differ from those of other nations. In response to these communication challenges, this study introduces GloSign, a novel glove designed to transcend barriers by translating SL into English [2]. Like spoken languages, American Sign Language (ASL) consists of formal and informal components. This paper focuses explicitly on the formal aspect of ASL, which involves a set of 26 alphabets used to construct words and sentences. These alphabets are defined by four elements: hand shape, position relative to the body, hand movements, and palm alignment, with specific gestures requiring dynamic hand movements. GloSign incorporates flex sensors, accelerometers, and gyroscopes to recognize gestures, offering a wireless, IoT-enabled solution for uploading and analyzing data [3]. The system interprets gestures captured by the glove, transforming them into words and sentences. These linguistic outputs are then displayed on a screen running gesture recognition software, accompanied by conversion of sentences into speech. By facilitating the translation of SL into a widely understood language, GloSign strives to significantly enhance communication accessibility for individuals facing these challenges [4-5].

Fig. 74.1 Sign language in America

This paper adopts a structure comprising four distinct sections, each providing a focused exploration of the topic. Following this introduction, the methodology section outlines the approach employed in this study, prioritizing transparency and offering valuable insight into the research process [6, 7, 8].

The subsequent section, results, goes beyond mere presentation, offering a detailed and nuanced account of the findings derived from the conducted research; it ensures a comprehensive understanding of the outcomes and their potential implications [9].

In a culmination of insights, the fourth section serves as the conclusion, summarizing the key findings and contributions of the paper. It not only provides closure but also delineates potential avenues for future research, thereby propelling the exploration of the topic into new territories. This structured approach enhances the paper's clarity and contributes to its distinctiveness in the academic landscape [10].

2. Related Work

The literature review is organized based on the categorization of sensors, communication methods, employed algorithms, and the presentation of gestures. The works in [11, 12] showcased gloves equipped with flex sensors that successfully predicted all alphabets with commendable accuracy, serving their purpose in assisting communication for individuals with disabilities [13, 14]. Authors in [15] and [16] developed wireless gesture decoders using flex sensors and accelerometers, achieving a remarkable 95% accuracy in recognizing all alphabets and 15 words. In [17], incorporating pressure sensors elevated accuracy to 98.2%. A different device with flex sensors could detect movements by measuring voltage levels and could then display the detected movements on a phone or laptop via Bluetooth with an accuracy rate of 83% [18]. Building on this technology, [19, 20] used flex sensors and accelerometers to recognize movements and convert them into sound and text in just 0.74 seconds.

The media report regular occurrences, and social networks like Twitter now provide user-generated news material. Such data must be clustered so that only relevant items are delivered, making the resource usable; the authors filtered data using density-based k-means and graph clustering [21]. Based on a thorough categorization process, the IDCS has four steps. IDCS begins by methodically collecting data from diverse sources on rural women's nutritional knowledge, attitudes, and dietary patterns. After that, it preprocesses the data according to conventional methods and organizes it for categorization. Next, a learning algorithm develops an intelligent classifier

to divide rural women into appropriate nutritional categories [22].

Sentiment classifiers have been used to classify comments and visualize student perspectives, offering instructors fast qualitative feedback to enhance student learning and enabling quicker feedback than paper-based systems. Because the previous approach was inefficient for student input, an online system was established in which students provide comments using a standard form. The suggested approach prioritizes security by allowing only legitimate users to see and understand the cumulative input and opinions of a batch of pupils [23]. Evaluation is the primary way schools assess students' learning capabilities, and one of the main tests is essay writing. This assessment is normally done manually, which takes time and effort, so the referenced project automates it. Keywords were used as features, since machines cannot directly grasp human rating metrics: first, the student essay is matched against admin keywords [24]. Linguistic diversity and variations have hampered research into Indian Sign Language (ISL). SL is required for people to communicate with one another, and most learning takes place via interaction with peers. Sign-learning materials are seldom available, which makes learning to sign a difficult task. Sign learning begins with finger spelling, which is used when there is no applicable sign or the sign is unknown [25, 26]. Smoke from fires causes vehicle accidents owing to poor visibility, and people unaware of forest fires and smoke have died while walking forest routes. A Raspberry Pi-interfaced sensor detects forest fires in real time and alerts the system; this system uses the publish-subscribe protocol to identify forest fires and warn users [27]. The localization of hand joint positions under kinematic limits is accomplished by implementing a hierarchical mode-seeking technique. Finally, an ASL signal may be identified using a Random Forest (RF) classifier based on combined joint angles; while testing this method, the authors used a dataset from Surrey University [28]. The IoT and cloud computing have seen tremendous growth, contributing to the optimization of supply times and the retention of spare parts in warehouses, so that out-of-stock situations are optimized. A passive UHF-RFID tag/sticker system to arrange extra components is the primary goal of this effort [29-30].

Tanyawiwat and Thiemjarus [31] developed GesTALK, a portable gesture recognition glove that converts static motions to speech and works 90% accurately in ASL and PSL. A glove with contact and flex sensors translated eight SLs to text with 93.16% accuracy [32]. El-Din and El-Ghany [33] achieved 88% accuracy with dynamic motions from ASL and ArSL using a Python GUI application and a glove with flex and inertial sensors. Tanyawiwat and Thiemjarus fitted the glove with more touch sensors [31]. In another investigation, a KNN algorithm and touch sensors yielded 91.54% efficiency [34]. Ahmed and Arif et al. [35, 36] created a glove with touch, flex, and inertial sensors that achieved 92% accuracy using a gesture recognition algorithm. Using surface electromyography (EMG) sensors and accelerometers, Wu et al. [37] achieved 96% accuracy for 80 movements. Python-coded capacitive touch sensors by Abhishek et al. [38] responded in 0.7 seconds with 92% accuracy. Mehdi and Khan [39] employed a tilt sensor to measure glove rotation in a 5DT 7-sensor glove; a three-layer neural network algorithm gave this system 88% gesture recognition accuracy. In [40], [41], Immersion's 18-sensor CyberGlove with resistive bend, abduction, and flexion sensors achieved 90% accuracy but was not real-time. This chapter presents a glove designed to translate gestures into alphabets, offering a range of features:
• Wireless and portable functionality [42]
• Real-time responsiveness
• Capability to form words and sentences [43]
• Accessibility from anywhere through an IoT platform
Modern technology allows complex answers to common problems. This literature review examined the features and drawbacks of existing gloves, and the glove highlighted here fills holes left by its predecessors. As seen in the results section, it gives real-time results, essential in urgent circumstances, and its simplicity and accessibility make it appealing to healthcare practitioners. By resolving the flaws of previous gloves, this approach may significantly improve patient care [44].
Empowering Inclusive Communication: Advancements in Wearable Technology with GloSign 495

interpretation is achieved through machine learning (ML) Navigating the intricate nuances of ASL, where similar
algorithms extensively trained on significant sign language gestures often exhibit subtle differences, presents a
(SL) datasets. These advanced algorithms can quickly identify formidable challenge. In a groundbreaking move to overcome
subtle patterns in hand gestures and translate them into this hurdle, incorporating a contact sensor emerges as a
English words or phrases. As a result of this comprehensive pioneering solution. This sensor is strategically introduced
and innovative procedure, GloSign has established itself as to meticulously discern the nuanced variations inherent
a trailblazer in the realm of deaf-hearing communication, in these closely related gestures. The connection setup for
effectively bridging the gap between individuals who are deaf the contact sensor mirrors the innovative configuration
and those who are hearing. This action not only contributes showcased in Fig. 74.2, embodying a synergy of technology
to the development of a global communication environment and precision. This strategic positioning not only elevates the
that is more accessible and inclusive, but it also serves as a sensor’s efficacy but also contributes to the overall finesse
demonstration of the potential that innovation has in terms of of the system, ensuring that even the most intricate gestures
promoting meaningful relationships. are accurately captured and interpreted. This integration of
contact sensors transcends the conventional, pushing the
3.1 Selection and Placement of the Sensors boundaries of SL interpretation technology. It doesn’t merely
Flex sensors, contact sensors, and an inertial measurement detect gestures; it refines the art of distinction, embodying
unit (IMU) sensor with an accelerometer and gyroscope are a testament to the innovative spirit underlying GloSign.
carefully selected and positioned innovatively. The GloSign By embracing advanced sensor technologies and strategic
glove’s sensor array relies on flex sensors selected for finger- placement strategies, this glove doesn’t just interpret ASL;
angle measurement capabilities. Resistance levels fluctuate it revolutionizes the precision and clarity with which it’s
dynamically with finger angles in these sensors. Flex sensors done, setting new standards for inclusive and effective
are placed above the glove to record the wearer’s gentle finger communication [46]. Unlocking the realm of dynamic
motions for maximum accuracy. gestures leaps into the future by integrating the Inertial
Connecting this intricate sensor network is where the true Measurement Unit (IMU) sensor, seamlessly woven into
innovation unfolds. The flex sensors connect seamlessly the Arduino and strategically perched atop the hand. The
with the Arduino NANO IoT, creating a sophisticated innovative placement of this technological marvel is elegantly
communication pathway. The configuration, visually captured in the visual representation presented in Fig. 74.3,
represented in Fig. 74.2, is designed to harness the full showcasing the glove as a convergence point of cutting-edge
potential of the flex sensors. In this setup, the initial pin of sensor technologies.
the flex sensor (colored red) connects to the 3.3v port of the
Arduino, establishing a vital power link. Meanwhile, the
opposing pin (colored blue) is strategically linked to a resistor,
a crucial element in fine-tuning the sensor’s response. The
connection preceding the resistor is then skillfully attached
to the Arduino’s analog input, ensuring a streamlined data
flow. Finally, the remaining connection (colored black) is
grounded, completing the intricate circuitry that forms the
backbone of GloSign’s sensor system.

Fig. 74.3 Sensor placement and flow of data

The orchestration of this technological symphony doesn’t


Fig. 74.2 Flex sensor connection
halt at the glove; it extends into the digital domain through a
This methodology doesn’t just position sensors on a glove; it visionary connection to the Internet of Things (IoT) platform.
orchestrates a symphony of technological precision, fusing Operating on the expansive canvas of Wi-Fi, the Arduino­
advanced sensor technologies with meticulous placement equipped glove establishes a seamless link to the International
strategies [45]. The result is a seamlessly integrated network Business Machines (IBM) Watson IoT platform, a choice
of sensors, each playing a distinct role in capturing and marked by its prowess in handling complex data integrations.
translating the intricate language of hand gestures into a The raw data, a rich amalgamation of accelerometer,
revolutionary communication tool. gyroscope, and flex sensor data, embarks on a transformative
496 Algorithms in Advanced Artificial Intelligence

journey from the Arduino to the IoT platform. Within the 4. Results and Discussion
intricate web of connectivity, the IoT platform becomes a
dynamic canvas, visualizing changes in gestures through a This section unravels the innovative outcomes from the
scatter plot feature. This real-time monitoring capability intricate dance of the GloSign glove’s operations. The
amplifies the GloSign glove’s responsiveness, setting it apart system’s three acts showcase its revolutionary capabilities.
as a groundbreaking tool in SL interpretation. In the first phase, the GloSign glove transmits a plethora of
The saga doesn’t end here; the values extracted from this data to the vast Internet of Things (IoT) infrastructure. The
digital canvas are ingeniously transported to the PC through next step involves analyzing and decoding the raw data. The
the IBM Watson IoT Software Development Kit (SDK). Figure glove software carefully interprets complex motions using
74.3 vividly captures this data flow, illustrating the seamless ML and K-Nearest Neighbors (KNN) methods. It’s a ballet
transfer from the glove to the PC, unlocking possibilities for of computation, where patterns are unraveled, and gestures
further processing and analysis. This methodology doesn’t are translated into a language comprehensible to the digital
merely capture dynamic gestures; it orchestrates a symphony realm—the crescendo of innovation peaks in the final phase,
of data, seamlessly blending sensor technologies, IoT where the processed data takes center stage. What emerges is
integration, and real-time monitoring. The GloSign glove, not just a stream of letters; it’s a refined output, a testament
equipped with this ensemble of innovations, emerges as a tool to the system’s ability to construct meaningful sentences.
for gesture identification and a visionary leap into the future The narrative unfolds on a screen, where coherent sentences,
of inclusive and intelligent communication technologies. meticulously crafted from the language of gestures, come to
life. This isn’t just data; it’s a story, a dialogue, an eloquent
3.2 Data from Glove expression born from the fusion of technology and human
communication. In essence, each phase in the GloSign
The GloSign glove collects data, which is then sent to IBM
glove’s operation isn’t just a step; it’s a leap into the future of
Watson IoT for conversion. This phase is crucial as it enables
communication technology. It’s an orchestration of data, an
new methods of interpreting gestures. This sensor data is
analysis of meaning, and a presentation of coherent output.
accurately mapped and calibrated onto characters during the
This isn’t mere experimentation; it’s a journey into the
offline system training. The system’s accuracy is achieved
uncharted territories of innovative communication solutions,
through an alphabet categorization process using the K-Nearest
where the language of gestures becomes a bridge connecting
Neighbors (KNN) method. An iterative method is used to
diverse worlds.
analyze K values from 1 to 25, and K values are optimized
for precision and system speed based on iteration accuracy. 4.1 IoT Platform and Sensors
This sophisticated algorithm can effortlessly process IBM
Watson IoT data and accurately predict the nearest letters. In this avant-garde experiment, the technological synergy
The system incorporates a visionary gesture fix algorithm unfolds by utilizing the IBM Watson IoT platform, creating
during sentence formation. This algorithm not only rectifies an ingenious connection between the GloSign glove, adorned
potential duplicate or incorrect letter predictions but does so with the Arduino NANO IoT, and the digital realm through
comprehensively, addressing entire sentences. The process is seamless Wi-Fi communication. The orchestration of data
swift, typically taking 2 to 8 seconds, showcasing the system’s becomes an art, dissected into two distinct realms: the
efficiency even with varying sentence lengths. The gesture nuanced language of the flex sensors and the binary poetry
fix algorithm operates on the entire sentence concurrently, of the contact sensor and Inertial Measurement Unit (IMU).
sidestepping delays that might arise when addressing words Like binary sentinels, the IMU and contact sensors manifest
individually. This forward-thinking approach ensures a their outputs in Boolean. With a profound simplicity, contact
seamless correction process, eliminating any risk of letter sensors echo a 1 when touched and gracefully bow to a 0
loss during the preceding word’s processing. A complex without contact. The IMU sensor, a judge of motion, breathes
procedure produces clear messages on a screen designed life into the glove’s dynamism, affirming its existence with a
for gesture recognition. The story continues with the IBM regal one during motion and gracefully transitioning to a 0 in
Watson Text-to-Speech software development kit converting moments of stillness. The values oscillate between the binary
these messages into spoken words. This versatile tool is ballet of 0s and 1s, where contact and movement values paint
easily accessible, works on any computer, requires a stable a canvas of interaction. A contact value of 1 whispers the
internet connection, and is compatible with Python software. tale of touch, while a movement value of 1 choreographs the
As the system performs its dynamic processes, variations in dance of dynamic gestures. Each with a balletic range of 0
performance may occur based on the specifications of the to 90°, grace the stage with the elegance of their numerical
system in use, adding a layer of adaptability to GloSign’s expressions. Like fingers in a symphony, F1, F2, F3, F4,
innovative communication landscape. and F5 waltz through the spectrum; they shape the nuanced
Empowering Inclusive Communication: Advancements in Wearable Technology with GloSign 497

Table 74.1 Average sensor values for each gesture Yet, in this digital ballet, a revelation surfaces. The sensor
Alphabet F1 F2 F3 F4 F5 C M
values, a testament to the intricacies of signed communication,
reveal a mesmerizing similarity among gestures. “i” and “j,”
A 44.04 26.97 40.85 53.29 0.28 1 0
akin to kindred spirits, share sensor values so profound that
B 0.56 -4.48 -3.47 -0.04 31.32 1 0 they dance on the edge of indistinguishability, differentiated
C 20.26 31.82 53.51 34.5 2.93 1 0 only by the subtle heartbeat of the movement sensor value.
D 32.01 42.04 51.73 0.13 -1.50 0 0 Embarking on a visual odyssey, Fig. 74.4 unfolds as a portal
E 66.83 71.89 73.64 58.29 35.41 1 0
into the dynamic realm of GloSign’s data visualization.
F 5.98 0.80 1.22 69.99 5.40 0 0 This avant-garde depiction transcends mere representation,
offering an immersive experience that illuminates the
G 76.21 66.59 83.98 -0.50 4.54 0 0
intricacies of various gestures. Within this digital canvas,
G 59.45 56.08 -4.44 -0.03 12.33 0 0
the sections marked with the resonant symbols “C” and “M”
I 3.45 79.17 65.58 79.62 17.40 1 0 emerge as visual symphonies, each stroke and hue capturing
J 2.78 71.62 60.66 74.46 15.61 1 1 the essence of communication. The visualization pulsates with
K 70.55 62.95 -6.81 -0.66 -0.64 0 0 life in the turquoise embrace of “C” (contact sensor). A surge
L 83.53 73.88 77.67 -1.13 -0.82 0 0
in this azure tide signifies the poetic convergence of contact
sensors, painting a vivid tableau of tactile connection. The
M 77.58 57.87 53.03 54.70 10.29 1 0
subtle dance of turquoise unveils the nuanced choreography
N 90.52 71.54 58.45 61.96 -2.11 1 0 of touch, echoing the very heartbeat of human connection. In
O 44.81 43.63 51.74 42.93 7.46 1 0 tandem, the ethereal glow of “M” (dynamic gesture) bathes
P 57.93 49.62 2.35 -0.80 -3.45 0 0 the canvas in light turquoise. A celestial dance ensues as this
Q 74.20 69.96 68.03 -0.21 -2.14 0 0
luminescence responds to the slightest quiver of movement
within the glove. It is not just data; it’s a visual symphony,
R 60.91 47.69 -5.54 -1.10 6.88 0 0
an elegant ballet of interaction painted on the canvas of
S 95.83 78.83 71.81 75.64 35.59 1 0 digital expression. Figure 74.5, a tapestry of readings from
T 76.92 34.22 40.41 52.19 -68 0 0 the flex sensors, invites observers into the tranquil realm
U 82.84 31.52 -6.82 -1.11 13.97 1 0 of hand gestures at rest. Five protagonists, F1 through F5,
V 77.50 38.18 -7.07 -0.76 14.15 0 0 stand poised to reveal the silent eloquence of the fourth,
third, second, and first fingers and the Thumb. The chart
W 53.30 -4.28 -7.13 -0.47 12.56 0 0
unfolds as a serenade of numerical expression, where each
X 76.99 75.49 70.31 58.34 41.50 0 0 curve and contour represents the nuanced articulation of hand
Y 1.06 74.33 68.31 80.23 -1.50 1 0 postures in repose. It is not just a chart; it’s a testament to the
Z 64.54 62.94 73.39 -0.73 7.17 0 1 intricate language encoded in the resting hands, awaiting the
moment they shall dance again in the symphony of signed
gestures of the fourth, third, second, first fingers, and the communication. In essence, these figures are visualizations
Thumb. and portals into gesture poetry. They transform raw data into

Fig. 74.4 Accuracy of the gestures


498 Algorithms in Advanced Artificial Intelligence

Fig. 74.5 Mean error of K values

a visual tapestry, where hues and contours resonate with symphony of ML and signed communication. It’s a pioneering
the very essence of human expression. This is not just data; exploration into the nuances of gesture identification, where
it’s an innovative exploration into the visual language of the system’s accuracy becomes a poetic expression, echoing
communication, where every stroke is a note, and every hue the challenges inherent in the language of silent conversation.
is a melody in the symphony of GloSign. Venturing into innovation, our quest for precision led us to
a groundbreaking exploration of the K-nearest neighbors
4.2 Analyzing and Decoding Data
(KNN) algorithm. To fine-tune its accuracy, we embarked on
The IBM Watson IoT platform Python API is a crucial tool a symphony of experiments, dancing through a spectrum of K
in data analysis. It helps to transform IoT platform data values to discern the optimal configuration for our dynamic
into a more usable format. A KNN-supervised ML model framework. Figure 74.7 is a visual testament to this journey, a
orchestrates this transformation. This model is based on a canvas where the average mean errors gracefully unfold, each
gesture classification master that uses 200 different gestures stroke representing a different K value. This isn’t just a chart;
to convey information. These gestures are like a dance it’s a visual aria, a poetic representation of our meticulous
of the hands, based on ASL. The system uses a model to quest for perfection.
categorize gestures by taking precise readings and training
The graph unveils a narrative where the K values 1 and
the KNN model. The motions are divided into two groups
3 emerge as protagonists, each carrying a melody of
for testing the model. The first batch, which contains 75%
precision. In this innovative exploration, the average mean
of the data, introduces the model’s neural connections. The
errors become the notes in our symphony. They echo our
system’s accuracy is then tested using the second batch,
algorithmic ballet’s subtle nuances and intricacies, with
which includes 25% of the data. Figure 74.4 displays a bar
each K value contributing a unique cadence. The revelation,
chart that shows the system’s accuracy in recognizing letters
painted on this digital canvas, is profound—K values 1 and 3
when K was 1 (1-NN). The chart, with its peaks of 100%
are beacons of optimal performance, adorned with an average
accuracy, unveils the system’s prowess in decoding the
error rate below the ethereal threshold of 0.5%. This isn’t
language of gestures. Yet, within this symphony, a poetic
just an experiment; it’s a poetic exploration of the optimal
discord surfaces. Dynamic gestures and an ever-elusive
configuration of our algorithmic orchestra. It’s a testament to
cadence reveal an accuracy slightly shy of perfection, dipping
our commitment to precision, where every K value is a note,
below 95%. The dance between “i” and “j,” intertwined in
and every average mean error is a resonance in the symphony
similarity, unveils a challenge of differentiating one from
of innovation. The journey continues, but in the harmonious
the other, with movement being the subtle heartbeat that
echoes of K values 1 and 3, we find a melody of excellence
sets them apart. The letters “h” and “r,” standing shoulder to
that propels our framework into precision and proficiency.
shoulder in sensor readings, add another layer to the nuanced
ballet. However, in this intricate dance, “j” emerges as the Embarking on a visionary exploration, Fig. 74.5 unfurls as
prima donna of complexity, its identification proving to be a digital tapestry, capturing the essence of our relentless
the most challenging feat. This is not just a chart; it’s a sonnet pursuit of precision in recognizing gestures. This isn’t just
of accuracy and challenges, a testament to the evolving a chart; it’s a symphony of accuracy, a visual composition
Empowering Inclusive Communication: Advancements in Wearable Technology with GloSign 499

that orchestrates the dance of our algorithmic virtuoso across


the spectrum of K values. In this visual opus, the average
accuracy takes center stage, each note resonating with the
precision of our system. The graph, akin to a musical score,
reveals a crescendo of accuracy that reaches a sublime zenith
when K is set to the virtuoso values of 1 or 3. It’s not just data
points; it’s a melody of recognition, a harmonious interplay
between algorithmic finesse and the language of gestures. As
we delve deeper into the nuances of this visual composition,
a revelation emerges—our system achieves a staggering
accuracy that eclipses the celestial threshold of 99.5% when
guided by the symphonic K values of 1 or 3. This isn’t just
accuracy; it’s a triumph, an ode to the meticulous calibration
of our algorithm. Yet, in this pursuit of perfection, a poignant
realization surfaces.
In its majestic complexity, the computational orchestra
demands resources and time that swell with the ascending K
values. The decision-making, akin to a conductor’s precision,
leads us to an optimal choice—to embrace the lowest K value
that preserves an acceptable equilibrium of accuracy and
mean error rate. In this ballet of decisions, K=1 emerges as
the prima donna, the ideal choice that encapsulates numerical
efficiency and an artistic harmony between computational Fig. 74.7 GloSign glove
resources and accuracy. This isn’t just a decision; it’s a
significantly improved, particularly for letters with similar
harmonious resolution, a choice that echoes the innovative
movements. Further enhancements could involve integrating
ethos of precision and efficiency in our dynamic framework.
touch sensors to identify complex motions. In addition, the
The glove depicted in Fig. 74.6 underwent testing with the gesture correction algorithm could be optimized for faster
pangram: “The quick brown fox jumps over a lazy dog.” processing. Shifting the analysis and correction focus to
This pangram is a comprehensive assessment of the glove’s the word level rather than the sentence level could enhance
proficiency in constructing sentences, given that it includes system speed. Additionally, incorporating predictive features
all the alphabets. The glove operated with the support of a for guessing the following words and sentence endings holds
battery pack linked to the bottom of the glove. promise for refinement.
To broaden accessibility, transitioning the entire system to
the IBM platform using Node-RED is proposed. While this
move facilitates easy access from any device and location, it
might introduce a potential slowdown in online processing.
Furthermore, it’s likely to integrate this project with video
chatting software, enabling real-time decoding and display
of gestures during meetings. This innovation could offer a
novel meeting experience for individuals who use SL as their
primary mode of communication.

5. Conclusion
The study introduces GloSign, a glove that converts SL
movements into words and phrases. The glove’s IMU and flex
Fig. 74.6 Accuracy of k values sensors send sign-language gesture data to the IBM Watson
IoT platform. The system employs KNN to distinguish
This article looks closer at gloves that can interpret ASL between complex or similar movements. The gesture repair
gestures. By incorporating ML and sentence-level error technique corrects word and sentence errors once identifying
correction, the accuracy of the system’s output has been letters are combined into sentences. The system outputs the
500 Algorithms in Advanced Artificial Intelligence

converted text both visually and audibly for user convenience. International Conference on Power, Control, Signals and
Further research is required to enhance the accuracy and Instrumentation Engineering, ICPCSI 2017, pp. 1840–1844,
speed of sign language gesture verification. 2018, doi: 10.1109/ICPCSI.2017.8392033.
13. S. M. Biju, H. Z. Sheikh, M. F. Malek, F. Oroumchian, and A.
Bell, “Design of grip strength measuring system using fsr and
References flex sensors using svm algorithm,” IAES International Journal
of Artificial Intelligence, vol. 10, no. 3, pp. 676–686, 2021,
1. G. Kumar, M. K. Gurjar, and S. B. Singh, “American sign doi: 10.11591/IJAI.V10.I3.PP676-686.
language translating glove using flex sensor,” Imperial journal 14. S. M. Biju and H. Z. Sheikh, “Sensor evaluation for hand grip
of interdisciplinary research 2, no. 6, pp. 1439–1441, 2016. strength,” International Journal of Electrical and Computer
2. S. Kumuda and P. K. Mane, “Smart assistant for deaf and dumb Engineering (IJECE), vol. 12, no. 5, p. 4756, Oct. 2022, doi:
using flexible resistive sensor: implemented on LabVIEW 10.11591/ijece.v12i5.pp4756-4764.
platform,” Proceedings of the 5th International Conference on 15. S. Bin Rizwan, M. S. Z. Khan, and M. Imran, “American sign
Inventive Computation Technologies, ICICT 2020, pp. 994– language translation via smart wearable glove technology,”
1000, 2020, doi: 10.1109/ICICT48043.2020.9112553. RAEE 2019 - International Symposium on Recent
3. Rao VV, Silpa N, Gadiraju M, Shankar RS, Vijaya K. An Advances in Electrical Engineering, 2019, doi: 10.1109/
Optimal Machine Learning Model Based On Selective RAEE.2019.8886931.
Reinforced Markov Decision To Predict Web Browsing 16. B. G. Lee and S. M. Lee, “Smart wearable hand device for
Patterns. Journal of Theoretical and Applied Information sign language interpretation system with sensors fusion,”
Technology. 2023 Jan 31;101 (2):859-73. IEEE Sensors Journal, vol. 18, no. 3, pp. 1224–1232, 2018,
4. Reddy SS, Gadiraju M, Maheswara Rao VV. Analyzing Student doi: 10.1109/JSEN.2017.2779466.
Reviews on Teacher Performance Using Long Short-Term 17. R. Ambar, C. K. Fai, M. H. Abd Wahab, M. M. Abdul
Memory. InInnovative Data Communication Technologies Jamil, and A. A. Ma’Radzi, “Development of a Wearable
and Application: Proceedings of ICIDCA 2021 2022 Feb 24 Device for Sign Language Recognition,” Journal of Physics:
(pp. 539-553). Singapore: Springer Nature Singapore. Conference Series, vol. 1019, no. 1, 2018, doi: 10.1088/1742­
5. Starner T, Pentland A. Real-time american sign language 6596/1019/1/012017.
recognition from video using hidden markov models. 18. S. Vutinuntakasame, V. R. Jaijongrak, and S. Thiemjarus, “An
InProceedings of International Symposium on Computer assistive body sensor network glove for speech- and hearing-
Vision-ISCV 1995 Nov 21 (pp. 265-270). IEEE. impaired disabilities,” Proceedings - 2011 International
6. Shiva Shankar R, Ravibabu D. Digital Report Grading Using Conference on Body Sensor Networks, BSN 2011, pp. 7–12,
NLP Feature Selection. InSoft Computing in Data Analytics: 2011, doi: 10.1109/BSN.2011.13.
Proceedings of International Conference on SCDA 2018 2019 19. K. Kadam, R. Ganu, A. Bhosekar, and S. D. Joshi, “American
(pp. 615-623). Springer Singapore. sign language interpreter,” Proceedings - 2012 IEEE 4th
7. Chuan CH, Regina E, Guardino C. American sign language International Conference on Technology for Education, T4E
recognition using leap motion sensor. In2014 13th International 2012, pp. 157–159, 2012, doi: 10.1109/T4E.2012.45.
Conference on Machine Learning and Applications 2014 Dec 20. M. S. Amin, M. T. Amin, M. Y. Latif, A. A. Jathol, N. Ahmed, and
3 (pp. 541-544). IEEE. M. I. N. Tarar, “Alphabetical gesture recognition of American
8. Shankar RS, Rajanikanth J, Sivaramaraju VV, Murthy KV. sign language using e-voice smart glove,” Proceedings - 2020
Prediction of employee attrition using datamining. In2018 ieee 23rd IEEE International Multi-Topic Conference, INMIC
international conference on system, computation, automation 2020, 2020, doi: 10.1109/INMIC50486.2020.9318185.
and networking (icscan) 2018 Jul 6 (pp. 1-8). IEEE. 21. Shankar RS, Murthy KV, Rao CS, Gupta VM. An approach
9. VVR MR, Silpa N, Gadiraju M, Reddy SS, Bonthu S, for extracting tweets from social media factors. In2018 ieee
Kurada RR. A Plausible RNN-LSTM based Profession international conference on system, computation, automation
Recommendation System by Predicting Human Personality and networking (icscan) 2018 Jul 6 (pp. 1-7). IEEE.
Types on Social Media Forums. In2023 7th International 22. Maheswara Rao VV, Silpa N, Mahesh G, Reddy SS. An En­
Conference on Computing Methodologies and Communication hanced Machine Learning Classification System to Investigate
(ICCMC) 2023 Feb 23 (pp. 850-855). IEEE. the Status of Micronutrients in Rural Women. InProceedings
10. Assan M, Grobel K. Video-based sign language recognition of International Conference on Recent Trends in Computing:
using hidden markov models. InInternational Gesture ICRTC 2021 2022 (pp. 51-60). Springer Singapore.
Workshop 1997 Sep 17 (pp. 97-109). Berlin, Heidelberg: 23. Pigou L, Dieleman S, Kindermans PJ, Schrauwen B. Sign
Springer Berlin Heidelberg. language recognition using convolutional neural networks.
11. H. Joshi, S. Bhati, K. Sharma, and V. Matai, “Detection of fin­ InComputer Vision-ECCV 2014 Workshops: Zurich,
ger motion using flex sensor for assisting speech impaired,” Switzerland, September 6-7 and 12, 2014, Proceedings, Part
International Journal of Innovative Research in Science, Engi­ I 13 2015 (pp. 572-578). Springer International Publishing.
neering and Technology, vol. 6, no. 10, pp. 20798–20804, 2017. 24. Shankar RS, Srinivas LV, Ravibabu D, Raminaidu C. Novice
12. G. Sabaresh and A. Karthi, “Design and implementation of a Retroaction Report. ARPN Journal of Engineering and
sign-to-speech/text system for deaf and dumb people,” IEEE Applied Sciences. 2006;13.
Empowering Inclusive Communication: Advancements in Wearable Technology with GloSign 501

25. Metaxas D, Dilsizian M, Neidle C. Scalable ASL sign 37. S. S. Ahmed, H. Gokul, P. Suresh, and V. Vijayaraghavan,
recognition using model-based machine learning and “Low-cost wearable gesture recognition system with minimal
linguistically annotated corpora. Insign-lang@ LREC 2018 user calibration for asl,” Proceedings - 2019 IEEE International
2018 May 12 (pp. 127-132). European Language Resources Congress on Cybermatics: 12th IEEE International
Association (ELRA). Conference on Internet of Things, 15th IEEE International
26. Shankar RS, Babu DR, Murthy KV, Gupta V. An approach Conference on Green Computing and Communications, 12th
for essay evaluation using system tools. In2017 International IEEE International Conference on Cyber, Physical and Social
Conference on Innovative Research In Electrical Sciences Computing and 5th IEEE International Conference on Smart
(IICIRES) 2017 Jun 16 (pp. 1-9). IEEE. Data, iThings/GreenCom/CPSCom/SmartData 2019, pp.
27. Srinivas LV, Raminaidu C, Ravibabu D, Reddy SS. A 1080–1087, 2019, doi: 10.1109/iThings/GreenCom/CPSCom/
framework to recognize the sign language system for deaf SmartData.2019.00185.
and dumb using mining techniques. Indonesian Journal 38. A. Arif, S. T. H. Rizvi, I. Jawaid, M. A. Waleed, and M. R.
of Electrical Engineering and Computer Science. 2023 Shakeel, “Techno-talk: an American sign language (ASL)
Feb;29(2):1006-16. translator,” International Conference on Control, Decision and
28. Chong TW, Lee BG. American sign language recognition Information Technologies, CoDIT 2016, pp. 665–670, 2016,
using leap motion controller with machine learning approach. doi: 10.1109/CoDIT.2016.7593642.
Sensors. 2018 Oct 19;18(10):3554. 39. J. Wu, L. Sun, and R. Jafari, “A wearable system for
29. Shankar RS, Gupta VM, Priyadarshini V, Neelima P. PS recognizing american sign language in real-time using IMU
protocol to detect fire in forest and fire alert system using and surface EMG sensors,” IEEE Journal of Biomedical and
sensors. InAIP Conference Proceedings 2022 Dec 9 (Vol. Health Informatics, vol. 20, no. 5, pp. 1281–1290, 2016, doi:
2576, No. 1). AIP Publishing. 10.1109/JBHI.2016.2598302.
30. Dong C, Leu MC, Yin Z. American sign language alphabet 40. K. S. Abhishek, L. C. F. Qubeley, and D. Ho, “Glove-based
recognition using microsoft kinect. InProceedings of the hand gesture recognition sign language translator using
IEEE conference on computer vision and pattern recognition capacitive touch sensor,” 2016 IEEE International Conference
workshops 2015 (pp. 44-52). on Electron Devices and Solid-State Circuits, EDSSC 2016,
31. Shiva Shankar R, Devareddi R, Mahesh G, MNSSVKR Gupta pp. 334–337, 2016, doi: 10.1109/EDSSC.2016.7785276.
V. Develop a Smart Data Warehouse for Auto Spare Parts 41. S. A. Mehdi and Y. N. Khan, “Sign language recognition
Autonomous Dispensing and Rack Restoration by Using IoT using sensor gloves,” ICONIP 2002 - Proceedings of the 9th
with DDS Protocol. InComputer Networks, Big Data and International Conference on Neural Information Processing:
IoT: Proceedings of ICCBI 2021 2022 May 22 (pp. 879-895). Computational Intelligence for the E-Age, vol. 5, pp. 2204–
Singapore: Springer Nature Singapore. 2206, 2002, doi: 10.1109/ICONIP.2002.1201884.
32. Devareddi RB, Shankar RS, Mahesh G. IoT Protocol for 42. J. M. Allen, P. K. Asselin, and R. Foulds, “American sign
Inferno Calamity in Public Transport. Integration of Cloud language finger spelling recognition system,” Proceedings
Computing with Internet of Things: Foundations, Analytics, of the IEEE Annual Northeast Bioengineering Conference,
and Applications. 2021 Mar 19:87-110. NEBEC, pp. 285–286, 2003, doi: 10.1109/nebc.2003.1216106.
33. N. Tanyawiwat and S. Thiemjarus, “Design of an assistive 43. Y. Khambaty et al., “Cost effective portable system for sign
communication glove using combined sensory channels,” language gesture recognition,” 2008 IEEE International
Proceedings - BSN 2012: 9th International Workshop on Conference on System of Systems Engineering, SoSE 2008,
Wearable and Implantable Body Sensor Networks, pp. 34–39, 2008, doi: 10.1109/SYSOSE.2008.4724149.
2012, doi: 10.1109/BSN.2012.17. 44. Mahesh G, Shankar Reddy S, Maheswara Rao VV, Silpa N.
34. A. Natesh, G. Rajan, B. Thiagarajan, and V. Vijayaraghavan, Preeminent Sign Language System by Employing Mining
“Low-cost wireless intelligent two hand gesture recognition Techniques. InInternational Conference on IoT Based Control
system,” 11th Annual IEEE International Systems Networks and Intelligent Systems 2023 Jun 21 (pp. 571-588).
Conference, SysCon 2017 - Proceedings, 2017, doi: 10.1109/ Singapore: Springer Nature Singapore.
SYSCON.2017.7934745. 45. D. Bajpai, U. Porov, G. Srivastav, and N. Sachan, “Two way
35. S. A. E. El-Din and M. A. A. El-Ghany, “Sign Language wireless data communication and American sign language
Interpreter System: An alternative system for machine translator glove for images text and speech display on mobile
learning,” 2nd Novel Intelligent and Leading Emerging phone,” Proceedings - 2015 5th International Conference on
Sciences Conference, NILES 2020, pp. 332–337, 2020, doi: Communication Systems and Network Technologies, CSNT
10.1109/NILES50944.2020.9257958. 2015, pp. 578–585, 2015, doi: 10.1109/CSNT.2015.121.
36. V. Pathak, S. Mongia, and G. Chitranshi, “A framework for 46. M. M. Chandra, S. Rajkumar, and L. S. Kumar, “Sign
hand gesture recognition based on fusion of Flex, Contact and languages to speech conversion prototype using the SVM
accelerometer sensor,” Proceedings of 2015 3rd International classifier,” IEEE Region 10 Annual International Conference,
Conference on Image Information Processing, ICIIP 2015, pp. Proceedings/TENCON, vol. 2019-Octob, pp. 1803–1807,
312–319, 2016, doi: 10.1109/ICIIP.2015.7414787. 2019, doi: 10.1109/TENCON.2019.8929356.
Note: All the figures and table in this chapter were designed by the
author.
Algorithms in Advanced Artificial Intelligence – Dr. Dr. R. N. V. Jagan Mohan et al. (eds)
© 2024 Taylor & Francis Group, London, ISBN 978-1-032-86798-4
502 Algorithms in Advanced Artificial Intelligence

AI-Based Voice Assistant Application for


B5G and 6G Free Space Optic Technology
is Competent of Detecting Fake Words
75

R. N. V. Jagan Mohan1
Associate Professor, Dept of CSE, Sagi Rama Krishnam Raju Engineering College, Bhimavaram
https://orcid.org/0000-0003-1457-0824
Vasamsetty Chandra Sekhar2
Professor & HoD, Sagi Rama Krishnam Raju Engineering College, Bhimavaram

Abstract: Forensic voice comparison is a process where police compare a criminal’s voice to suspects’ voices to identify them,
aiding in criminal inquiries as well as legal issues by determining the likelihood of similar speech samples. Voices contain
unique information, allowing individuals to be identified [12]. To determine if two voices belong to the same person, forensic
analysts consider their similarity and distinctiveness. Voices with similar characteristics provide stronger evidence that the
suspect and perpetrator are the same person, as fewer others would have similar characteristics. Research on the benefits of
sixth-generation (6G) networks is focusing on combining Artificial Intelligence (AI) based voice assistants with B5G and 6G
free space optic technologies. The main objectives are to develop voice assistants that can respond to specific vocal commands,
carry out scheduled operations, or perform unprepared actions for law enforcement applications or crime investigations. Speech
recognition, voice synthesis, and natural language processing (NLP) are used to achieve this. Voice assistant interact with
users using voice recognition technology. The research work displays accuracy metrics for Transformers using the attention
algorithm’s outcomes.
Keywords: Artificial intelligence, Free space optic, B5G, 6G, Nature language process, Voice assistant

1. Introduction FSO communications’ range. A Free Space Optical (FSO)


transmission system is a wireless method for connecting two
Broadband communications use free-space optics (FSO) to sites with a direct line of vision, converting conventional
transmit modulated visible or infrared beams outside the data or telecommunications signals into digital form and
atmosphere. Laser beams are common, and collimated energy delivering them into empty space by Arun, K, 2019[1].
beams can be used for longer distances. Data is modulated FSO uses optical waves as the carrier frequency to send
into visible or infrared light at the source, demodulated, and point-to-point data through the atmosphere. Its low cost,
transmitted to hardware. FSO systems can reach several ease of installation, quick deployment of communication
kilometers, but have limitations due to weather conditions. lines, particularly in the context of crisis management,
Close ranges can cause lost packets and signal problems. high bandwidth provisioning, and range of applications
Military-based assessments suggest a maximum range of have attracted the attention of the telecoms industry.
2 to 3 km, with longer dependability estimates provided FSO communication requires no license due to its
by these investigations. Relays can be used to increase operating frequency spectrum by khalingi, 2014[4].

1
mohanrnvj@gmail.com, 2dr.vcs@srkrec.ac.in

DOI: 10.1201/9781003529231-75
AI-Based Voice Assistant Application for B5G and 6G Free Space Optic Technology is Competent of Detecting Fake Words 503

FSO communication offers up to 2.5 Gbps data transfer learning to identify fake words, generating fake words
speeds, unlike RF’s 622 Mbps limit. It uses air for optical through natural language processing. The following issues
transmission of speech, video, and data. FSO requires two come across AI-based Voice Assistant Application:
systems with optical transceivers, a laser transmitter and • To build a voice assistant system, a feature extractor will
receiver, and a telescope for data gathering. Conventional RF be applied to the device communication architecture.
wireless technology offers larger bandwidth, security, fewer • AI-based voice recognition with predictive data
power needs, and portable packaging. classification for voice assistants.
FSO-based network architectures are widely utilized in both • Fake Words identification from voice text in the voice
deep space and terrestrial applications due to their primary assistants using optimization algorithm.
advantages, which include increased bandwidth, enhanced
security, and reduced installation costs. The optical network
configuration is connected to the main part of an FSO,
2. Anatomy of Voice Assistants
which gets its input from transmitters and receivers based The text provides an in-depth analysis of the anatomy of
on photodiodes and lasers. Thus, the possibility exists that voice assistants. Voice assistants and other contemporary
the unified optical network topology will contribute to technology are transforming how people use technology
FSO’s continued success in the future. This approach may by Guoming Zhng, 2017[2]. Voice assistants become
effectively address the problems associated with current smarter as artificial intelligence (AI) develops, allowing
FSO models, including co-channel interference impairments, consumers to interact with their devices in a more organic
FSO aiming, and RF-related problems. In addition, using and intuitive way by Swahili, 2018[7]. Voice assistants are
a composite structure of optical networks and FSO can revolutionizing daily tasks by setting reminders, answering
enhance the performance of 5G services. Fifth-generation questions, controlling smart home technology, and providing
cellular technology has been replaced by sixth-generation personalized recommendations. They use natural language
wireless, or 6G. Because 6G networks can run at higher processing and machine learning to understand spoken
frequencies than 5G networks, they will have significantly commands, benefiting multitaskers and mobility-impaired
more bandwidth and latency. One of the objectives of the users by simplifying device control by Jingjiin, 2023[3].
6G internet is one-microsecond latency in communications. The Internet of Things (IoT) is growing quickly, and as a
Sixth generation (5G) will advance sustainability in multiple result, voice assistants are being incorporated into more and
ways. Data collection and closed-loop control of numerous more facets of our life. Speech assistants may now be used to
devices could be facilitated by its faster and less expensive manage and control smart home appliances like thermostats,
cost-per-bit communication. Mehtab, 2016[6] can analyze lighting controls, and security cameras with straightforward
the data using cutting-edge approaches to improve energy speech commands. This seamless connection allows users
efficiency. to easily monitor and change their home settings from
The Central Bureau of Investigation is utilizing voice anywhere, which not only increases convenience but also
samples in crime investigations, ensuring the right to privacy promotes energy efficiency and security.
and considering legal and ethical considerations in court. Voice assistants are also importantly enhancing user
The Central Bureau of Investigation has received a request interaction and satisfaction. By leveraging AI-powered
for a political leader’s voice samples to corroborate his customization algorithms, voice assistants can provide
alleged involvement in the 1984 antisikh riot case. Voice customers with unique recommendations based on their
samples are crucial in criminal investigations as they enable preferences, habits, and prior interactions [5]. Voice
investigators to verify evidence and identify suspects. Voice assistants offer personalized experiences, enhancing user
differences in recordings can help law enforcement identify experience and fostering a stronger connection between users
suspects and criminal trials, especially in criminal trials due and their devices. They are valuable tools for learning and
to the increasing prevalence of Smartphone audio and video career advancement, providing quick access to information
recording by Wormald J et al, 2022[12]. The credibility of and resources. However, concerns about security and privacy
a sample is significantly impacted by the expert’s technique arise due to cloud-based processing and storage, prompting
and court analysis, as potential inaccuracies may arise from significant investments in encryption and security measures.
medication effects or colds. The paper explores the use of 6G In addition, concerns have been voiced concerning the possible
technology for sustainability and energy efficiency in various effects of voice assistants on social interaction and human
sectors. It proposes an AI-based system for voice assistants, communication. While voice assistants can make technology
predicting micro-level traits affecting behavior. The study communication more effective and easy by Ronan Collobert,
also suggests a speech recognition application using machine 2011[8], others fear that an over dependence on these tools
504 Algorithms in Advanced Artificial Intelligence

affects player experience. Major gaming platforms like


Stream and Roblox emphasize moderation and community
standards.

3. Understanding Text Classification


in NLP
The text provides an in-depth understanding of text
classification in Natural Language Processing (NLP). Texts
and speeches make up unstructured data in this format.
Although there is a lot of it, it might be challenging to
Fig. 75.1 Process of voice assistants using NLP
extract relevant information. If not, mining the data would
be time-consuming. Both vocal and written language is rich
will result in a deterioration in face-to-face encounters and a
in information. It is because writing and speech are the two
loss of empathy and emotional connection.
main ways we communicate as sentient beings. NLP can
NLP use in Voice Recognition: An important development perform tasks like voice assistants, spotting bogus news, and
in the field of artificial intelligence is the use of Natural real-time language translation for us while analyzing this
Language Processing (NLP) for speech detection. Voice data. Text organization into logical groups is referred to as
recognition converts spoken words into structured text, text classification, text category or tagging. Automatic text
whereas NLP interprets meaning from text input. Voice analysis using Natural Language Processing (NLP) may
recognition and NLP are complementary but separate from result in the application of a number of predetermined tags
one another. Both are employed in use cases involving voice or categories based on the content. In this regard, Computer-
control, speech analytics, and governance. based classifiers acquire the ability to categorize objects
Natural language processing, or NLP, is the method by based on prior observations from data sets. There are labels
which voice assistants, such as Google Assistant and Alexa for “user data” and “test data.” It continuously learns by
from Amazon, understand and react to spoken commands. assembling the categorization mechanism from the prior
Speech-to-Text, a subfield of NLP, transcribes commands inputs. By leveraging a word bank, machine-based classifiers
and triggers actions using natural language understanding. can broaden their feature set. A vector in a bag of words
Natural Language Generation, a branch of NLP, enabled is used to express the frequency of words in a specified
Siri’s response to “Set an alarm for 7:30 AM.” dictionary or word list. Deep Learning is a machine learning
technique that can be used to implement NLP by Subba
Social listening, a popular method for monitoring social
Reddy in 2022[10].
media posts and comments, is now gaining popularity
among younger generations. Voice Recognition and Natural
Language Processing (NLP) can enhance these monitoring, 4. Fake Words Identification from
enabling enterprises to understand speakers’ semantic Voice Text Estimation Using
and vocal emotions, beyond Speech-to-Text. Voice Chat
Monitoring and Moderation is used by call centers to
Optimization Algorithm
comply with regulations and train agents. Advances in Voice The text words is an estimate of the minimum words provided
Recognition have improved accuracy and reduced costs. It’s t“(w) > 0 orc2 > 0, which depends only on the choice of the
also used in multiplayer games, where online harassment three fake words. Among the four words (w1, w2, w3 and w),

Fig. 75.2 Text classification using NLP


AI-Based Voice Assistant Application for B5G and 6G Free Space Optic Technology is Competent of Detecting Fake Words 505

the best fake three words are kept and a new interpolated
function t(w) is found again. This procedure continues until
two consecutive estimates are close to each other.
Let w1 be an initial word and entire text i.e., D be the step size.
Compute w2 = w1 + D.
Evaluate t(w1) and t(w2).
If t(w1) > t(w2), let w3 = w1 + 2D; ElseLet w3 = w1 – D.
Evaluatet(w3).
Determine Tmin = min(t1, t2, t3) and Wmin is the word wi that
corresponds to Tmin. Use words w1, w2, and w3 to calculate w.
Are |Tmin – t(w)| and |Tmin – w| small? If not, go to step-7; Else
the optimum is the fake of current four words and Terminate.
Save the fake word and two bracketing it, if possible;
otherwise, save the fake three words. Reliable them according
to w1 < w2 < w3 and go to Step-4.
In the above algorithm, no check is made to satisfy c2 > 0. The
same can be incorporated in Step-5. If c2 is found to be not
Fig. 75.3 Transformers are encroaching on Machine Learning
found the fake words (i.e., negative), one of the three words
may be replaced by a random text. This process is continued By repeatedly using this strategy, we can generate a large
until the quantity c2 becomes nonnegative. number of Attentions and develop a multi-head attention
layer. This improves student understanding of potential
5. Transformers are Encroaching on word connections while they are studying. Only an attention
layer, a few feed-forward layers, some leftover ResNet units,
Machine Learning and layer normalizations make up the initial Transformer
The Transformers’ Attention process is easy to comprehend block. A “Transformer” model typically consists of several
if one can approach it organically. For instance, the string Transformer blocks arranged in order. The great majority of
of consecutive inputs that a transformer receives might be language models follow this basic framework.
thought of as the words (or tokens) of a phrase. It is evident
that time series, pictures, or sound data might be contiguous 6. Experimental Result
inputs. We are aware that a word could show up in an
The accuracy metrics taken from the research work, the
embedding of a vector. The vectorization of the fake word
accuracy metrics for the Transformers: The outcomes of
may additionally take into account the word’s placement
using the attention algorithm are displayed below.
inside the input text. We have three matrices, Wq, Wk, and
Wv, which divide each of the input embedding vectors into Accuracy: When all classes are equally relevant, accuracy—a
three distinct vectors, the Query, the Key, and the Value, model’s performance across all classes—is desirable. It is
as part of the attention mechanism. This terminology is computed as the fraction of correct guesses divided by the
developed from a retrieval system. total number of forecasts.
The connected Query vectors for each word and the Key Sensitivity: The ability to identify good instances is measured
vectors for each additional word are then dot-produced. by the sensitivity. It is often referred to as the recall rate or
The definition of “attention” is determined by the focus of true positive rate.
one word in a sentence on another word to comprehend its Specificity: The specificity of a model determines how well
meaning. This exemplifies how similar the Queries and the it can predict real negatives in each available category.
Keys are. The great resemblance of the resulting vector is Precision: By dividing the total number of positive samples
further highlighted and normalized by a Softmax treatment. by the number of positively identified samples that were
As a result, every word has its own vector. Compute the dot accurately identified, precision may be calculated.
products with the Value vectors of all the other words for each
of the resulting vectors. The calculation of self-attention is F1-Score: The F1-score is a single statistic that is produced
by taking the harmonic mean of the precision and recall
finished.
506 Algorithms in Advanced Artificial Intelligence

of a classifier. Usually, it is used to compare how well two References


classifiers work.
1. Arun K. Majumdar: Fundamentals of Free-Space Optical
Table 75.1 Graph for comparing algorithms Transformers: Communications Systems, Optical Channels, Characterization,
Attention and Network/Access Technology, ScienceDirect, Elsevier,
2019.
Algorithm Accu­ Sensi­ Speci­ Preci­ F1­
racy tivity ficity sion score
2. Guoming Zhang, Chen Yan, Xiaoyu Ji, Tianchen Zhang,
Taimin Zhang, and Wenyuan Xu: Dolphin Attack: Inaudible
Transformers: 0.97 0.98 0.96 0.96 0.98 Voice Commands. In Proceedings of the 2017 ACM SIGSAC
Attention
Conference on Computer and Communications Security
(Dallas, Texas, USA) (CCS ’17), Association for Computing
Machinery, New York, NY, USA, 103–117. https://doi.org/10.
1145/3133956.3134052,2017.
3. Jingjin Li, Chao Chen, Lei Pan, Mostafa Rahimi Azghadi,
Hossein Ghodosi, and Jun Zhang. 2023. Security and Privacy
Problems in Voice Assistant Applications: A Survey, 1, 1,
April 2023, 19 pages, https://doi.org/10.1145/ 2023.
4. Khalighi, M. A.; Uysal, M: Survey on Free Space Optical
Communication: A Communication Theory Perspective, IEEE
Communications Surveys & Tutorial, 16 (4):2231–2258, doi:
10.1109/COMST. 2014.2329501. S2CID 3141460, 2014.
5. K. Vikram Reddy: Personal Voice Assistant, JETIR June 2020,
Volume 7, Issue 6,www.jetir.org, ISSN-2349-5162, 2020.
Fig. 75.4 Graph for different accuracy metrics on 6. Mehtab Singh: Performance Analysis of FSO Link under
Transformers: Attention Different Weather Conditions and Modulation Formats,
International Journal of Signal Processing, Image Processing
The above graphs show the different accuracy metrics like and Pattern Recognition Vol.9, No.5,pp.51-58,http://dx.doi.
accuracy, sensitivity, specificity, precision and F1-score of org/10.14257/ijsip. 2016.9.5.05, 2016.
NLP algorithm of Fake Words Identification got the best 7. P. Shanmuga Sundari, Roy Pushpavilasam Veettil: Artificial
result. Intelligence & Voice Assistants, : Royal Book Publishing,ISBN:
9789391131296, DOI:10.26524/royal.109,June 2022.
8. Ronan Collobert et al : Natural Language Processing (Almost)
7. Conclusion from Scratch, Journal of Machine Learning Research 12,2493­
2537 Submitted 1/10; Revised 11/10; Published 8/11,2011.
Free space optics (FSO) offers a cost-effective and efficient 9. Sawhil, Swadha Agarwal, Yashasvi Singhal, Priyanka
way to connect to the fiber optic backbone, providing the most Bhardwaj: An Overview of Free Space Optical Communication,
affordable transmission capacity in the broadband industry. International Journal of Engineering Trends and Technology,
FSO solutions increase network investments and function as https://ijettjournal.org › volume-55 › number-3, 03-Jan-2018.
protocol-independent broadband conduits, saving significant 10. Subba Reddy, K. Sesha Shai Datt, A. Tarun, S. Ajay
upfront capital expenses. Varma: Voice Based System Assistant Using Nlp And Deep
Learning, International Research Journal of Modernization in
Voice differences, both biological and behavioral, can be Engineering Technology and Science, Volume: 04/Issue: 05/
compared to determine the likelihood of two recordings May-2022.
coming from the same person. This voice comparison 11. Shishupal, R. S, Varsha, Supriya Mane, Vinita Singh, Damini
evidence can aid law enforcement in identifying suspects or Wasekar: Virtual Assistant for Prediction of Fake Job Profile
in criminal trials, especially with the increasing prevalence Using Machine Learning, International Journal of Advanced
of Smartphone audio and video recording. This paper is to Research in Science, Communication and Technology
build a voice assistant system; a feature extractor will be (IJARSCT), ISSN (Online) 2581-9429,Volume 3, Issue 2,
applied to the device communication architecture. AI-based March 2021.
voice recognition with predictive data classification for 12. Wormald J et al: How Voice Analysis Can Help Solve
voice assistants developed system. The study focused on Crimes, Frontiers Young Minds, 10:702664,doi:10.3389/
frym.2022.702664, 2022.
identifying fake words from voice text in voice assistants
using an optimization algorithm. Note: All the figures and table in this chapter were designed by the
author.

76. Generative AI in Personal Diary Information Retrieval for Criminal Investigation

KVSS Murthy1, J. Rajanikanth2, R. Shiva Shankar3, CH. Ravi Swaroop4, D. Ravibabu5
Dept. of Computer Science and Engineering, Sagi Ramakrishnam Raju Engineering College, Bhimavaram, Andhra Pradesh, INDIA
1kvssrmurthy75@gmail.com, 2rajanikanth.1984@gmail.com, 3shiva.csesrkr@gmail.com, 4raviswaroop.chigurupati@gmail.com, 5raviswaroop.chigurupati@gmail.com
DOI: 10.1201/9781003529231-76

Abstract: Criminal investigation involves studying facts for trials, using forensic science techniques. A wide-ranging criminal investigation involves various methods such as searching, interviews, interrogations, evidence collection, and preservation. During the investigation of specific crimes, recent demographic changes have been observed, indicating a higher priority for their investigation. This paper explores crime investigation using Generative AI and Information Retrieval, focusing on the personal diary and its impact on crime involvement probability using Bayes Belief Network classification. Retriever Augmented Generation (RAG) is an AI framework that utilizes external knowledge to provide accurate and current crime information for large language models (LLMs) and offer users insights into their generative process.
Keywords: Artificial Intelligence, Bayes Belief Network, Crime Investigation, LLM, Retriever Augmented Generation

1. Introduction

In the pursuit of justice, it is crucial to appraise the data that will be presented as evidence in court. Investigation in the criminal justice system involves various processes such as searches, communication with individuals, interviews, collection and preservation of evidence, and other investigative procedures [1]. All of these procedures are necessary to conduct a comprehensive criminal investigation. Modern scientific techniques known as forensic science are commonly used in criminal investigations. The regulation stipulated that the evidence had to be presented by both the accuser and the accused. In the current day, government police units are typically in charge of conducting criminal investigations [2]. Private investigators frequently conclude or support criminal investigations.

The term "information retrieval" (IR) refers to a software application that was developed for the purpose of organizing, storing, retrieving, and analyzing information collected from document repositories, particularly textual data. Millions of individuals, including librarians and professional searchers, make extensive use of it, and it is commonly regarded as the principal method of gaining access to information. IR notifies users about documents that contain the required information and assists in browsing or filtering document collections. It searches over billions of documents on millions of computers, using keywords to summarize information descriptions [3]. An Information Retrieval (IR) model selects and ranks documents based on user queries, using a matching function to return a retrieval status value (RSV) [4]. These systems use terms from a vocabulary V, and determine the query-document matching function using four main approaches [7].

The acquisition step involves selecting documents and objects from text-based web resources, collecting data from web crawlers, and storing it in a database. The representation includes indexing, summarizing, and bibliographic descriptions, including author, title, sources, data, and metadata, using both manual and automatic techniques.


File organization methods include sequential, which stores documents by document data, and inverted, which lists records under each term, or a combination of both. An IR process initiates when a user enters a query, formalizing information needs and matching multiple objects with different relevancy levels in the collection [5].

Fig. 76.1 Process of information retrieval

Generative AI is an AI-powered technology that creates new content using foundation models that can perform multiple tasks with minimal training. It adapts to specific use cases with minimal example data. Generative artificial intelligence uses supervised learning to recognize patterns in human-generated material, allowing it to create similar content autonomously. The model is trained using a set of labels [6].

Generative AI enhances customer interactions, explores unstructured data, and assists with repetitive tasks like responding to proposals, localizing marketing content, and checking customer contracts for compliance. Vertex AI allows developers to customize and embed foundation models into applications without ML expertise. It offers a quick way to build generative AI-powered search engines and chatbots, and Duet AI provides assistance to users. Google's experts offer consulting services in generative AI, enabling organizations to create new content, discover trends, summarize data, and automate processes, among other services in their comprehensive Google Cloud Consulting portfolio [7].

Inaccuracies can arise in personal diary information retrieval for crime investigation by Generative AI. A case diary is essential for recording the investigation that an investigating representative conducts. During the trial, the examination could call for the case diary. The case diary, which is meant to support the trial, cannot be presented as evidence in and of itself. The name of the diary varies from state to state, and the police laws include instructions on how it should be kept. Usually referred to as a "case diary," this journal is also known as a "special diary" in some states. The case diary notes the time at which each detail was documented by the investigating representative [8].

Crime is a big problem that undermines justice and may have catastrophic implications, and the threat of breaching the law is a huge concern. Computational crime prediction and forecasting has the potential to play a significant role in improving public safety in urban areas, which is a problem that needs to be addressed. However, the enormous volume of complex big data cannot be handled by humans, which makes accurate manual criminal activity forecasts impossible. Predicting crime rates, types, and hotspots based on past trends poses both computational challenges and possibilities. Therefore, more research is necessary to improve prediction systems that can focus police patrols on crime incidents [9].

Identifying criminal hotspots through criminal analysis is a difficult technique. GIS was the dominant non-machine-learning approach to temporal and geographical data in 2020; GIS reduced crime rates by using criminal-type sites [10]. A technology that predicts crime patterns and helps law enforcement solve crimes may cut real-world crime rates. Time series approaches are used in crime forecasting in order to discover future crime trends. These methodologies make use of time series data in order to make predictions about crimes that may occur many years in the future. This approach helps prevent crime by enabling law enforcement agencies to take proactive measures to mitigate potential criminal activity before it happens.

Machine learning algorithms have made crime data analysis simpler by preprocessing and grouping raw data to extract crime locations [11]. This data has been analyzed using supervised and unsupervised machine learning models to detect crime trends by time and place, resulting in accurate predictions [12]. Machine learning algorithms applied to past data from the same site have helped investigators determine the causes of crime [13]. In [14], the authors removed the noise in images and analysed the performance by using SGO and APSO.

They used filters to remove the noise in PGM images [15]. After that, the images were retrieved by using a genetic model [16]. Other techniques have been used to identify flower species by using deep learning models [17], to detect counterfeits in scanned documents by using a neural network [18], to colourize gray-scale images [19], and to sharpen images by using a CNN [20].

1.1 Challenges in Prediction Systems

Predicting crime is a complex task for both researchers and security agents. They face difficulties in determining the location and time of the crime, as well as in choosing the most effective method to predict it. Researchers working in the field of computer science who use techniques such as data mining, machine learning, and spatial-temporal data also encounter obstacles. In 2012 and 2016, algorithms that anticipate crimes in households, streets, and regions were established; these approaches were referred to as near-repeat-victimization and repeat-victimization, respectively. The results of these methodologies indicate that if a crime is committed in a certain location, there is a substantial possibility that the number of subsequent crimes committed in the same location will dramatically rise.

Developers of crime prediction systems face several challenges:
a. The necessity for extensive storage owing to the high volume of data.
b. A diverse range of formats, including text, images, graphs, audio, and structured and semi-structured data, can be used to store information related to criminal activity.
c. Creating a comprehensible structure for this data is equally challenging.
d. A data mining strategy must be utilized that produces better results than the algorithms already in use, in order to effectively identify instances in machine learning.
e. Crime prediction systems may be influenced by environmental elements like weather and lawlessness, leading to significant mistakes [21].

Therefore, in order to avoid mistakes of this kind and to reach a high level of prediction accuracy, any crime forecast must take into account changes in the surrounding environment.

2. Proposed Work

This work uses artificial intelligence for generative information retrieval. The transformer AI architecture supports LLMs in addition to the other foundation models. A condensed representation of the underlying structure of the raw data is created. A foundation model can be refined on tagged, domain-specific data, starting from this core representation, to fit a crime investigation application [22]. The contributions of the proposed work are as follows:
• Uncertainty crime investigation using Bayes Belief Network classification.
• RAG, an AI framework that uses external knowledge to provide accurate, up-to-date information for large language models (LLMs) and to give users insight into their generative process.

Uncertainty Crime Using Bayes Belief Network in Artificial Intelligence: The Bayesian belief network (BBN) is a computational technique applied to manage uncertain and probabilistic occurrences in investigative problems involving criminal activity. It is a probabilistic graphical model representing variables and their dependencies, using probability theory for prediction and anomaly detection. A BBN is used in real-world applications for prediction, anomaly detection, diagnostics, and decision making under uncertainty. The BBN consists of a Directed Acyclic Graph and a table of conditional probabilities, together also known as an influence diagram. The network structure reveals that Crime Investigation and Information Retrieval are the parent nodes of Personal Diary, directly affecting the probability of a person's involvement in a crime [23].

Fig. 76.2 Uncertain crime investigation using Bayes belief network
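To make the structure of Fig. 76.2 concrete, the following is a minimal sketch of such a network, assuming the pgmpy library; the node names follow the figure, but all conditional probabilities are illustrative placeholders rather than values from this study.

# Minimal BBN sketch for the Fig. 76.2 structure (illustrative CPDs, pgmpy assumed).
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# CrimeInvestigation and InformationRetrieval are parents of PersonalDiary,
# which in turn drives the probability of Involvement.
model = BayesianNetwork([
    ("CrimeInvestigation", "PersonalDiary"),
    ("InformationRetrieval", "PersonalDiary"),
    ("PersonalDiary", "Involvement"),
])

cpd_ci = TabularCPD("CrimeInvestigation", 2, [[0.7], [0.3]])
cpd_ir = TabularCPD("InformationRetrieval", 2, [[0.6], [0.4]])
# P(PersonalDiary | CrimeInvestigation, InformationRetrieval) - placeholder values.
cpd_pd = TabularCPD(
    "PersonalDiary", 2,
    [[0.9, 0.6, 0.7, 0.2],   # diary evidence absent
     [0.1, 0.4, 0.3, 0.8]],  # diary evidence present
    evidence=["CrimeInvestigation", "InformationRetrieval"],
    evidence_card=[2, 2],
)
cpd_inv = TabularCPD(
    "Involvement", 2,
    [[0.95, 0.3],
     [0.05, 0.7]],
    evidence=["PersonalDiary"], evidence_card=[2],
)
model.add_cpds(cpd_ci, cpd_ir, cpd_pd, cpd_inv)

# Query: probability of involvement given retrieved diary evidence.
posterior = VariableElimination(model).query(["Involvement"], evidence={"PersonalDiary": 1})
print(posterior)

With different evidence settings, the same query shows how diary evidence shifts the involvement probability, which is the reasoning pattern the chapter attributes to the BBN.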

Retriever Augmented Generation Using LLM: The use of crime databases to augment LLMs is a valuable method, but it has several significant flaws. The debate between fine-tuning and Retriever Augmented Generation (RAG) with LLMs is ongoing, with RAG being better for enhancing LLMs with small amounts of additional data. RAG encodes data into embeddings and indexes them into a vector database. Users ask questions, which are converted into embeddings and used to search for similar embeddings [24]. Prompts provide context for LLM answers, usually using the cosine similarity metric. The problem lies in the search's ability to retrieve documents with similar words or context without providing relevant information, leading to an excess of irrelevant documents showing higher cosine similarity than the actual answer. High cosine similarity in Transformers does not necessarily imply semantic similarity; it can also indicate the high co-occurrence of two terms within the same training data [25, 26]. The data's indexing can cause issues if it is broken down into large chunks that potentially contain unrelated information. To avoid diluted information and irrelevant documents, break down the data into a few paragraphs per chunk, ensuring uniqueness. The RAG approach emphasizes limiting the type of questions asked of the LLM. Aggregating data across the database may lead to incorrect answers, while similarity searches may find only local information; for example, if a question requires scanning all documents, similarity searches may not be helpful [27].

Fig. 76.3 Retriever augmented generation using LLM
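To illustrate the embedding-and-cosine-similarity retrieval step just described, the following is a minimal sketch assuming the sentence-transformers package; the model name, diary snippets, and chunk granularity are illustrative assumptions, not part of the original system.

# Minimal RAG retrieval sketch (sentence-transformers assumed; data is illustrative).
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Index: encode small, unique chunks (a few paragraphs each) into embeddings.
chunks = [
    "Case diary entry: suspect seen near the warehouse on 12 May.",
    "Case diary entry: interview with witness recorded on 14 May.",
    "Unrelated note: station inventory updated.",
]
index = encoder.encode(chunks)

def retrieve(query: str, k: int = 2):
    """Return the top-k chunks by cosine similarity to the query."""
    q = encoder.encode([query])[0]
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    top = np.argsort(-sims)[:k]
    return [(chunks[i], float(sims[i])) for i in top]

# The retrieved chunks would then be placed in the LLM prompt as context.
for text, score in retrieve("Who was near the warehouse?"):
    print(f"{score:.3f}  {text}")

As the chapter warns, a high cosine score here only measures embedding proximity; keeping chunks small and unique is what limits the retrieval of irrelevant documents.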

3. Prediction of Crime using Data Mining

2011 saw the development of specialized data mining technologies that were able to extract patterns from geographical and temporal data. For the purpose of forecasting criminal activity and determining the possibility of residential burglary, data from Portland was geospatially mined with the assistance of specific technical knowledge. The NB, SVM, DT, and K-Nearest Neighbor algorithms were used in order to produce predictions about criminal activities. The results of these algorithms were compared, which served to demonstrate the efficacy of neural networks in complex systems [28]. The complexity of the connections between geographical data, on the other hand, significantly reduced the value of pattern extraction. Table 76.1 shows the comparative study using DM with various datasets.

Table 76.1 Comparative study with various datasets in DM
Reference | Dataset Used | Models | Accuracy
[31] | Chicago crime data | DT | 76.1
[31] | Chicago crime data | RF | 82.59
[31] | Chicago crime data | NB | 76.63
[32] | Los Angeles crime data | NB | 64
[32] | Los Angeles crime data | DT | 48
[33] | Bangladesh crime data | NB | 70.8
[33] | Bangladesh crime data | KNN | 77.4
[34] | Indian and Bangalore crimes data | KDE | 78.83
[35] | India crime data | WEKA on two K-mean clusters | 92.52 for C1, 92.59 for C2

In 2016, a high level of accuracy was achieved in extracting information from data acquired throughout 1994 occurrences, which had 128 characteristics, using a variety of DT algorithms [29]. Scatter plots were used to represent the crime regions and their severity based on the data collected in the past for training and testing.

Also in the same year, data mining techniques were used to categorize the infractions according to the types of crimes committed [30]. The timing of the crime was classified, taking into account elements such as vacations that began with the start of the academic year for colleges and universities.

4. Prediction of Crime using Machine Learning

A number of scholarly publications on predicting criminal behavior via machine learning appeared in 2020. According to the findings of one study, three distinct methods were used to extract spatiotemporal data and forecast crimes in Chicago: long short-term memory (LSTM), a residual neural network, and a graph convolutional network. Mean absolute error and root mean square error were used to assess the efficiency of the technique [36]. Another study demonstrated the development of a crime network using spatiotemporal data [37]; this network made use of a convolutional neural network (CNN) to make predictions about the time and place of criminal actions. A time series crime prediction system for Addis Ababa was developed in a distinct piece of research published in [38]; this system was constructed by combining recurrent neural networks (RNN) and long short-term memory (LSTM). Finally, a different study [39] applied machine learning approaches such as support vector machines (SVM), naive Bayes (NB), logistic regression (LR), and decision trees (DT) to anticipate the intensity of criminal activity in Boston. Table 76.2 shows the comparative study using ML with various datasets.

Table 76.2 Comparative study with various datasets in ML
Reference | Dataset Used | Models | Accuracy
[40] | UCI machine learning repository website | J48 | 94.25
[41] | London mobile and crime data | RF | 71
[42] | Chicago crime data | DT | 39
[42] | Chicago crime data | RF | 61
[42] | Chicago crime data | NN | 82
[43] | Bangladesh crime data | LR | 72.9
[44] | New York crime data | SVM | 44
[44] | New York crime data | RF | 51
[44] | New York crime data | XGBoost | 53

5. Experimental Result

The experimental results concern criminal investigation using Retriever Augmented Generation, evaluating its performance on massive datasets and its accuracy in prediction. The study assesses the LLM model's accuracy in criminal recognition. Table 76.3 presents the personal diary information of the criminals.

Accuracy: The accuracy of a two-class classification test is a statistical indicator of its capacity to recognize or rule out a condition, based on a comparison of pre- and post-test probability estimates:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where TP = true positive, FP = false positive, TN = true negative, and FN = false negative.

The RAG with LLM technique has a 98% accuracy record in criminal identification of crime cases: it correctly identifies 98 out of 100 criminals, leaving 2 unidentified, on class-imbalanced data with significant positive/negative label variations. For example, with TP = 96, TN = 2, FP = 1, and FN = 1:

Accuracy = (96 + 2) / (96 + 2 + 1 + 1) = 0.98

Table 76.3 Personal diary information of criminals
PCN | Particulars | Age | Text size in KB | Type | Recognition
02-027-098-00-0000-S001 | Badri Ranganath, Nandyala | 32 | 400 | Suspect | Criminal
02-040-028-22-0038-A002 | Banawath Rajeswari, Vijayawada | 22 | 1000 | Accused | Criminal
02-040-028-22-00-0038-A001 | Kota Kumar | 34 | 2000 | Accused | Criminal
02-027-098-00-0000-S002 | Vasam Srinivas | 33 | 400 | Suspect | Criminal
02-040-028-22-0038-A001 | B. Ranadheer | 21 | 1000 | Accused | Criminal
02-040-028-22-0038-A002 | Kumara Rajesh | 32 | 2000 | Accused | Criminal
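As a quick check of the figures above, the accuracy formula can be computed directly from the four confusion counts; the short Python sketch below uses the counts TP = 96, TN = 2, FP = 1, FN = 1 from the worked example.

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions over all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

print(accuracy(96, 2, 1, 1))  # 0.98, i.e. 98%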

6. Conclusion

Crime prediction has become a popular topic of study due to its potential benefits to society and national security. Many researchers have applied both supervised data mining and machine learning methods in this area. It has been observed that machine learning algorithms, on average, outperform data mining algorithms in predicting crimes, and the analysis of the standard deviation of the crime prediction accuracies of both families of methods suggests that machine learning is generally more effective than data mining for crime prediction purposes. Crime investigation using generative AI and information retrieval focuses on personal diary data and its impact on crime involvement probability. It uses Bayes Belief Network classification and Retriever Augmented Generation to provide accurate crime information for large language models.

References
1. Azwad Tamir et al.: Crime Prediction and Forecasting using Machine Learning Algorithms, International Journal of Computer Science and Information Technologies, Vol. 12(2), 26-33, 2021.
2. Bandekar, S. R., & Vijayalakshmi, C.: Design and analysis of machine learning algorithms for the reduction of crime rates in India, Procedia Computer Science, 172, 122–127, https://doi.org/10.1016/j.procs.2020.05.018, 2020.
3. Bowen, D. A., Mercer Kollar, L. M., Wu, D. T., Fraser, D. A., Flood, C. E., Moore, J. C., Mays, E. W., & Sumner, S. A.: Ability of crime, demographic and business data to forecast areas of increased violence, International Journal of Injury Control and Safety Promotion, 25(4), 443–448, https://doi.org/10.1080/17457300.2018.1467461, 2018.
4. E. Ahishakiye, E. Opiyo, and I. Niyonzima: Crime Prediction Using Decision Tree (J48) Classification Algorithm, International Journal of Computer and Information Technology (ISSN: 2279-0764), 05/15, 2017.
5. Forradellas, R. F. R., Alonso, S. L. N., Rodriguez, M. L., & Jorge-Vazquez, J.: Applied machine learning in social sciences: Neural networks and crime prediction, Social Sciences, 10(1), 1–20, https://doi.org/10.3390/socsci10010004, 2021.
6. Gao, Y., Wang, X., Chen, Q., Guo, Y., Yang, Q., Yang, K., & Fang, T.: Suspects prediction towards terrorist attacks based on machine learning, in Proceedings – 2019 5th International Conference on Big Data and Information Analytics, BigDIA 2019 (pp. 126–131), https://doi.org/10.1109/BigDIA.2019.8802726, 2019.
7. Jha, G., Ahuja, L., & Rana, A.: Criminal behavior analysis and segmentation using K-means clustering, ICRITO 2020 – IEEE 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions), 1356–1360, https://doi.org/10.1109/ICRITO48877.2020.9197791, 2019.
8. Kadar, C., Maculan, R., & Feuerriegel, S.: Public decision support for low population density areas: An imbalance-aware hyper-ensemble for spatio-temporal crime prediction, Decision Support Systems, 119, 107–117, https://doi.org/10.1016/j.dss.2019.03.001, 2019.
9. Safat W, Asghar S, Gillani SA: Empirical analysis for crime prediction and forecasting using machine learning and deep learning techniques, IEEE Access, 2021 May 6;9:70080-94.
10. Kounadi O, Ristea A, Araujo A, Leitner M: A systematic review on spatial crime forecasting, Crime Science, 2020 Dec;9:1-22.
11. Khairuddin AR, Alwee R, Haron H: A comparative analysis of artificial intelligence techniques in forecasting violent crime rate, in IOP Conference Series: Materials Science and Engineering, 2020 May 1 (Vol. 864, No. 1, p. 012056), IOP Publishing.
12. Sardana D, Marwaha S, Bhatnagar R: Supervised and unsupervised machine learning methodologies for crime pattern analysis, International Journal of Artificial Intelligence and Applications (IJAIA), 2021 Jan 2;12(1).
13. Sivanagaleela B, Rajesh S: Crime analysis and prediction using fuzzy c-means algorithm, in 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), 2019 Apr 23 (pp. 595-599), IEEE.
14. K. Dileep Kumar: Unsupervised based Crimes Cluster Data Using Decision Tree Classification, Solid State Technology, Volume 63, Issue 5, 2020.
15. Kim, S., Joshi, P., Kalsi, P. S., & Taheri, P.: Crime analysis through machine learning, in 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference, IEMCON 2018 (pp. 415–420), https://doi.org/10.1109/IEMCON.2018.8614828, 2019.
16. Li, Z., Zhang, T., Jing, X., & Wang, Y.: Facial expression-based analysis on emotion correlations, hotspots, and potential occurrence of urban crimes, Alexandria Engineering Journal, 60(1), 1411–1420, https://doi.org/10.1016/j.aej.2020.10.061, 2021.
17. R. N. V. Jagan Mohan: Crime Data Optimization Using Neutrosophic Logic, Concurrency and Computation: Practice and Experience, Wiley Online Library, 29 March 2022, ISSN 1532-0634, https://doi.org/10.1002/cpe.6973.
18. Saravanan, P., Selvaprabu, J., Arun Raj, L., Abdul Azeez Khan, A., & Javubar Sathick, K.: Survey on crime analysis and prediction using data mining and machine learning techniques, Lecture Notes in Electrical Engineering, 688, 435–448, https://doi.org/10.1007/978-981-15-7241-8_3, 2021.
19. Shukla, S., Jain, P. K., Babu, C. R., & Pamula, R.: A multivariate regression model for identifying, analyzing and predicting crimes, Wireless Personal Communications, 113(4), 2447–2461, https://doi.org/10.1007/s11277-020-07335-w, 2020.
20. Tayal DK, Jain A, Arora S, Agarwal S, Gupta T, Tyagi N: Crime detection and criminal identification in India using data mining techniques, AI & Society, 2015 Feb;30:117-27.
21. Gupta VM, Murthy KV, Shankar RS: A novel approach for image denoising and performance analysis using SGO and APSO, in Journal of Physics: Conference Series, 2021 Nov 1 (Vol. 2070, No. 1, p. 012139), IOP Publishing.

22. Shankar RS, Gupta VM, Murthy KV, Someswararao C: Object oriented fuzzy filter for noise reduction of PGM images, in 2012 8th International Conference on Information Science and Digital Content Technology (ICIDT 2012), 2012 Jun 26 (Vol. 3, pp. 776-782), IEEE.
23. Shankar RS, Sravani K, Srinivas LV, Babu DR: An approach for retrieving an image using Genetic Algorithm, International Journal of Latest Trends in Engineering and Technology, 2017;9(8):057-64.
24. Shankar RS, Srinivas LV, Raju VS, Murthy KV: A Comprehensive Analysis of Deep Learning Techniques for Recognition of Flower Species, in 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), 2021 Feb 4 (pp. 1172-1179), IEEE.
25. Devareddi RB, Shankar RS, Murthy K, Raminaidu C: Image segmentation based on scanned document and hand script counterfeit detection using neural network, in AIP Conference Proceedings, 2022 Dec 9 (Vol. 2576, No. 1), AIP Publishing.
26. Shankar RS, Mahesh G, Murthy KV, Ravibabu D: A novel approach for gray scale image colorization using convolutional neural networks, in 2020 International Conference on System, Computation, Automation and Networking (ICSCAN), 2020 Jul 3 (pp. 1-8), IEEE.
27. Shankar RS, Mahesh G, Murthy KV, Rajanikanth J: A novel approach for sharpening blur images using convolutional neural networks, Journal of Critical Reviews, 2020 Apr;7(7):139-48.
28. Yu CH, Ward MW, Morabito M, Ding W: Crime forecasting using data mining techniques, in 2011 IEEE 11th International Conference on Data Mining Workshops, 2011 Dec 11 (pp. 779-786), IEEE.
29. Shekhar S, Evans MR, Kang JM, Mohan P: Identifying patterns in spatial information: A survey of methods, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2011 May;1(3):193-214.
30. Sharma H, Kumar S: A survey on decision tree algorithms of classification in data mining, International Journal of Science and Research (IJSR), 2016 Apr 5;5(4):2094-7.
31. Yerpude P: Predictive modelling of crime data set using data mining, International Journal of Data Mining & Knowledge Management Process (IJDKP), 2020 Jul 21;7.
32. Almanie T, Mirza R, Lor E: Crime prediction based on crime types and using spatial and temporal criminal hotspots, arXiv preprint arXiv:1508.02050, 2015 Aug 9.
33. Mahmud S, Nuha M, Sattar A: Crime rate prediction using machine learning and data mining, in Soft Computing Techniques and Applications: Proceedings of the International Conference on Computing and Communication (IC3 2020), 2021 (pp. 59-69), Springer Singapore.
34. Prathap BR, Krishna AV, Balachandran K: Crime analysis and forecasting on spatio temporal news feed data—an Indian context, in Artificial Intelligence and Blockchain for Future Cybersecurity Applications, 2021 May 1 (pp. 307-327), Cham: Springer International Publishing.
35. Tayal DK, Jain A, Arora S, Agarwal S, Gupta T, Tyagi N: Crime detection and criminal identification in India using data mining techniques, AI & Society, 2015 Feb;30:117-27.
36. Hou M, Hu X, Cai J, Han X, Yuan S: An integrated graph model for spatial–temporal urban crime prediction based on attention mechanism, ISPRS International Journal of Geo-Information, 2022 Apr 30;11(5):294.
37. Ilhan F, Tekin SF, Aksoy B: Spatio-temporal crime prediction with temporally hierarchical convolutional neural networks, in 2020 28th Signal Processing and Communications Applications Conference (SIU), 2020 Oct 5 (pp. 1-4), IEEE.
38. Meskela TE, Afework YK, Ayele NA, Teferi MW, Mengist TB: Designing time series crime prediction model using long short-term memory recurrent neural network, International Journal of Recent Technology and Engineering (IJRTE), 2020;9:402-5.
39. Hussain FS, Aljuboori AF: A crime data analysis of prediction based on classification approaches, Baghdad Science Journal, 2022 Oct 1;19(5):1073.
40. Ahishakiye E, Taremwa D, Omulo EO, Niyonzima I: Crime prediction using decision tree (J48) classification algorithm, International Journal of Computer and Information Technology, 2017 May;6(3):188-95.
41. Bogomolov A, Lepri B, Staiano J, Oliver N, Pianesi F, Pentland A: Once upon a crime: towards crime prediction from demographics and mobile data, in Proceedings of the 16th International Conference on Multimodal Interaction, 2014 Nov 12 (pp. 427-434).
42. El Bour HA, Ounacer S, Elghomari Y, Jihal H, Azzouazi M: A crime prediction model based on spatial and temporal data, Periodicals of Engineering and Natural Sciences, 2018 Nov 24;6(2):360-4.
43. Mahmud S, Nuha M, Sattar A: Crime rate prediction using machine learning and data mining, in Soft Computing Techniques and Applications: Proceedings of the International Conference on Computing and Communication (IC3 2020), 2021 (pp. 59-69), Springer Singapore.
44. Almuhanna AA, Alrehili MM, Alsubhi SH, Syed L: Prediction of crime in neighbourhoods of New York City using spatial data analysis, in 2021 1st International Conference on Artificial Intelligence and Data Analytics (CAIDA), 2021 Apr 6 (pp. 23-30), IEEE.

Note: All the figures and tables in this chapter were designed by the author.

77. PCACSO Feature Selection for Prediction of Breast Cancer NAC Response

Susmitha Uddaraju1, Assistant Professor, Department of Artificial Intelligence, Shri Vishnu Engineering College for Women, Vishnupur, Bhimavaram, West Godavari Dist., AP, India
G. P. Saradhi Varma2, Professor, Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation (Deemed to be University), Vaddeswaram, Guntur Dist., AP, India
I. Hemalatha3, Professor, Department of Information Technology, S. R. K. R. Engineering College, Bhimavaram, India
1susmithauddaraju@gmail.com, 2gpsvarma@gmail.com, 3indukurihemalatha@gmail.com
DOI: 10.1201/9781003529231-77

Abstract: For most patients with breast cancer who have had neo-adjuvant chemotherapy, surgery remains the preferred course of treatment because evaluating pCR (pathologic complete response) is highly difficult. However, early prediction made possible by technological advancements means that patients can now receive the appropriate treatment sooner. Thanks to advanced techniques and the availability of large volumes of data, accurate and timely prediction is now possible. The goal of this research is to assess how well a number of recently developed machine learning algorithms perform in forecasting the NAC response in breast cancer. Principal component analysis with cuckoo search optimisation (PCACSO) is incorporated into five popular classifiers (random forest, decision tree, naive Bayes, K-nearest neighbour, and support vector machine) to enhance the prediction model's accuracy; the corresponding figures are 87%, 82.83%, 74.23%, 71.7%, and 78.41%. Without PCACSO feature selection, the same classifiers achieved accuracies of 81.8%, 56.06%, 43.9%, 71.21%, and 76.78%, respectively. The proposed PCACSO model can be used to improve most classification methods by utilising feature selection techniques to reduce the number of features. Certain features have a greater bearing and influence on the results generated by the classification algorithms than others.
Keywords: Breast cancer, NAC, Feature selection, Random forest, Decision tree, Naïve Bayes, K-nearest neighbor, PCA, Cuckoo search optimization

1. Introduction

Historically, surgeons have treated operable cancer with adjuvant therapy, radiation, and surgery combined in a tri-modal approach. Since 2001, the findings of randomised clinical studies demonstrating the superiority of neoadjuvant therapy over adjuvant therapy have changed the game. Neoadjuvant chemotherapy (NAC) before surgery limits the amount of native surgery by enabling treatment monitoring and cancer downstaging [1], [4]. Both the sentinel lymph node biopsy and lumpectomy, a breast-conserving procedure, are now possible thanks to their practical application. Complete dissection and ablation of the axillary lymph nodes were formerly mandated by the World Health Organisation.


A pathologic complete response (pCR), or the absence of breast cancer, is the ultimate goal of NAC [2]. A pCR could lead to improved rates of overall and disease-free survival. If pCR can be ascertained non-invasively, surgeons can avoid surgery; however, currently, surgery is still necessary to confirm a pCR following NAC [3]. At present, it is advised to use breast magnetic resonance imaging before and after NAC because it has higher pCR detection accuracy compared to physical examination, diagnostic procedures, and ultrasound [5], [6]. The reported sensitivity of MRI for pCR varies considerably; a meta-analysis indicated a combined sensitivity of 64%, which is insufficient to rule out tissue confirmation and surgery. When combined with machine learning algorithms, the data from high-resolution breast MRI scans may provide extremely accurate, non-invasive cancer response detection approaches [7]. Computer-assisted qualitative image analysis in the discipline of computational biology known as "radiomics" identifies traits that are unseen to the human eye, augmenting visual evaluation [9]. These techniques are compatible with machine learning and have shown promise for non-invasively identifying therapeutic responses in breast and other cancers. Consequently, our goal was to develop and evaluate a radiomics biomarker that may classify cancer pCR on MRI following NAC [8].

2. Related Work

2.1 Dimensionality Issues

Burgess talks about a method called "dimensional reduction," which maps data to a low-dimensional space by lowering its informative variance [11]. This finds the topological space where the data exists. Two steps are involved in implementing dimensionality reduction: feature extraction and feature selection. The process of feature extraction involves going through a dataset in search of unnecessary, irrelevant, or duplicated measurement features and then removing them [12], [14]. In order to construct reliable learning models, feature selection enables the discovery and removal of as much redundant and unnecessary data as is practically feasible. The end result is an improved model with reduced processing times and expenses [10]. Numerous earlier investigations, especially those involving healthcare data, have made use of the feature selection technique. Our classification of feature selection algorithms is as follows: filters, wrappers, and embedding techniques [13]. Burgess described dimensional reduction as the act of reducing the number of dimensions of a body of knowledge in order to extract useful information while eliminating irrelevant details [15], [17]. One subset of dimensional reduction is reduction and selection techniques, while another is instance selection. Feature selection is picking out a set of pertinent attributes that may be used to construct a model, while instance reduction involves decreasing the number of orthogonal instances in a dataset with the goal of improving classification accuracy [16]. While these ancillary factors do not impact the categorization process, they may lower the findings' accuracy. Feature selection is a useful tool for removing irrelevant, noisy, or redundant features that are not adding anything useful to a model. Therefore, rather than trying to use every possible taxonomy choice, it is easier to zero in on those that are really suitable and useful [18]. By reducing the overall number of possible outcomes, this simplifies the model and makes it more readable. By integrating feature selection into healthcare knowledge, we can decrease the number of tests required for a diagnosis, which in turn saves patients both time and money on testing. There are three main types of traditional feature-selection methods: embedding, wrapper, and filtering strategies [20]. Despite the abundance of feature selection algorithms developed for the healthcare industry, there is a dearth of studies on breast cancer pCR analysis. This research aims to fill that need by developing a breast cancer pCR prediction system that integrates feature selection and optimisation techniques [19]. To alleviate the problem of high-dimensional data in breast cancer risk factors and improve prediction accuracy, we developed a hybrid approach that combines training- and application-time reduction with a selection model [20].

2.2 Machine Learning in Healthcare

Experts in many domains, including business, medicine, and research, face the challenge of deciphering massive and intricate datasets [21, 22]. The ability to understand and make sense of massive datasets is therefore in high demand across many industries, including business, research, and medicine. The capacity to derive actionable insights from this deluge of data is paramount in the modern, competitive business environment. Data mining standards are based on artificial intelligence, statistics, computing, and probability [24]. Two major types of data processing models are descriptive and predictive models. The use of predictive models is common in supervised learning, when attempting to forecast the present or future values of critical variables; when it comes to finding potentially intelligible data patterns, unsupervised learning functions depend on interpretive models [23]. Classification in the healthcare industry is done using statistical approaches such as swarm intelligence, decision trees, logistic regression, ANNs, SVMs, k-nearest neighbour, association rules, and genetic classifiers [25], [26].

3. Methodology

3.1 Data Collection

We used the American Oncology Institute's breast cancer data to identify potential independent and dependent variables during the data selection phase. There are 221 cases and 12 characteristics in the Breast Cancer Prognostic Dataset.

3.2 Data Preprocessing

The process begins with converting the float and string variables to integers. We excluded patient ID and date as irrelevant variables for analysis. We used the variables' correlations to identify features with a correlation probability higher than 95%. The data increased to 465 cases after using the SMOTE approach to address the class imbalance problem.

Fig. 77.1 Proposed methodology

i. Data Cleaning

Standard Scaler Normalization: To obtain a standard scaled distribution with zero mean and unit standard deviation, the data must first be transformed. The transformation is performed feature-wise: for multivariate data, it is applied separately to each column. The standard score is obtained by subtracting the mean of the dataset (or, in the multivariate case, of the feature) from each value and dividing the result by the standard deviation. The standard score z of a sample x is calculated as:

z = (x – u) / s (1)

where u is the sample mean and s is the standard deviation.

ii. Data Splitting

We created a training dataset and a test dataset by dividing the data during the data splitting phase. As a general rule, 70% of the data is used for training and 30% for testing [61]. Data splitting is used to avoid overfitting the model when evaluating it on the testing dataset.

Evaluation of Classifications without Feature Selection: At this stage, we built four models using the classifiers naive Bayes, random forest, decision tree, and KNN, and compared their relative outcomes, which came out to 43.9%, 81.8%, 56.6%, and 71.21%, respectively.

Applying Principal Component Analysis for Classification: At this stage, we used principal component analysis (PCA) to lower the data's dimensionality with the classifiers naive Bayes, random forest, decision tree, and KNN, and compared their respective results, which came out at 71.2%, 82%, 81.59%, and 73%.

Assessing Categories with PCACSO: At this stage, we used the PCA with Cuckoo Search Optimisation (PCACSO) hybrid feature selection model, and compared the results with the classifiers naive Bayes, random forest, decision tree, and KNN, which yielded 60.73%, 87%, 82.82%, and 74.23%, respectively.
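As a sketch of the preprocessing described above (feature-wise z-score normalization and a 70/30 train-test split), the following assumes scikit-learn; the data matrix here is a random placeholder rather than the AOI dataset.

# Preprocessing sketch: z = (x - u) / s per feature, then a 70/30 split.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.random.rand(221, 12)          # placeholder for the 221 x 12 dataset
y = np.random.randint(0, 2, 221)     # placeholder pCR labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

# Fit the scaler on training data only, then apply the column-wise z-score.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

Fitting the scaler on the training split alone keeps the test data from leaking into the normalization statistics, which matches the purpose of data splitting stated above.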
Table 77.1 3 x 3 confusion matrix and classification metrics

Actual \ Predicted | Class 0 | Class 1 | Class 2 | False negative | Recall
Class 0 | P00 | P10 | P20 | P10+P20 | P00/(P00+P10+P20)
Class 1 | P01 | P11 | P21 | P01+P21 | P11/(P11+P01+P21)
Class 2 | P02 | P12 | P22 | P02+P12 | P22/(P02+P12+P22)
False positive | P01+P02 | P10+P12 | P20+P21 | |
Precision | P00/(P00+P01+P02) | P11/(P11+P10+P12) | P22/(P20+P21+P22) | |
Overall accuracy = (P00+P11+P22)/(P00+P01+P02+P10+P11+P12+P20+P21+P22)
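The per-class quantities in Table 77.1 can be computed directly from a 3 x 3 confusion matrix; the minimal sketch below assumes numpy and uses illustrative counts.

# cm[actual][predicted]; e.g. cm[0][1] = class-0 samples labelled class 1.
import numpy as np

cm = np.array([[50, 3, 2],
               [4, 45, 6],
               [1, 5, 40]])

recall = cm.diagonal() / cm.sum(axis=1)      # per actual class (row sums)
precision = cm.diagonal() / cm.sum(axis=0)   # per predicted class (column sums)
accuracy = cm.diagonal().sum() / cm.sum()    # overall accuracy

print("recall:", recall)
print("precision:", precision)
print("accuracy:", accuracy)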

Comparative Analysis

At this stage, we use PCA and PCACSO feature selection to assess the efficacy of several classifiers, including KNN, GNB, DT, and RF. Our analysis determined that all samples met the criteria for this study. The confusion matrix, which details the recommended classifier's actual and predicted classifications, is one way to assess a prediction model's accuracy. An oncologist validated and assessed the suggested model to ensure the accuracy of the prediction model and classification. We also used ROC curves to examine the confusion matrix and the representations of the categorical measures. Assessments of sensitivity, specificity, positive and negative predictive value, and classification accuracy are all part of this process. In order to ensure the correctness, precision, and efficacy of the suggested strategy, we compared it to data mining techniques.

The variable P00 represents the correctly identified predictions pertaining to class 0; this is the true positive count for class 0. P10 displays the frequency of incorrectly labelling class 0 variables as class 1. The number of instances where class 0 predictions are mistakenly labelled as class 2 is shown by P20. Variable P01 represents the misclassification of class 1 as class 0. P11 represents the number of times class 1 variables were correctly classified as class 1; this is the true positive count for class 1. The number of class 1 predictions that were incorrectly labelled as class 2 is represented by variable P21. Variable P02 indicates the count of class 2 predictions that the model incorrectly identified as class 0. P12 shows how many class 2 variables were wrongly predicted as class 1. P22 represents the number of correctly detected class 2 predictions; this is the true positive count for class 2.

5. Dataset Description

Studying the aforementioned approaches and procedures, this research made use of the American Oncology Institute's (AOI) dataset, which includes 216 examples with 12 attributes. The characteristics and areas covered by the dataset are listed in Table 77.2.

Table 77.2 Dataset description
Variable | Description
AGE | Patient age
ERPOS | Estrogen receptor status
PGRPOS | Progesterone receptor status
HR POS | Hormone receptor status
HER2MOSTPOS | Her2 status
HR_HER2_CATEGORY | 3-level HR/Her2 category pre-treatment
BILATERALCA | Whether the patient has bilateral breast cancer
LATERALITY | Index tumor laterality (left or right)
MRI 1 | MRI LD Baseline
MRI 2 | MRI LD 1-3d AC
MRI 3 | MRI LD InterReg
MRI 4 | MRI LD PreSurg

6. Classification Algorithm Descriptions

For the purpose of predicting breast cancer pCR, this study compared three well-known [61] prediction-model classification algorithms: naive Bayes, support vector machines, and K-nearest neighbour. In the paragraphs that follow, we provide a brief description of each algorithm.

Naive Bayes Algorithm
Bayesian classification is a statistical, supervised learning approach. It works on the premise of a probabilistic model and lets us capture, in principle, the uncertainty relative to the model through event probability calculations. Diagnostic and predictive issues can be addressed with its help. Thomas Bayes (1702–1761) proposed Bayes' theorem [27] [28], which served as the basis for this classification. Using a combination of prior knowledge and observed data, Bayesian classification offers effective learning approaches. It provides a useful framework for understanding and evaluating various learning techniques, computes transparent probabilities for hypotheses, and is robust against noise in the input data.

K-Nearest Neighbors Algorithm
The efficacy of merging feature selection and classification algorithms in breast cancer prognosis prediction is the subject of this research. Feature selection strategies that aim to reduce the number of features can increase the performance of most classification algorithms, according to our suggestion. When it comes to the outcomes of classification algorithms, some features are far more influential than others. Using five popular classification algorithms (Gaussian naive Bayes, support vector machines, K-nearest neighbour, random forest, and decision tree), we present our testing results. We also looked at how Cuckoo Search Optimisation (CSO), a feature selection method, affected these algorithms. In summary, Random Forest outperformed the other methods regardless of the use of PCACSO; however, when PCACSO was included, the performance of the other four methods also improved. Evaluating more complex algorithms to enhance accuracy will be a part of future studies, with experiments on cluster methods and ensemble algorithms.

Decision Tree
Decision trees, a classification method, use recursive splitting of the instance space. By associating data about specific nodes with their expected values, a decision tree builds a model for making predictions in the future. The leaves of the tree structure represent the class labels, while the branches reflect the feature combinations that lead to those labels [20].
MRI 4 MRI LD PreSurg structure represent the class labels, while the branches reflect
the feature combinations that lead to category labels [20].

7. Feature Selection Algorithm Descriptions

Principal Component Analysis (PCA): PCA is a well-known approach for reducing data dimensionality. Dimensionality reduction is a technique for reducing the number of statistical variables from a large set to a smaller set without significantly losing useful information [29]; it is carried out by forming an orthogonal basis. The basic steps of the PCA algorithm are as follows.

Data Input: Select the full dataset during the initial phase. Consider a scenario where the feature space has d-dimensional samples but no output labels. The input data X1,1, X1,2, …, Xm,n form a matrix of dimensions m × n:

InputData(m×n) = [ X11 X12 … X1N ; X21 X22 … X2N ; … ; XM1 XM2 … XMN ] (1)

Compute the N-dimensional mean vector using the equation given below:

Xmean = (1/N) Σ (i = 1 to N) Xi (2)

Compute the covariance matrix for the dataset as given below:

Cov = [ Cov11 Cov12 … Cov1N ; Cov21 Cov22 … Cov2N ; … ; CovN1 CovN2 … CovNN ] (3)

Calculate the eigenvalues and eigenvectors. After determining the eigenvalues, sort the corresponding eigenvectors in decreasing order of eigenvalue. During component selection and feature vector formation, we choose the k eigenvectors with the largest eigenvalues from the sorted eigenvectors to form a W matrix with dimensions d × k, in which every column stands for a unique eigenvector.

The generated d × k eigenvector matrix W transforms the samples into a new subspace, which is then employed for the formation of principal components.
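The PCA steps above map directly onto a few lines of numpy; the sketch below is illustrative, with a placeholder data matrix and an assumed choice of k.

# PCA sketch following steps (1)-(3) and the projection via W.
import numpy as np

X = np.random.rand(221, 12)              # placeholder m x n data matrix
k = 5                                    # assumed number of retained components

x_mean = X.mean(axis=0)                  # step 2: mean vector
cov = np.cov(X - x_mean, rowvar=False)   # step 3: covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues and eigenvectors

order = np.argsort(eigvals)[::-1]        # sort eigenvectors by decreasing eigenvalue
W = eigvecs[:, order[:k]]                # d x k projection matrix

principal_components = (X - x_mean) @ W  # project samples into the new subspace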
Cuckoo Search Based Feature Selection: Cuckoo search (CS) is an optimisation metaheuristic created by Yang and Deb, modelled on the breeding behaviour of cuckoo birds; the idea is based on the fact that cuckoos lay their eggs in the nests of other bird species. Here, choosing the optimal subset of features is the primary goal of cuckoo search, and host nests represent the characteristics produced by principal component analysis. Researchers set up the cuckoo search in this way: breast cancer details are entered into the pCR prognosis tool, and principal component analysis (PCA) complements feature extraction and dimensionality reduction, with PCA feature extraction comparisons used to narrow down the feature sets. The parameters are the following: k host nests, N cuckoos per nest, i the generation counter, and T the maximum number of generations. Establishing a host nest population Py (y = 1, 2, …, k) is the initial step.

While (i < T)
{
    For (y = 0; y <= N; y++)
    {
        Move cuckoo y to a new nest with a step of size L
        Calculate the fitness value Fy
        Select a nest Z randomly
        If (Fy > Fz)
            Fz = Fy
    }
    Abandon a fraction Pa of the worst nests and build new ones, keeping the best solutions
}

In this way, future generations keep the best features currently available: each generation passes the fittest nest on to the next one.
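A minimal sketch of the cuckoo search loop above applied to binary feature masks follows, assuming numpy; the bit-flip move stands in for a Lévy-flight step, and the fitness function is a toy placeholder rather than classifier accuracy.

# Toy cuckoo search over binary feature masks (illustrative fitness).
import numpy as np

rng = np.random.default_rng(42)

def fitness(mask):
    """Placeholder: score a binary feature mask (higher is better)."""
    return mask.sum() * 0.1 - abs(mask.sum() - 6)

k, T, pa, n_feat = 15, 50, 0.25, 12      # nests, generations, abandon rate, features
nests = rng.integers(0, 2, size=(k, n_feat))
scores = np.array([fitness(n) for n in nests], dtype=float)

for _ in range(T):
    for y in range(k):
        # Move a cuckoo: flip a few random bits (stand-in for a Levy step).
        new = nests[y].copy()
        flips = rng.integers(0, n_feat, size=int(rng.integers(1, 4)))
        new[flips] ^= 1
        fy, z = fitness(new), int(rng.integers(0, k))
        if fy > scores[z]:                # replace a random nest if better
            nests[z], scores[z] = new, fy
    # Abandon the worst fraction pa of nests and rebuild them randomly.
    worst = np.argsort(scores)[: int(pa * k)]
    nests[worst] = rng.integers(0, 2, size=(len(worst), n_feat))
    scores[worst] = [fitness(n) for n in nests[worst]]

print("best feature mask:", nests[np.argmax(scores)])

In the PCACSO pipeline, the placeholder fitness would be replaced by the cross-validated accuracy of a classifier trained on the PCA components selected by the mask.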

8. Experimental Results and Discussion

Here we briefly go over the experimental outcomes from all three stages (the classification assessment without feature selection, the PCA classification evaluation, and the PCACSO classification evaluation), along with a comparative analysis. Current techniques such as decision trees, random forests, K-nearest neighbours, support vector machines, and Gaussian naive Bayes are used for the classification evaluation. Fig. 77.2 shows the outcomes of the experiment. Using PCACSO for feature selection clearly improves the accuracy of the machine learning model, according to the experimental results.

Fig. 77.2 Comparison of existing and novel feature selection

9. Conclusion and Future Perspective

This study set out to investigate how breast cancer prognosis prediction is affected by combining feature selection and classification methods. We propose that most classification systems can be improved by reducing the number of features using feature selection methods. The outputs produced by classification algorithms are more heavily impacted by some characteristics than others. We tested five prominent classification techniques, with and without feature selection: support vector machines, K-nearest neighbour, decision trees, random forests, and Gaussian naive Bayes, combined with cuckoo search optimisation (CSO). In the end, Random Forest outperformed the other four approaches, particularly when PCACSO was included; even without PCACSO, Random Forest outperformed the others. Modern algorithms that aim to increase accuracy will be the focus of the next stage of this research, with experiments centred on approaches for clustering and ensembles.

References
1. RM, SwarnaPriya, et al.: Effective Feature Engineering for DNN Using Hybrid PCA-GWO for Intrusion Detection in IoMT Architecture, Computer Communications 160 (2020): 139-149.
2. Sakri, Sapiah binti, Nuraini binti Abdul Rashid, and Zuhaira Muhammad Zain: Particle Swarm Optimization Feature Selection for Breast Cancer Recurrence Prediction, IEEE Access 6 (2018): 29637-29647.
3. Uddaraju, Susmitha, and M. Narasingarao: A Survey of Machine Learning Techniques Applied for Breast Cancer Prediction, International Journal of Pure and Applied Mathematics 117.19 (2017): 499-507.
4. Cortazar, P., & Geyer, C. E.: Pathological Complete Response in Neoadjuvant Treatment of Breast Cancer, Annals of Surgical Oncology, 22(5), 1441-1446, 2015.
5. A. Bhardwaj and A. Tiwari: Breast Cancer Diagnosis Using Genetically Optimized Neural Network Model, Expert Syst. Appl., Vol. 42, pp. 4611–4620, 15 June 2015.
6. W. C. Yeh, W.-W. Chang, and Y. Y. Chung: A New Hybrid Approach for Mining Breast Cancer Pattern Using Discrete Particle Swarm Optimization and Statistical Method, Expert Syst. Appl., Vol. 36, pp. 8204–8211, May 2009.
7. B. Zheng, S. W. Yoon, and S. S. Lam: Breast Cancer Diagnosis Based on Feature Extraction Using a Hybrid of K-Means and Support Vector Machine Algorithms, Expert Syst. Appl., Vol. 41, pp. 1476–1482, March 2014.
8. Susmitha, Uddaraju: A Review of Machine Learning Frameworks for Early and Accurate Prediction of Neoadjuvant Chemotherapy Responses, European Journal of Molecular & Clinical Medicine 7.4 (2020): 1040-1050.
9. S. Şahan, K. Polat, H. Kodaz, and S. Güneş: A New Hybrid Method Based on Fuzzy-Artificial Immune System and K-NN Algorithm for Breast Cancer Diagnosis, Comput. Biol. Med., Vol. 37, pp. 415–423, March 2007.
10. Avikadutta: Stacking, https://www.geeksforgeeks.org/stacking-in-machine-learning/.
11. Himanshisingh: Advanced Ensemble Learning Technique – Stacking and Its Variants, https://www.analyticsvidhya.com/blog/2021/03/advanced-ensemble-learning-technique-stacking-and-its-variants.
12. Sun, Wei, and Jingyi Sun: Daily PM2.5 Concentration Prediction Based on Principal Component Analysis and LSSVM Optimized by Cuckoo Search Algorithm, Journal of Environmental Management 188 (2017): 144-152.
13. Uddaraju, Susmitha, G. P. Saradhi Varma, and M. R. Narasingarao: Prediction of NAC Response in Breast Cancer Patients Using Neural Network, Scalable Computing: Practice and Experience 23.4 (2022): 211-224.
14. Naik, Manoj Kumar, and Rutuparna Panda: A Novel Adaptive Cuckoo Search Algorithm for Intrinsic Discriminant Analysis Based Face Recognition, Applied Soft Computing 38 (2016): 661-675.
15. Katarya, Rahul, and Om Prakash Verma: An Effective Collaborative Movie Recommender System with Cuckoo Search, Egyptian Informatics Journal 18.2 (2017): 105-112.
16. Uddaraju, Susmitha, and M. R. Narasingarao: Predicting the Ductal Carcinoma Using Machine Learning Techniques—A Comparison, Journal of Computational and Theoretical Nanoscience 16.5-6 (2019): 1902-1907.
17. Sannasi Chakravarthy, S. R., and Harikumar Rajaguru: Comparison Analysis of Linear Discriminant Analysis and Cuckoo-Search Algorithm in the Classification of Breast Cancer from Digital Mammograms, Asian Pacific Journal of Cancer Prevention: APJCP 20.8 (2019): 2333.
18. Sudha, M. N., and S. Selvarajan: Feature Selection Based on Enhanced Cuckoo Search for Breast Cancer Classification in Mammogram Image, Circuits and Systems 7.04 (2016): 327.
19. Lavanya, D., and K. Usha Rani: Analysis of Feature Selection with Classification: Breast Cancer Datasets, Indian Journal of Computer Science and Engineering (IJCSE) 2.5 (2011): 756-763.
20. Akay, Mehmet Fatih: Support Vector Machines Combined with Feature Selection for Breast Cancer Diagnosis, Expert Systems with Applications 36.2 (2009): 3240-3247.
21. Eswar: An Enhanced and Naive Clustering Algorithm for Text Classification Based on Weight, International Journal & Magazine of Engineering, Technology, Management and Research, Dec. 2012.
22. Chen, Hui-Ling, et al.: A Support Vector Machine Classifier with Rough Set-Based Feature Selection for Breast Cancer Diagnosis, Expert Systems with Applications 38.7 (2011): 9014-9022.

23. Aalaei, Shokoufeh, et al.: Feature Selection Using Genetic Algorithm for Breast Cancer Diagnosis: Experiment on Three Different Datasets, Iranian Journal of Basic Medical Sciences 19.5 (2016): 476.
24. Susmitha, U., and D. Rajeswara Rao: Optimized Secure Confirmations Using Smart Card Evaluation in Multi Cloud Storage (2006).
25. Uppendra: Predict Early Pneumonitis in Health Care Using Hybrid Model Algorithms, Journal of Artificial Intelligence, Machine Learning and Neural Network (JAIMLNN), vol. 3, issue 03, pp. 14-26, ISSN: 2799-1172, Apr. 2023.
26. Fayanju, Oluwadamilola M., et al.: The Clinical Significance of Breast-Only and Node-Only Pathologic Complete Response (pCR) After Neoadjuvant Chemotherapy (NACT): A Review of 20,000 Breast Cancer Patients in the National Cancer Data Base (NCDB), Annals of Surgery 268.4 (2018): 591.
27. Danishad, Karikanni Kalathil A., et al.: Assessment of Therapeutic Response of Locally Advanced Breast Cancer (LABC) Patients Undergoing Neoadjuvant Chemotherapy (NACT) Monitored Using Sequential Magnetic Resonance Spectroscopic Imaging (MRSI), NMR in Biomedicine 23.3 (2010): 233-241.
28. King, Tari A., and Monica Morrow: Surgical Issues in Patients with Breast Cancer Receiving Neoadjuvant Chemotherapy, Nature Reviews Clinical Oncology 12.6 (2015): 335-343.
29. Sharma, Uma, et al.: Longitudinal Study of the Assessment by MRI and Diffusion Weighted Imaging of Tumor Response in Patients with Locally Advanced Breast Cancer Undergoing Neoadjuvant Chemotherapy, NMR in Biomedicine 22.1 (2009): 104-113.
30. Kümmel, S., J. Holtschmidt, and S. Loibl: Surgical Treatment of Primary Breast Cancer in the Neoadjuvant Setting, Journal of British Surgery 101.8 (2014): 912-924.
31. Graham, Peter J., et al.: Neoadjuvant Chemotherapy for Breast Cancer, Is Practice Changing? A Population-Based Review of Current Surgical Trends, Annals of Surgical Oncology 22.10 (2015): 3376-3382.

Note: All the figures and tables in this chapter were designed by the author.
