ICICIT 2020
S. Smys
Valentina Emilia Balas
Khaled A. Kamel
Pavel Lafata Editors
Inventive
Computation
and Information
Technologies
Proceedings of ICICIT 2020
Lecture Notes in Networks and Systems
Volume 173
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland
Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA,
School of Electrical and Computer Engineering—FEEC, University of Campinas—
UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering,
Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering,
University of Illinois at Chicago, Chicago, USA;
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering,
University of Alberta, Alberta, Canada;
Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering,
KIOS Research Center for Intelligent Systems and Networks, University of Cyprus,
Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong,
Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest
developments in Networks and Systems—quickly, informally and with high quality.
Original research reported in proceedings and post-proceedings represents the core
of LNNS.
Volumes published in LNNS embrace all aspects and subfields of, as well as new
challenges in, Networks and Systems.
The series contains proceedings and edited volumes in systems and networks,
spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor
Networks, Control Systems, Energy Systems, Automotive Systems, Biological
Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems,
Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems,
Robotics, Social Systems, Economic Systems and other. Of particular value to both
the contributors and the readership are the short publication timeframe and the
world-wide distribution and exposure which enable both a wide and rapid
dissemination of research output.
The series covers the theory, applications, and perspectives on the state of the art
and future developments relevant to systems and networks, decision making, control,
complex processes and related areas, as embedded in the fields of interdisciplinary
and applied sciences, engineering, computer science, physics, economics, social, and
life sciences, as well as the paradigms and methodologies behind them.
Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago.
All books published in the series are submitted for consideration in Web of Science.
Editors
Inventive Computation
and Information
Technologies
Proceedings of ICICIT 2020
Editors
S. Smys
RVS Technical Campus
Coimbatore, India

Valentina Emilia Balas
“Aurel Vlaicu” University of Arad
Arad, Romania
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
We are honored to dedicate this book to all
the participants and editors of ICICIT 2020.
Preface
This conference proceedings volume contains the written versions of most of the
contributions presented during the conference of ICICIT 2020. The conference
provided a setting for discussing recent developments in a wide variety of topics
including cloud computing, artificial intelligence, and fuzzy neural systems. The
conference has been a good opportunity for participants coming from various
destinations to present and discuss topics in their respective research areas.
This conference aims to collect the latest research results and applications in
computation technology, information, and control engineering. It includes a
selection of 71 papers from the 266 papers submitted to the conference from
universities and industries all over the world. All of the accepted papers were
subjected to strict peer review by two to four expert referees. The papers have been
selected for this volume because of their quality and relevance to the conference.
We would like to express our sincere appreciation to all authors for their
contributions to this book. We would like to extend our thanks to all the referees for
their constructive comments on all papers; especially, we would like to thank the
organizing committee for their hard work. Finally, we would like to thank Springer
for producing this volume.
Dr. S. Smys received his M.E. and Ph.D. degrees, both in Wireless Communication
and Networking from Anna University and Karunya University, India. His main
area of research activity is localization and routing architecture in wireless net-
works. He serves as Associate Editor of Computers and Electrical Engineering
(C&EE) Journal, Elsevier, and Guest Editor of MONET Journal, Springer. He has
served as Reviewer for IET, Springer, Inderscience and Elsevier journals. He has
published many research articles in refereed journals and IEEE conferences. He
has been General Chair, Session Chair, TPC Chair and Panelist in several con-
ferences. He is a member of IEEE and a senior member of IACSIT wireless
research group. He has been serving as Organizing Chair and Program Chair of
several international conferences, and in the Program Committees of several
international conferences. Currently, he is working as Professor in the Department
of Information Technology at RVS Technical Campus, Coimbatore, India.
been General Chair, Session Chair, TPC Chair and Panelist in several conferences
and acted as Reviewer and Guest Editor in refereed journals. His research interest
includes networks, computing and communication systems.
Dr. Pavel Lafata received his M.Sc. degree in 2007 and the Ph.D. degree in 2011
from the Department of Telecommunication Engineering, Faculty of Electrical
Engineering, Czech Technical University in Prague (CTU in Prague). He is now
Assistant Professor at the Department of Telecommunication Engineering of the
CTU in Prague. Since 2007, he has been actively cooperating with several leading
European manufacturers of telecommunication cables and optical network com-
ponents performing field and laboratory testing of their products as well as con-
sulting further research in this area. He also serves as a reviewer for many journals
with impact factor, such as the International Journal of Electrical Power & Energy
Systems, Elektronika ir Elektrotechnika, IEEE Communications Letters, Recent
Patents on Electrical & Electronic Engineering, the International Journal of Emerging
Technologies in Computational and Applied Sciences and China Communications.
1 Introduction
processors, I/O, networks, storage and servers. It is imperative that these resources
are effectively utilized in the cloud environment. With varying resource availability
and workloads, keeping up with the quality of service (QoS) and simultaneously
maintaining an effective usage of resources and system performance are critical tasks
at hand. This gives rise to issues between the cloud resource provider and the user in
maximizing resource usage effectively. Thus, a basic tenet of cloud computing is
resource allotment [2].
Some of the key issues in cloud computing have been resolved using meta-heuristic
algorithms, which hold a strong position in this research field due to their efficacy and
efficiency. Resource allocation in cloud computing has garnered a lot of attention from
the global research community, and several recent studies have highlighted the progress
made in this area. The objective of resource allocation [3] is to find an optimal and
feasible allocation scheme for a certain service. Effective resource assignment schemes
which efficiently utilize the constrained resources in the cloud environment have been
classified. Resource assignment and its concentration in distributed clouds are the chief
issues with regard to the challenges faced in the cloud paradigm. The issues extend to
resource discovery, its availability, selecting the appropriate resource, treating and
offering the resource, monitoring the resource, etc. Despite the various open issues,
the distributed cloud is promising for usage across different contexts.
Provisioning of resources is done by means of virtual machine (VM) technology
[4]. Virtual environment has the potential of decreasing the mean job response time
and executing the tasks as per the resource availability. The VMs are assigned to
the users based on the nature of the job to be executed. A production environment
involves several tasks being submitted to the cloud. Thus, the job scheduler software
should comprise interfaces for defining the workflows and/or job dependencies for
automatically executing the submitted tasks. All of the required VM images that are
needed for running the user-related tasks are preconfigured by the cloud broker and
stored in the cloud. All jobs that enter are sent into a queue. These jobs and a pool
of machines are all managed by a system-level scheduler which runs on a particular
system which also decides if new VM has to be provisioned from clouds and/or jobs
are assigned to the VMs. This scheduler runs periodically. There are five tasks
performed by the scheduler at every run: (1) forecasting the possible workloads,
(2) provisioning the required VMs beforehand from the cloud, (3) assigning tasks
to VMs, (4) releasing a VM if its billing time unit (BTU) is close to expiring and (5)
starting the required number of VMs when a lot of unassigned jobs are present.
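As an illustration of these five duties, the following is a minimal sketch under assumed data structures; the Vm class, the BTU length and the forecasting input are hypothetical and not taken from the paper:

from dataclasses import dataclass, field
from typing import List

BTU_MINUTES = 60  # assumed billing time unit (BTU) length


@dataclass
class Vm:
    busy: bool = False
    minutes_in_btu: int = 0  # time already consumed inside the current BTU


@dataclass
class CloudState:
    vms: List[Vm] = field(default_factory=list)
    job_queue: List[str] = field(default_factory=list)  # waiting job identifiers


def scheduler_tick(state: CloudState, forecast_jobs: int) -> None:
    """One periodic run of the system-level scheduler (tasks 1-5 in the text)."""
    # (1) forecast the possible workload (here the forecast is supplied directly)
    # (2) provision the required VMs beforehand from the cloud
    idle = sum(1 for vm in state.vms if not vm.busy)
    for _ in range(max(0, forecast_jobs - idle)):
        state.vms.append(Vm())

    # (3) assign queued jobs to idle VMs
    for vm in state.vms:
        if not vm.busy and state.job_queue:
            state.job_queue.pop(0)
            vm.busy = True

    # (4) release idle VMs whose billing time unit is close to rolling over
    state.vms = [vm for vm in state.vms
                 if vm.busy or vm.minutes_in_btu < BTU_MINUTES - 5]

    # (5) start more VMs when many unassigned jobs are still waiting
    for _ in state.job_queue:
        state.vms.append(Vm())


state = CloudState(job_queue=["job-1", "job-2", "job-3"])
scheduler_tick(state, forecast_jobs=2)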
The progress in virtualization and distributed computing for supporting the cost-
effective utilization of computing resources is the basis for cloud computing. There
is an emphasis on the scalability of resources as well as services-on-demand. Based
on the business requirements, the resources can be scaled up or scaled down in cloud
paradigm. There are issues in on-demand resource allocation, as the needs of the
customers need to be factored in. Resources are provisioned on demand, and thus,
based on the features of the task that requires the resources, the VMs are accordingly
assigned. The execution of jobs of greater priority must not be delayed due to
low-priority jobs. This kind of context can cause resource access
contention between jobs of low priority and high priority. The chief input to our
allocation is the information contained in the resource request tuple. Nonetheless,
the benefit of DCloud's resource allotment algorithm for effectively using the cloud
resources is severely undercut if a selfish user keeps declaring short deadlines, since
this adversely affects the balance of VM and bandwidth usage [5]. A job-based as
well as strategy-proof charging mechanism has been formulated for DCloud. This
allows the users to honestly declare deadlines so that their costs are minimized.
There are several meta-heuristic algorithms [6] in use. Many new variations are
often proposed for resource assignment in several fields. Several meta-heuristic algo-
rithms that are popular in the cloud computing arena include firefly algorithm (FA),
league championship algorithm (LCA), immune algorithm (IA), harmony search
(HS), cuckoo search (CS), differential evolution (DE), memetic algorithm (MA) and
ant colony optimization (ACO), among others [7]. The artificial fish swarm algorithm
(AFSA) has several benefits, including its global search ability and strong robustness.
It also has rapid convergence and better precision in global search. For enabling
flexible as well as effective usage of resources in the data centres, DCloud
influences the deadlines of the cloud computing jobs. In this work, an FSA-based
optimization algorithm is proposed for minimizing the overall workflow execution
cost while meeting deadline constraints. The rest of the paper is organized as follows:
Sect. 2 briefly reviews the literature related to this work, Sect. 3 presents the
techniques used in the methodology, Sect. 4 discusses the results, and Sect. 5
concludes the work.
2 Literature Survey
On the basis of evaluating the job traits, Saraswathi et al. [2] focussed on the assign-
ment of VM resources to the user. The objective here is that jobs of low importance
(whose deadlines are high) should not affect the execution of highly important jobs
(whose deadlines are low). The VM resources have to be allocated dynamically for
a user job within the available deadline. Resource and deadline-aware Hadoop job
scheduler (RDS) has been suggested by Cheng et al. [8]. This takes into consideration
the future availability of resources, while simultaneously decreasing the misses in the
job’s deadline. The issue of job scheduling is formulated as an online optimization
problem. This has been solved using an effective receding horizontal control algo-
rithm. A self-learning prototype has been designed for estimating the job completion
times for aiding the control. For predicting the availability of resources, a simple, yet
an effective model has been used. Open-source Hadoop implementation has been
used for implementing the RDS. Analysis has been done considering the varying
benchmark workloads. It has been shown via experimental outcomes that usage of
RDS decreases the penalty of missing the deadline by at least 36% and 10% when
compared, respectively, with fair scheduler and EDF scheduler.
Cloud infrastructure permits the simultaneous demand of the cloud services to the
active users. Thus, effective provisioning of resources for fulfilling the user require-
ments is becoming imperative. When resources are used effectively, they cost lesser.
In the virtualization scheme, the VMs are the resources which can map the incoming
user requests/ tasks prior to the execution of the task on the physical machines. An
analysis of the greedy approach algorithms for effectively mapping the tasks to the
virtual machines and decreasing the VM usage costs has been explored by Kumar
and Mandal [9]. The allocation of resources is crucial in several computational areas
like operating systems and data centre management. As per Mohammad et al. [10],
resource allocation in cloud-based systems involves assuring the users that their
computing requirements are totally and appropriately satisfied by the cloud server
set-up. The efficient utilization of resources is paramount to the context of servers
that provide cloud services, so that maximum profit is generated. This leads to the
resource allocation and the task scheduling to be the primary challenges in cloud
computing.
The review of the AFSA algorithm has been presented by Neshat et al. [11]. Also
described is the evolution of the algorithm including its improvisations as well as
its combinations with several methods. Its applications are also delineated. Several
optimization schemes can be used in combination with the suggested scheme which
may result in improvisation of performance of this technique. There are, however,
some drawbacks including high complexity of time, absence of balance between the
local search and the global search and also the lack of advantage from the experiences
of the members of the group for forecasting the movements. The deadline-aware
two-stage scheduling proposed by Raju et al. [12] schedules the VMs for the requested
jobs submitted by users. Every job is specified to need two types of VMs in a sequential
manner for completing its respective tasks. This prototype takes into consideration the
deadlines with regard to the reply time and the waiting time, and it allocates the VMs
as resources for the jobs that require them, based on the processing time and the jobs
scheduling. This prototype has been evaluated using a simulation environment by
considering the analysis of several metrics like deadline violations, mean turnaround
time and mean waiting time. It has been contrasted with first come first serve (FCFS)
and shortest job first (SJF) scheduling strategies. In comparison with these schemes,
the suggested prototype has been shown to decrease these evaluation metrics by a
constant factor.
From the CSP’s perspective, the issue of global optimization of the cloud system
has been addressed by Gao et al. [13]. It takes into consideration lowering of oper-
ating expenses by maximizing the energy efficiency and simultaneously fulfilling
the user-defined deadlines in the service-level agreements. For the workload to be
modelled, viable approaches should be considered for optimizing cloud operation.
There are two models that are currently available: batch requests and task graphs with
dependencies. The latter method has been adopted. This micro-managed approach to
workloads allows the optimization of energy as well as performance. Thus, the CSP
can meet the user deadlines at lesser operational expenses. Yet, some added efforts
3 Methodology
Most work in the literature focusses on job completion time or job deadline along
with bandwidth constraints and VM capabilities. The challenge, however, is to map
the deadline against the job completion time so that deadlines are met with minimum
cost and job completion time. A novel allocation algorithm which benefits from the
added information in the resource request has been formulated. It is based on two
schemes: time sliding and bandwidth scaling. In time sliding, a delay between job/task
submission and execution is permitted. This helps in smoothening the peak
demand on the cloud and decreasing the number of excluded users at busy intervals,
whereas in bandwidth scaling, dynamic adaptation of the bandwidth assigned to the
VMs is allowed. Deadline, greedy-deadline and FSA-based deadlines are detailed in
this section.
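A rough sketch of the two schemes is given below (hypothetical function names and units, not the authors' implementation): time sliding delays a job's start within the window its deadline allows, while bandwidth scaling picks the smallest bandwidth that still meets the deadline.

def time_slide(submit_time: float, exec_time: float, deadline: float,
               busy_until: float) -> float:
    """Return a start time that still meets the deadline, sliding the job
    past the current busy (peak-demand) interval when possible."""
    latest_start = deadline - exec_time
    start = max(submit_time, busy_until)          # delay start until the peak has passed
    if start > latest_start:
        raise ValueError("deadline cannot be met even with sliding")
    return start


def scale_bandwidth(data_volume: float, start: float, deadline: float,
                    max_bandwidth: float) -> float:
    """Return the smallest bandwidth (e.g. MB/s) that moves data_volume
    before the deadline, capped at the link capacity."""
    needed = data_volume / max(deadline - start, 1e-9)
    if needed > max_bandwidth:
        raise ValueError("deadline cannot be met even at full bandwidth")
    return needed


# Example: a 120 MB transfer submitted at t=0 with deadline t=60 s,
# while the cluster is busy until t=20 s.
start = time_slide(submit_time=0.0, exec_time=30.0, deadline=60.0, busy_until=20.0)
bw = scale_bandwidth(data_volume=120.0, start=start, deadline=60.0, max_bandwidth=10.0)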
3.1 Deadline
In cloud computing, for fulfilling a user’s task, a job needs cloud resources. One of
the available models for resource scheduling is deadline-aware two-stage scheduling
model. The cloud resources are present as virtual machines. After scheduling the
given n job requests, the scheduler allocates the needed cloud resources/VMs for
every job that requests for it [12]. The scheduler, on receiving the n jobs from different
users, allocates the VMs as resources by means of job scheduling, in deadline-aware
two-stage scheduling. In this prototype, a job needs several VMs of various types
sequentially for task completion.
The overall workflow deadline is distributed across the tasks. A part of the deadline
is allocated to every task based on the VM which is the most cost-effective for that
particular task.
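One plausible way to apportion the overall workflow deadline is sketched below, assuming each task's share is proportional to its execution time on its most cost-effective VM; the paper does not spell out the exact rule, so this is an illustration only.

def distribute_deadline(workflow_deadline: float, cheapest_exec_times: list) -> list:
    """Split a workflow deadline into per-task sub-deadlines in proportion
    to each task's execution time on its most cost-effective VM type."""
    total = sum(cheapest_exec_times)
    shares = [t / total * workflow_deadline for t in cheapest_exec_times]
    # cumulative sub-deadlines, measured from workflow submission
    sub_deadlines, elapsed = [], 0.0
    for share in shares:
        elapsed += share
        sub_deadlines.append(elapsed)
    return sub_deadlines


# Example: three sequential tasks that take 10, 30 and 20 time units
# on their cheapest VMs, with an overall deadline of 120 units.
print(distribute_deadline(120.0, [10.0, 30.0, 20.0]))   # [20.0, 80.0, 120.0]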
3.2 Greedy-Deadline
Resource allocation algorithms (RAAs) that follow a greedy approach are extremely
well suited for dynamic and heterogeneous cloud resource environments. These are
linked to a process scheduler by means of cloud communication [9]. For addressing
the issues of task scheduling, the greedy approach for optimized profit is effective.
The greedy-deadline resource allocation algorithm [16] can be explained as follows:
1. The input is the incoming virtual machine (VM) request.
2. Every resource in the resource cache is checked to see if it is in suspended state
or waking state. If yes, then the remaining capacity of the resource is found and
checked.
3. The remaining capacity of the resource is found if it is in the sleeping state.
4. The function is processed to obtain the resource from the cache.
The priorities of the incoming tasks are evaluated, and the newly allocated priority
is compared with the previously allocated ones. This is followed by assigning the
tasks into the previously formulated priority queues. After allocation of the tasks,
tasks in the high-priority queues are selected and are executed. This is followed by
the transfer of tasks from medium-priority queues to high-priority queues. Thus, the
remaining tasks in the queues are executed until the queue has been exhausted of all
tasks.
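A compact sketch of the greedy-deadline idea described above follows; the data structures, the 'state' values and the deadline-as-priority ordering are assumptions for illustration, not the exact algorithm of [16].

import heapq

def greedy_deadline_allocate(tasks, resources):
    """tasks: list of (deadline, demand) tuples; a tighter deadline means higher priority.
    resources: list of dicts with keys 'state' ('waking'/'suspended'/'sleeping') and
    'remaining'. Returns a list of (task, resource_index) assignments."""
    # priority queue keyed on deadline: high-priority (tight-deadline) tasks come out first
    heap = [(deadline, demand) for deadline, demand in tasks]
    heapq.heapify(heap)

    assignments = []
    while heap:
        deadline, demand = heapq.heappop(heap)
        # greedily pick the first resource in the cache with enough remaining capacity,
        # checking active (waking/suspended) resources before sleeping ones
        ordered = sorted(range(len(resources)),
                         key=lambda i: resources[i]['state'] == 'sleeping')
        for i in ordered:
            if resources[i]['remaining'] >= demand:
                resources[i]['remaining'] -= demand
                assignments.append(((deadline, demand), i))
                break
    return assignments


tasks = [(50, 2), (10, 1), (30, 3)]
pool = [{'state': 'waking', 'remaining': 4}, {'state': 'sleeping', 'remaining': 4}]
print(greedy_deadline_allocate(tasks, pool))   # [((10, 1), 0), ((30, 3), 0), ((50, 2), 1)]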
3.3 FSA-Based Deadline

Another population-based optimizer is the AFSA. Initially, the process begins with
a randomly generated set of probable solutions. An iterative search is performed
for obtaining an optimum solution. Artificial fish (AF) [17] refers to the fictitious
entity which is used for analysing and explaining the problem. It may be understood
through the concept of animal ecology. An object-oriented analytical scheme has been
employed, considering the artificial fish as an object enclosed in its own data and functions.
x_next = X + ((X_v − X) / ‖X_v − X‖) · Step · rand()     (2)
where rand() is a random number between 0 and 1, Step is the step length, x_i is the
optimization variable and n is the number of variables. There are two components
included in the AF model—variables and functions. The variables are as follows: the
current position of the AF is denoted by X, the moving step length is denoted by 'Step',
the visual distance is denoted by 'Visual', the try number is given by try_number,
and the crowd factor, whose value is between 0 and 1, is given by δ. The functions
include the behaviours of the AF: preying, swarming, following, moving, leaping
and evaluating. The flow chart for artificial fish swarm optimization has been shown
in Fig. 1.
The FSA gets easily stuck in local optimal solutions, so an improved FSA is proposed to
avoid local optima by using an appropriate cost function for evaluating the solutions. AFSA
also finds increased usage in complex optimization fields. It offers an alternative to well-
known evolutionary computing methods and can be applied across domains.
The service is considered to be a supererogatory one that a cloud service provider
(Fig. 1 Flow chart of the artificial fish swarm optimization: start, swarming/following behaviour, criteria check, final solution)
offers, as potential tenants may deliberately submit a job run time lower than the actual
profiled outcome (anticipating that the cloud service provider will protract it by relaxing
the deadline). Practically, however, there may be job requests where the profiling error
is greater than the profiling relaxation index provided by the cloud service provider.
There are two schemes employed by the cloud provider to deal with these tasks. The
first approach is to kill the jobs at the presumed end time. The second approach is for
the cloud provider to use a small part of the cloud resources specifically for servicing
those jobs. The virtual machines associated with those jobs are moved at once to the
specific servers at their expected end times. Thereafter, they are run on a best-effort basis.
The algorithmic procedure of implementation is described below:
(1) Fishes are positioned at random on a task node. That is, each fish represents a
solution towards meeting the objectives of deadline and job completion time.
(2) Fishes choose a path to a resource node with a certain probability, determining
if the limits of the optimization model are met. If they are met, the node is
included to the list of solutions by the fish. Else, the fish goes on to search for
another node.
If X_i is the current state of a fish, a state X_j is chosen randomly within its visual
distance, and Y = f(X) is the food concentration of the fish:

X_j = X_i + af_visual · rand()     (3)
If Y_i < Y_j, then the fish moves forward a step in the direction of the vector sum
of X_j and X_best_af, where X_best_af is the best fish available:

X_i^(t+1) = X_i^t + [ (X_j − X_i^t)/‖X_j − X_i^t‖ + (X_best_af − X_i^t)/‖X_best_af − X_i^t‖ ] · af_step · rand()     (4)
Else, state X_j is chosen randomly again and checked to see whether it complies with
the forward requirement. If the forward requirement is not satisfied, then the fish moves
a step randomly; this helps to avoid local minima.
(3) Fish move arbitrarily towards the next task node for the assignment of their next
task.
(4) Assigning all the tasks is regarded as an iterative procedure. The algorithm
terminates when the number of iterations reaches its maximum.
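As a rough sketch of the moves defined by Eqs. (2)-(4), the code below implements the preying and following steps; the fitness (food) function, bounds handling and termination details are placeholders, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)

def prey_move(x_i, food, af_visual, af_step, try_number=5):
    """Preying behaviour: sample states inside the visual range (Eq. (3)) and
    step towards the first one with higher food concentration Y = f(X) (Eq. (2))."""
    y_i = food(x_i)
    for _ in range(try_number):
        x_j = x_i + af_visual * rng.uniform(-1.0, 1.0, size=x_i.shape)   # Eq. (3)
        if y_i < food(x_j):                       # Y_i < Y_j: x_j offers better food
            direction = (x_j - x_i) / (np.linalg.norm(x_j - x_i) + 1e-12)
            return x_i + direction * af_step * rng.random()              # Eq. (2)-style step
    # forward condition never satisfied: take a random step to escape local minima
    return x_i + af_step * rng.uniform(-1.0, 1.0, size=x_i.shape)

def follow_move(x_i, x_j, x_best, af_step):
    """Step towards the vector sum of a better neighbour and the best fish (Eq. (4))."""
    d_j = (x_j - x_i) / (np.linalg.norm(x_j - x_i) + 1e-12)
    d_b = (x_best - x_i) / (np.linalg.norm(x_best - x_i) + 1e-12)
    return x_i + (d_j + d_b) * af_step * rng.random()

# Example: maximise a toy food function (negative cost) over a 3-dimensional vector
food = lambda x: -float(np.sum(x ** 2))
x = rng.random(3)
for _ in range(200):
    x = prey_move(x, food, af_visual=0.5, af_step=0.3)
print(food(x))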
4 Results and Discussion

Table 1 displays the parameters of the FSA. Tables 2, 3 and 4 and Figs. 2, 3 and 4 show the
makespan, VM utilization and percentage of successful job completion, respectively,
for deadline, greedy-deadline and FSA-deadline.
(Fig. 2 Makespan (s) versus number of jobs for deadline, greedy-deadline and FSA-deadline)
(Fig. 3 VM utilization (%) versus number of jobs for deadline, greedy-deadline and FSA-deadline)
(Fig. 4 Successful job completion (%) versus number of jobs for deadline, greedy-deadline and FSA-deadline)
It is seen from Table 2 and Fig. 2 that the makespan for FSA-deadline performs
better by 8.7% and by 10.9% than deadline and greedy-deadline, respectively, for
200 jobs. The makespan for FSA-deadline performs better by 7.6% and by 9.6%
than deadline and greedy-deadline, respectively, for 600 jobs. The makespan for
FSA-deadline performs better by 8.1% and by 9.7% than deadline and
greedy-deadline, respectively, for 1000 jobs.
It is seen from Table 3 and Fig. 3 that the VM utilization for FSA-deadline performs
better by 3.87% and by 1.27% than deadline and greedy-deadline, respectively, for
200 jobs. The VM utilization for FSA-deadline performs better by 4.94% and by
2.44% than deadline and greedy-deadline, respectively, for 600 jobs. For 1000 jobs,
the VM utilization for FSA-deadline performs better by 3.9% than deadline and
shows no change compared with greedy-deadline.
It is seen from Table 4 and Fig. 4 that the percentage of successful job completion
for FSA-deadline performs better by 4.41% and by 1.45% than deadline and
greedy-deadline, respectively, for 200 jobs. The percentage of successful job
completion for FSA-deadline performs better by 4.6% and by 2.7% than deadline and
greedy-deadline, respectively, for 600 jobs. The percentage of successful job
completion for FSA-deadline performs better by 4.28% and by 0.599% than
deadline and greedy-deadline, respectively, for 1000 jobs.
5 Conclusion
In the cloud computing paradigm, both computation and storage resources
are migrated to the "cloud." These resources can be accessed anywhere by any user,
based on demand. Judicious tuning of the optimization parameters in meta-heuristic
algorithms is needed in order to find better solutions without excessive
computational time. The artificial fish swarm algorithm (AFSA) is regarded as one
of the top optimization methods within the set of swarm intelligence algorithms.
AFSA is chosen because it has global search ability, good robustness as well as
tolerance of parameter settings. This work proposes a heuristic algorithm for
deadline-based resource allocation in the cloud using a modified fish swarm algorithm.
Outcomes have shown that the makespan for FSA-deadline performs better for
200 jobs by 8.7% and by 10.9% than deadline and greedy-deadline, respectively.
For 600 jobs, the FSA-deadline makespan is better than deadline by 7.6% and better
than greedy-deadline by 9.6%. The corresponding figures for 1000 jobs are 8.1%
and 9.7%. In future, this task could be handled by a trusted third party that is
reliable to both tenant and provider.
References
1. Wei W, Fan X, Song H, Fan X, Yang J (2016) Imperfect information dynamic Stackelberg game
based resource allocation using hidden Markov for cloud computing. IEEE Trans Serv Comput
11(1):78–89
2. Saraswathi AT, Kalaashri YR, Padmavathi S (2015) Dynamic resource allocation scheme in
cloud computing. Proc Comput Sci 47:30–36
3. Chen X, Li W, Lu S, Zhou Z, Fu X (2018) Efficient resource allocation for on-demand mobile-
edge cloud computing. IEEE Trans Veh Technol 67(9):8769–8780
4. Jin S, Qie X, Hao S (2019) Virtual machine allocation strategy in energy-efficient cloud data
centres. Int J Commun Netw Distrib Syst 22(2):181–195
5. Li D, Chen C, Guan J, Zhang Y, Zhu J, Yu R (2015) DCloud: deadline-aware resource allocation
for cloud computing jobs. IEEE Trans Parallel Distrib Syst 27(8):2248–2260
6. Madni SHH, Latiff MSA, Coulibaly Y (2016) An appraisal of meta-heuristic resource allocation
techniques for IaaS cloud. Indian J Sci Technol 9(4)
7. Asghari S, Navimipour NJ (2016) Review and comparison of meta-heuristic algorithms for
service composition in cloud computing. Majlesi J Multimedia Proces 4(4)
8. Cheng D, Rao J, Jiang C, Zhou X (2015, May) Resource and deadline-aware job scheduling
in dynamic hadoop clusters. In Parallel and Distributed Processing Symposium (IPDPS), 2015
IEEE International, pp 956–965. IEEE
9. Kumar D, Mandal T (2016, April) Greedy approaches for deadline based task consolidation
in cloud computing. In: 2016 International Conference on Computing, Communication and
Automation (ICCCA), pp 1271–1276. IEEE
10. Mohammad A, Kumar A, Singh LSV (2016) A greedy approach for optimizing the problems
of task scheduling and allocation of cloud resources in cloud environment
11. Neshat M, Sepidnam G, Sargolzaei M, Toosi AN (2014) Artificial fish swarm algorithm: a
survey of the state-of-the-art, hybridization, combinatorial and indicative applications. Artif
Intell Rev 42(4):965–997
12. Raju IRK, Varma PS, Sundari MR, Moses GJ (2016) Deadline aware two stage scheduling
algorithm in cloud computing. Indian J Sci Technol 9(4)
13. Gao Y, Wang Y, Gupta SK, Pedram M (2013, September) An energy and deadline aware
resource provisioning, scheduling and optimization framework for cloud systems. In: Proceed-
ings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Co design
and System Synthesis, p 31. IEEE Press
14. Rodriguez MA, Buyya R (2014) Deadline based resource provisioning and scheduling
algorithm for scientific workflows on clouds. IEEE Trans Cloud Comput 2(2):222–235
15. Xiang Y, Balasubramanian B, Wang M, Lan T, Sen S, Chiang M (2013, September) Self-
adaptive, deadline-aware resource control in cloud computing. In: 2013 IEEE 7th international
conference on Self-adaptation and self-organizing systems workshops (SASOW), pp 41–46.
IEEE
16. Wu X, Gu Y, Tao J, Li G, Jayaraman PP, Sun D, et al. (2016) An online greedy allocation of
VMs with non-increasing reservations in clouds. J Supercomput 72(2):371–390
17. Shen H, Zhao H, Yang Z (2016) Adaptive resource schedule method in cloud computing system
based on improved artificial fish swarm. J Comput Theor Nanosci 13(4):2556–2561
18. Li D, Chen C, Guan J, Zhang Y, Zhu J, Yu R (2016) DCloud: deadline-aware resource allocation
for cloud computing jobs. IEEE Trans Parallel Distrib Syst 27(8):2248–2260
Dynamic Congestion Control Routing
Algorithm for Energy Harvesting
in MANET
Abstract Energy harvesting (EH) is seen as the key enabling technology for the mass
deployment of mobile ad hoc networks (MANETs) for IoT applications. Effective EH
methodologies could remove the need for frequent energy source replacement,
thereby offering a near perpetual network operating condition. Advances in
EH systems have moved the design of routing protocols for EH-MANET from
"energy-aware" to "energy-harvesting-aware." In this work, a Dynamic Congestion
Control Routing Algorithm using Energy Harvesting in MANET is presented. The
performance of the Dynamic Congestion Control Routing Algorithm-based MANET
scheme is evaluated using various metrics, for instance, Energy Consumption
Ratio, Routing Overhead Ratio, and Throughput Ratio.
1 Introduction
M. M. Karthikeyan (B)
Ph.D Research Scholar, PG and Research Department of Computer Science, Hindusthan College
of Arts & Science, Coimbatore, Tamil Nadu, India
e-mail: mmk.keyan90@gmail.com
G. Dalin
Associate Professor, PG and Research Department of Computer Science, Hindusthan College of
Arts & Science, Coimbatore, Tamil Nadu, India
e-mail: profgdalin@gmail.com
node uses a "bypass" route to avoid the potential congestion region up to the
first non-congested node on the primary route. Traffic is split probabilistically over these
two routes, primary and bypass, thereby effectively reducing the chance
of a congestion event. Congestion monitoring uses various metrics
to track the congestion status of the nodes. When the number of packets arriving
at a node exceeds its carrying capacity, the node becomes congested and
starts losing packets. Chief among these metrics are the fraction of all packets
discarded for lack of buffer space, the average queue length,
the number of packets timed out and retransmitted, the average packet delay,
and the standard deviation of packet delay. In all cases, rising numbers indicate
growing congestion. Any of these strategies can work with CRP in practice [8].
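As a rough illustration (not taken from the paper) of how the congestion indicators named above could be computed at a node over a monitoring window, the field names below are assumptions:

import statistics

def congestion_indicators(stats: dict) -> dict:
    """stats is a per-node record with counters collected over a monitoring window."""
    total = max(stats["packets_arrived"], 1)
    delays = stats["packet_delays"] or [0.0]
    return {
        # fraction of packets discarded for lack of buffer space
        "drop_fraction": stats["dropped_no_buffer"] / total,
        # average queue length sampled over the window
        "avg_queue_len": statistics.fmean(stats["queue_samples"]),
        # packets timed out and retransmitted
        "retransmissions": stats["retransmitted"],
        # average packet delay and its standard deviation
        "avg_delay": statistics.fmean(delays),
        "delay_stddev": statistics.pstdev(delays),
    }

sample = {
    "packets_arrived": 400, "dropped_no_buffer": 12, "retransmitted": 9,
    "queue_samples": [3, 5, 8, 6], "packet_delays": [0.02, 0.05, 0.04, 0.09],
}
print(congestion_indicators(sample))    # rising values indicate growing congestion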
2 Literature Survey
scalable with respect to changing delays, bandwidth and the number of users sharing the
network. ACP is characterized by its learning capability, which enables the protocol to adapt
to the highly dynamic network conditions in order to maintain stability and good
performance. This learning capability is provided by a novel estimation algorithm, which
"learns" the number of flows using each link in the network [10].
Merits:
Demerits:
• The routing decision is more complex; consequently, the processing burden on
network nodes increases.
• In most cases, adaptive strategies depend on status information that is collected at one
place but used at another. There is a tradeoff here between the quality
of the information and the amount of overhead [10].
• An adaptive technique may react too quickly, causing congestion-producing
oscillation, or too slowly, becoming irrelevant.
Uddin et al. [11] proposed an energy-efficient multipath routing protocol for mobile
ad hoc networks that addresses the particular issue of energy consumption in MANET
by applying a fitness function to optimize the energy consumption in the Ad hoc
On-Demand Multipath Distance Vector (AOMDV) routing protocol. The proposed
protocol is called Ad hoc On-Demand Multipath Distance Vector with the Fitness
Function (FF-AOMDV). The fitness function is used to locate the ideal path from the
source to the destination in order to reduce the energy consumption in multipath
routing. The performance of the proposed FF-AOMDV protocol was evaluated using
Network Simulator Version 2 (NS-2), where it was compared with the AOMDV and
Ad hoc On-Demand Multipath Routing with Life Maximization (AOMR-LM)
protocols, the two most prominent protocols in this area.
Merits:
• The FF-AOMDV algorithm has performed distinctly better than both AOMR-LM
and AOMDV in throughput, packet delivery ratio and end-to-end delay.
• It performed well against AOMDV in conserving more energy and achieving a better
network lifetime.
Demerits:
• Higher energy consumption and lower network lifetime.
Demerits:
• When the system is intended to enable multi-hop communication among nodes,
or in multi-hop networks with limited infrastructure support.
Lee et al. [13] proposed a combined TDMA slot and power scheduling scheme
which improves energy efficiency (EE) considering a quality-of-service (QoS)
utility, and this scheme enhances the reliability and survivability of a UVS-based
MANET. The proposed algorithm has three stages: the Dinkelbach method,
updating the Lagrangian multiplier, and the CCCP procedure. To improve the EE,
the length of a TDMA frame is dynamically adjusted. The drawback of this protocol
is that, as the total delay extends according to the frame round, it cannot
ensure continuous transmission.
Merits:
• The proposed algorithm is validated by numerical results.
• It guarantees a minimum QoS and shows truly outstanding energy efficiency.
Demerits:
• With TDMA technology, the clients are assigned predefined time slots.
• When moving from one cell site to another, if all the time slots are currently
full, the client may be disconnected.
Jabbar et al. [14] proposed a hybrid multipath energy- and QoS-aware optimized
link state routing protocol version 2 (MEQSA-OLSRv2), which is designed to
cope with the challenges posed by constrained energy resources, mobility of nodes,
and traffic congestion during data transmission in MANET-WSN integration
scenarios of IoT systems. This protocol uses a node rank based on a
multi-criteria node rank metric (MCNR). This MCNR aggregates various
parameters related to energy and quality of service (QoS) into a single
metric to significantly reduce the complexity of handling multiple
constraints and to avoid the control overhead incurred by separately
communicating different parameters. These parameters are the node's lifetime,
residual battery energy, node's idle time, node's speed, and queue length. The MCNR
metric is used by a new link quality assessment function for multipath route
computations.
Merits:
• MEQSA-OLSRv2 avoided the selection of nodes with high mobility.
Demerits:
• Audiences complain about information overload; they can be overwhelmed
and find it annoying.
• The rapid change of technology has disrupted the group's activities.
Kushwaha et al. [15] proposed a novel solution to transfer server load from
one server to another. Energy efficiency is a vital factor
in the operation of ad hoc networks. The problems of coordinating the routing protocol
and the demanding nature of ad hoc operation may reduce the lifetime of a node
as well as the lifetime of the network.
Merits:
• Compared with networks of fixed topology, MANETs offer flexibility (an ad hoc
network can be created anywhere with mobile devices).
• Scalability (more nodes can easily be added to the network) and
lower administration costs (no need to build an infrastructure first).
Demerits:
• Mobile nodes can only form a temporary network while communicating on the move.
• The major issue with ad hoc nodes is resource constraints.
3 Proposed Work
DCCR is a unicast routing protocol for mobile ad hoc networks. It reduces network
congestion by means of decreasing unnecessary flooding of packets
and finding a congestion-free route between the source and the destination. This
section presents the overall design and a detailed evaluation of the DCCR protocol.
When a source node wants to transmit a data packet to a destination,
the DCCR protocol first builds a congestion-free set (CFS) covering both
one-hop and two-hop neighbours. Then, the source begins the route discovery
procedure using the CFS to identify a congestion-free path to the destination. If
the DCCR protocol cannot build a CFS because the network is already
congested, it cannot begin the route discovery process. However, once
a new route has been established, the transmission of data packets will continue. The
primary objective of DCCR is to find a congestion-free route between the
source and the destination. In doing so, it reduces the overhead and the flooding
of packets. The DCCR protocol contains the following components:
1. Dynamic congestion detection technique,
2. Construction of the CFS,
3. Congestion-free routing,
4. Congestion-free path discovery.
The proposed algorithm controls network congestion by means of decreasing
the useless flooding of packets and finding a congestion-free path between the
source and the destination. The EDCDCR framework first detects the congestion,
then constructs a congestion-free set (CFS) covering both one-hop and two-hop
neighbours, and the source begins the route discovery procedure using the CFS to
identify a congestion-free route to the destination. The proposed algorithm comprises
three components to detect and control congestion at the MAC layer in MANET:
1. Dynamic congestion detection,
2. CFS construction,
3. Congestion-free route discovery.
Congestion detection is based on estimation of the link stability (LS), residual
bandwidth (RB), and residual battery power (RP).
Link Stability
The link stability (LSD) is used to define a link's connection strength. In MANET,
LSD is essential for improving QoS. LSD characterizes the degree of the link
reliability: the higher the value of LSD, the higher the reliability of the connection
and the greater the expected duration of its existence. In this way, a route in which
every link has LSD > LSD_thr is feasible.
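A hedged sketch of how the three detection metrics could be combined into a congestion-free test when building the CFS is given below; the threshold values and the simple AND rule are assumptions, since the protocol's defining equations do not appear in this excerpt.

def is_congestion_free(lsd: float, rb: float, rp: float,
                       lsd_thr: float = 0.6, rb_thr: float = 0.3, rp_thr: float = 0.2) -> bool:
    """A neighbour qualifies for the congestion-free set (CFS) only if its link
    stability (LSD), residual bandwidth (RB, fraction of capacity) and residual
    battery power (RP, fraction of full charge) all exceed their thresholds."""
    return lsd > lsd_thr and rb > rb_thr and rp > rp_thr

def build_cfs(neighbours: dict) -> list:
    """neighbours maps node id -> (LSD, RB, RP) for one-hop and two-hop neighbours."""
    return [node for node, metrics in neighbours.items() if is_congestion_free(*metrics)]

print(build_cfs({"n1": (0.9, 0.5, 0.8), "n2": (0.4, 0.7, 0.9), "n3": (0.7, 0.2, 0.6)}))
# -> ['n1']  (n2 fails the LSD threshold, n3 the residual-bandwidth threshold)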
4 Experimental Results
shows a consistent outcome for the proposed novel technique. Consequently, the
proposed strategy produced a better improvement in energy consumption ratio
results and thus delivered a noteworthy improvement overall.
Routing Overhead Ratio
Figure 3 shows the comparison of the routing overhead ratio. Routing and data
packets have to share the same network bandwidth most of the time, and hence,
routing packets are regarded as an overhead in the network. The existing CODA
values generally lie between 39 and 58, and the existing CCF values generally lie
between 26.77 and 44.56, while the proposed DCCRA values lie between 66 and
85. These outcomes are simulated using the NS2 simulator. This outcome shows a
consistent result for the proposed novel technique. Consequently, the proposed
technique produced a better improvement in routing overhead ratio results and
hence a significant improvement overall.
Throughput Ratio
Figure 4 presents the comparison of the average throughput ratio. The average
throughput ratio is defined as the proportion of packets successfully received to the
total sent. The existing CODA values generally lie between 0.09 and 0.3, the
existing CCF values generally lie between 0.04 and 0.22, and the proposed DCCRA
values lie between 0.13 and 0.45. These outcomes are simulated using the NS2
simulator. This outcome shows a consistent result for the proposed novel technique.
Consequently, the proposed technique produced a better improvement in average
throughput ratio results and thus a significant improvement overall.
5 Conclusion
References
1. Arora B, Nipur (2015) An adaptive transmission power aware multipath routing protocol for
mobile ad hoc networks. Published by Elsevier
2. Divya M, Subasree S, Sakthivel NK (2015) Performance analysis of efficient energy routing
protocols in MANET. Published by Elsevier
3. Sandeep J, Satheesh Kumar J (2015) Efficient packet transmission and energy optimization in
military operation scenarios of MANET. Published by Elsevier
4. Kim D, Kim J-h, Moo C, Choi J, Yeom I (2015) Efficient content delivery in mobile ad-hoc
networks using CCN. Elsevier. https://doi.org/10.1016/j.adhoc.2015.06.007
5. Anish Pon Yamini K, Suthendran K, Arivoli T (2019) Enhancement of energy efficiency using
a transition state MAC protocol for MANET. Elsevier. https://doi.org/10.1016/j.comnet.2019.03.013
6. Taheri S, Hartung S, Hogrefe D (2014) Anonymous group-based routing in MANETs.
Elsevier Ltd. https://doi.org/10.1016/j.jisa.2014.09.002
Abstract The routing process in a mobile wireless sensor network is considered one
of the most complex tasks, as it is mainly affected by the mobility behavior
of nodes. Successful routing ensures increased network performance by sending
packets without loss. This was confirmed in the previous research work by introducing
the QoS-oriented distributed routing protocol (QOD) which measures the load level
of channels before data transmission; thus, the successful packet transmission is
ensured. However, this research method does not concentrate on predicting
mobility behavior, which would cause path breakage and network failure. This is
addressed in the proposed method by presenting a predictable
mobility-based routing scheme (PMRS) in which successful data transmission can
be guaranteed by avoiding path breakage due to mobility. In this work, node
movement will be predicted based on node direction and motion angles toward the
destination node. By predicting the node mobility in the future, it can be concluded
whether the node is nearest to the destination or not. Thus, a better route path can be
established for deploying a successful data transmission. Based on node movement,
the optimal cluster head would be selected, and thus, the shortest and reliable path
can be achieved between source and destination nodes. In this work, cluster head
selection is performed by using the genetic algorithm, which can ensure reliable
transmission of the nodes without any node failure. Finally, data transmission is done
through the cluster head node by using the time division multiple access (TDMA) method.
The proposed scheme is implemented in NS2, and the results show that this technique
provides better results than other recent schemes.
1 Introduction
rendered using genetic algorithm (GA). The CH possibly will regularly accumulate
data from the sensors or TDMA scheduling might be performed to collect the data
from the sensors [13].
The update protocol is essential for the dissemination of knowledge about geograph-
ical location and services. Limited resources, such as battery power, queuing space,
processor speed, and transmission range, are taken into account.
1. Type 1 update: A Type 1 update is generated periodically. The time between subse-
quent Type 1 updates, i.e., the update interval, remains fixed at the specified frequency.
Alternatively, the frequency of Type 1 updates may vary linearly between a
maximum (f_max) and a minimum (f_min) threshold defined for node v.
The characteristics are shown in Fig. 1.
2. Type 2 update: Type 2 updates are generated when there is a significant change
in the node's speed or direction. The mobile node can estimate the approximate location
at which it will be positioned at a given time from its current record (specifically,
from the latest update information) (Fig. 2).
Subsequently, the anticipated position (x_e, y_e) is provided by the position-prediction equations.
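The defining equations for (x_e, y_e) do not survive in this excerpt; the sketch below assumes the usual dead-reckoning form (the last reported position advanced along the reported heading at the reported speed), which is an assumption rather than the paper's exact formulation.

import math

def predict_position(x: float, y: float, speed: float, heading_rad: float,
                     t_update: float, t_predict: float) -> tuple:
    """Dead-reckoning estimate of where a node reporting (x, y, speed, heading)
    at time t_update will be at time t_predict."""
    dt = t_predict - t_update
    x_e = x + speed * math.cos(heading_rad) * dt
    y_e = y + speed * math.sin(heading_rad) * dt
    return x_e, y_e

# Example: node last seen at (10, 5) moving 2 m/s towards the north-east, predicted 3 s later
print(predict_position(10.0, 5.0, 2.0, math.pi / 4, t_update=0.0, t_predict=3.0))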
2.1.1 Predictions
When connecting to a specific target b, source a must initially determine the desti-
nation b's geographic position as well as the intermediate hops at the time the first packet
enters the individual nodes. This phase therefore involves a location prediction, in addition
to the prediction of propagation delay. It is to be observed that the location prediction is
employed to determine the geographical position of any node, either an intermediary node
or the target, at the future instant t_p at which the packet enters it.
For updates containing node motion direction information, only one preceding
update is necessary if the position is to be predicted. For a given node,
the computation of the projected position is then exactly the same as the periodic
estimation of the position performed by node b itself.
An adaptive genetic algorithm (GA) was introduced by J. Holland for use as a search
algorithm. GAs have effectively handled several fields of application and are capable of
resolving an extensive array of complicated numerical optimization problems.
GAs need no gradient information and are comparatively less likely to be trapped in
local minima on multi-modal search spaces. GAs prove to be reasonably insensitive
to the presence of noise. The pseudocode of the GA method is given below:
begin GA
g = 0
Initialize population P(g)
Evaluate P(g)
while not done do
    g = g + 1
    Select P(g) from P(g − 1)
    Crossover P(g)
    Mutate P(g)
    Evaluate P(g)
end while
end GA
The above problem is encoded by GAs within chromosomes which represent every
possible solution. The optimization problem is stated as

min g(s), s ∈ S

where N represents the neighborhood function of the problem instance (S, g); it is a
mapping from S to its powerset:

N: S → 2^S
N(s) denotes the neighborhood of s, and it contains each possible solution which can be
reached via a single move from s. A move is an operator which transforms one solution
into another with small modifications. A solution x is then called a local minimum of g
with respect to the neighborhood N if

g(x) ≤ g(y) for all y ∈ N(x).

The local search process for minimizing the cost function g consists of consecutive
steps, in each of which the current solution x is exchanged for a solution y ∈ N(x)
such that

g(y) < g(x).
Local search typically starts with an arbitrary solution and ends when a local minimum
is found. There are multiple ways to conduct local search, and the complexity of a local
search computation depends on the size of the neighborhood set and the time required
to evaluate a move. As the neighborhood grows in size, the time required to search it
increases, but a better local minimum may be determined. Local search makes use of
the concepts of state space, neighborhood, and objective function.
i. State space S: It is the collection of potential states that can be reached at some
point in the search.
ii. Neighborhood N(s): It is the collection of states (neighbors) that can be reached
from the state s in one step.
iii. Objective function f (s): It is a value that signifies the quality of the state s. The
best possible value of the function is attained at a state s that is a solution.
Pseudocode for local search is as follows:
Select an initial state s0 ∈ S.
While s0 is not a solution do
    Select, by some heuristic, s ∈ N(s0) such that f(s) > f(s0)
    Replace s0 by s.
Predictable Mobility-Based Routing Protocol in Wireless … 33
In the genetic algorithm, four parameters are available: the size of the population,
the crossover probability, the mutation probability, and the weight accuracy of the
influence factors. Figure 3 shows the flowchart for the proposed method.
(Fig. 3 Flowchart of the proposed method: chromosome coding, initial population, local search, check of the optimization criteria, end)
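The paper's exact chromosome encoding and fitness function are not given in this excerpt; as an illustration only, the sketch below assumes a binary chromosome that marks cluster heads and a fitness that rewards residual energy and penalises member-to-head distance.

import random

random.seed(1)
NODES = [(random.random() * 100, random.random() * 100, random.random()) for _ in range(20)]
# each node: (x, y, residual_energy in [0, 1])

def fitness(chrom):
    """Higher is better: reward energetic cluster heads, penalise member-to-head distance."""
    heads = [i for i, bit in enumerate(chrom) if bit]
    if not heads:
        return float("-inf")
    energy = sum(NODES[i][2] for i in heads)
    dist = sum(min(((NODES[j][0] - NODES[h][0]) ** 2 + (NODES[j][1] - NODES[h][1]) ** 2) ** 0.5
                   for h in heads) for j in range(len(NODES)))
    return energy - 0.05 * dist

def ga_select_heads(pop_size=30, generations=50, p_cross=0.8, p_mut=0.02):
    pop = [[random.randint(0, 1) for _ in NODES] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = pop[:2]                               # elitism: keep the two best
        while len(next_pop) < pop_size:
            a, b = random.sample(pop[:10], 2)            # selection from the fitter half
            cut = random.randrange(1, len(NODES))
            child = a[:cut] + b[cut:] if random.random() < p_cross else a[:]
            child = [bit ^ (random.random() < p_mut) for bit in child]   # bit-flip mutation
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    return [i for i, bit in enumerate(best) if bit]

print(ga_select_heads())   # indices of the nodes chosen as cluster heads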
The performance evaluation parameters considered here are the packet delivery ratio,
throughput, end-to-end delay, and network lifetime, evaluated using the existing
MADAPT algorithm, the previous work QoS-aware channel load-based mobility
adaptive routing protocol (QoS-CLMARP), and the proposed predictable
mobility-based routing scheme (PMRS).
The results of the end-to-end delay are illustrated in Fig. 5. In the existing MADAPT
and QoS-CLMARP methods, the end-to-end delay is lower, while in the proposed
system, the end-to-end delay is improved considerably by the PMRS method (Fig. 5).
The results of the network lifetime are illustrated in Fig. 6. In the existing MADAPT
and QoS-CLMARP methods, the network lifetime is lower. In the proposed system, the
network lifetime is improved significantly by the PMRS method. In Fig. 7, the packet
delivery ratio performance is substantially improved by the PMRS approach in
the proposed system.
In Fig. 8, the proposed PMRS accomplishes a lower packet loss ratio compared
with the other two methods.
5 Conclusion
In this work, future node movement is predicted based on node direction and
motion angles toward the destination node. By predicting the node mobility in the future,
it can be concluded whether the node is nearest to the destination or not. Thus, a better
route path can be established for successful data transmission. Based on node
movement, the optimal cluster head is selected; thus, the shortest and most reliable
path can be achieved between the source and destination nodes. In this work, cluster
head selection is completed by using the genetic algorithm, which can ensure reliable
transmission of the nodes without node failure. Finally, data transmission is performed
through the cluster head node using the time division multiple access (TDMA) method.
References
1. Kim BS, Park H, Kim KH, Godfrey D, Kim KI (2017) A survey on real-time communications
in wireless sensor networks. Wireless Commun Mob Comput
2. Oliver R, Fohler G (2010) Timeliness in wireless sensor networks: common misconceptions.
In: Proceedings of international workshop on real-time networks, July 2010
3. Collotta M, Costa DG, Falcone F, Kong X (2016) New challenges of real-time wireless sensor
networks: theory and applications. Int J Distrib Sens Netw 12(9)
Abstract Due to fierce competition in the electrical power industry, growing envi-
ronmental issues and an ever-increasing demand for electric energy, optimiza-
tion of the economic load dispatch problem has become a compulsion. This paper
emphasizes a novel modified version of PSO to obtain an optimized solution of
the economic load dispatch problem. In the paper, exponential particle swarm opti-
mization (EPSO) is introduced, and a comparison has been performed on the basis of
speed of convergence and its stability. The proposed novel method of exponential
PSO has shown better performance in terms of convergence speed and stability.
1 Introduction
PSO is an evolutionary and intelligent process that is influenced by fish schools
and flocks of birds. The symbiotic cooperation between the inhabitants of
a society is the principle behind this computational technique. PSO
requires fewer parameters for evaluation and has a higher convergence speed than
many arithmetic techniques. Over the years, there has been massive analysis
of the PSO technique, and algorithms have been derived based on the improve-
ment of population diversity and parameter adjustment. The first kind of parameter
is used for finding the equilibrium between global searching and local searching
[4]; algorithms of this kind include CPSO, DPSO, and LWPSO. To avoid
premature convergence, the second kind of parameter is employed for obtaining algo-
rithms, and for significant improvement of performance, techniques like natural selection
are used [5]. In this paper, the primary focus will be on the first kind of parameter due to
its lower computational cost, relatively lesser complexity, and the efficiency of its
parameter strategies.
In early modifications of PSO, an inertia weight coefficient was introduced into the velocity update equation. The inertia weight was decreased in a constant linear manner (LWPSO) [6]. This technique helped increase the convergence speed and obtain a balance between local and global search exploitation. However, local exploration was compromised because of the reduction of the inertia weight in a constant linear way. Thus, a further modification was made, and the inertia weight was decreased by a damping factor rather than in a linear manner [7]. This process increased the convergence speed but affected the balance between local and global probing of the global optimum value. Further research introduced a constriction factor, thereby removing the inertia weight that had been brought up in earlier papers in the velocity update equation. A constriction factor value of 0.729 was found to render the best optimum solutions [8]. This technique demonstrated that dynamic updating of the velocity equation could improve the local search of an optimal solution and the convergence speed without adding any complexity to the PSO technique.
Deeply inspired by these improvements of the PSO technique, a novel method of exponential particle swarm optimization (EPSO) is introduced in this paper. In this method, the inertia weight eliminated by CPSO is re-introduced. The inertia weight in this method depends on MaxIt, the maximum number of iterations [9]. This gives a large decay step in the early stage of the algorithm, which enhances the convergence speed, and in the later stage the decay step decreases considerably, allowing local exploration and thereby balancing local and global exploration. In this paper, the ELD problem is solved with techniques such as LWPSO, DPSO, CPSO, and EPSO, and the solutions obtained by each algorithm, their convergence speed, and their convergence stability are compared.
where c1 and c2 are positive coefficients, and r1(·) and r2(·) are random variable functions.
The inertia weight of LWPSO was introduced in earlier research papers, where the inertia-based velocity equation was given by
vi(n + 1) = w vi(n) + c1 r1 (Pb(n) − xi(n)) + c2 r2 (Pg(n) − xi(n)) (3)
w = wmax − ((wmax − wmin) · it) / MaxIt (4)
where 'MaxIt' denotes the maximum number of iterations, 'it' is the current iteration, and wmin and wmax are constants having the values 0.4 and 0.9, respectively. This technique found a balance between local and global searching and also improved the convergence speed. Further research in this field deduced that decreasing the inertia weight by a damping factor in the velocity update equation produced a better convergence speed.
w = w ∗ wdamp (5)
And,
χ = 2 / (2 − φ − √(φ² − 4φ)) (7)
Several experiments were conducted to determine the value of φ. The value was found to be 4.1, which results in χ = 0.729. In this case, the algorithm gives the best performance for finding the optimal solution.
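As a concrete illustration of Eq. (7), the following minimal Python sketch computes the constriction factor and applies one constriction-based (CPSO) velocity update. The split c1 = c2 = 2.05 (so that φ = c1 + c2 = 4.1) and the example vectors are assumptions for illustration only, not values taken from the paper.

```python
import numpy as np

phi = 4.1
chi = 2.0 / abs(2.0 - phi - np.sqrt(phi**2 - 4.0 * phi))   # evaluates to ~0.729

def cpso_velocity(v, x, p_best, g_best, c1=2.05, c2=2.05, rng=np.random.default_rng()):
    """One constriction-factor (CPSO) velocity update for a single particle."""
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    return chi * (v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x))

# Example: a 3-dimensional particle with made-up positions.
v = np.zeros(3)
x, p_best, g_best = np.ones(3), np.array([0.5, 1.0, 2.0]), np.zeros(3)
print(cpso_velocity(v, x, p_best, g_best))
```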
Deeply inspired by this, a new method, exponential PSO (EPSO), is introduced in this paper. In this method, the inertia weight in (3) is modified. The new form of the inertia weight is:
w = (1 − 1/MaxIt)^MaxIt (8)
Now, since the maximum number of iterations is large, expression (8) can be expressed in an approximate exponential form.
Thus, with the help of this algorithm, we can have a large step in the initial stage of the computation and a smaller one toward the end of the computation. Thus, equilibrium is maintained between the local searching and the global searching for the problem. This is achieved without adding any complication to the algorithm, which is quite essential for an evolutionary algorithm. A convergence speed better than that of damped particle swarm optimization (DPSO), constriction particle swarm optimization (CPSO), and linear weight particle swarm optimization (LWPSO) was obtained with the help of this algorithm, and the stability of convergence was found to be the best when the ELD problem was exposed to the algorithms. The numerical results for these algorithms are discussed in a further section.
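To summarize the three inertia-weight strategies just discussed, the following is a minimal Python sketch. Equation (8) as printed yields a single value, so the EPSO function below assumes a per-iteration reading in which the factor (1 − 1/MaxIt) is applied once per iteration; for large MaxIt this behaves like exp(−it/MaxIt) and matches the described behaviour of a large decay step early and a smaller one later. The damping value in dpso_w is likewise an assumed illustrative number.

```python
import numpy as np

def lwpso_w(it, max_it, w_max=0.9, w_min=0.4):
    # Eq. (4): linearly decreasing inertia weight.
    return w_max - (w_max - w_min) * it / max_it

def dpso_w(w, w_damp=0.99):
    # Eq. (5): the inertia weight is multiplied by a damping factor after each
    # iteration (0.99 is an assumed illustrative value, not the paper's).
    return w * w_damp

def epso_w(it, max_it):
    # Assumed per-iteration reading of Eq. (8): w_it = (1 - 1/MaxIt)**it,
    # which for large MaxIt behaves like exp(-it / MaxIt).
    return (1.0 - 1.0 / max_it) ** it

max_it = 200
print(lwpso_w(0, max_it), lwpso_w(max_it, max_it))   # 0.9 ... 0.4
print(epso_w(0, max_it), epso_w(max_it, max_it))     # 1.0 ... ~0.367
```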
Power generation in a thermal power plant takes place by the rotation of the prime mover in the turbine under the action of steam. The working fluid in the thermal power plant is water. Water is fed to the boiler and super-heater, which convert it to steam. The steam, which carries thermal energy, is allowed to expand in the turbine, which rotates the rotor shaft of the generator. The steam loses energy, is condensed, and is then pumped back to the boiler to be heated up again. The factors which affect the operating cost include the transmission losses, the fuel costs, and the efficiency of the generators in action. Usually, labor, maintenance, and operation costs are fixed. A typical fuel cost curve of a generating unit is depicted below in Fig. 1.
The minimum power which can be extracted from a generating unit, below which it is not feasible to operate the plant, is Pimin [10]. The maximum power which can be obtained from a generating unit is Pimax.
The main objective of ELD is to reduce the total generation cost. The problem can be formulated as follows:
Minimise XT = Σ (i = 1 to n) Fi(Pi) (10)
where αi, βi and γi represent the fuel cost coefficients for the ith generating unit.
This problem has inequality and equality constraints [12].
The cumulative real power generated by the generating units in the case under study should be equal to the sum of the transmission losses and the system demand power, which gives the equality constraint.
Σ (i = 1 to n) Pi = PD + PL (12)
where
PD = Demand Power (MegaWatt).
PL = Transmission Losses (MegaWatt).
Here,
For the ith unit, Pimax is the maximum possible real power and Pimin is the minimum possible real power [10]. Pi is the power being generated by the unit. It may not exceed the real power generated in the preceding interval by more than a fixed amount URi, the up ramp rate limit, and it may not fall below that power by more than DRi, the down ramp rate limit [13].
So, the following constraints arise:
Max(Pimin, Pi° − DRi) ≤ Pi ≤ Min(Pimax, Pi° + URi) (15)
where Pi° is the real power generated in the preceding interval.
The incremental fuel cost curve of the generating units in ELD is presumed to be a monotonically increasing linear function. Therefore, the input–output characteristic is quadratic in nature. However, because of the valve point effect, non-linearities and discontinuities of higher order are displayed by the input–output curve [9]. Thus, the original function is modified to consider these constraints. An added periodic sinusoidal function models the valve point effect, which is represented as:
Fi(Pi) = αi + βi Pi + γi Pi² + |ei × sin(fi × (Pimin − Pi))| (16)
where ei and fi represent the fuel cost coefficients of the ith generating unit corresponding to the valve point effect.
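To make Eq. (16) concrete, the following is a minimal Python sketch of the per-unit fuel cost with the valve-point term and of the total cost of Eq. (10). All coefficient values shown are illustrative placeholders, not data from this paper.

```python
import numpy as np

def fuel_cost(P, alpha, beta, gamma, e, f, P_min):
    """Fuel cost of one unit, Eq. (16), with the valve-point sinusoidal term."""
    return alpha + beta * P + gamma * P ** 2 + abs(e * np.sin(f * (P_min - P)))

def total_cost(P_list, unit_coeffs):
    """Eq. (10): total cost is the sum of the individual unit costs.
    unit_coeffs is a list of (alpha, beta, gamma, e, f, P_min) tuples."""
    return sum(fuel_cost(P, *c) for P, c in zip(P_list, unit_coeffs))

# Illustrative (made-up) data for two units.
coeffs = [(100.0, 2.0, 0.002, 50.0, 0.063, 50.0),
          (120.0, 1.8, 0.003, 40.0, 0.098, 40.0)]
print(total_cost([200.0, 150.0], coeffs))
```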
The existence of steam valves inside the thermal power plant generates vibrations in the shaft bearings, which results in the creation of zones that are restricted for operation in the fuel cost function. Non-segregated auxiliary operating equipment such as boilers and feed pumps is another reason. The shape of the fuel cost curve cannot be predicted within the prohibited zones. Precluding operation of the units in these regions is the optimal solution. The cost curve with prohibited zones is shown below in Fig. 2.
This can be mathematically represented as follows:
Pimin ≤ Pi ≤ Pi,1lower (17)
Pi,k−1upper ≤ Pi ≤ Pi,klower , k = 2, 3, …, ni (18)
Pi,niupper ≤ Pi ≤ Pimax (19)
where the lower real power limit of the kth prohibited zone of the ith unit is denoted by Pi,klower, the upper limit of the (k − 1)th prohibited zone of the ith unit is denoted by Pi,k−1upper, and ni is the number of prohibited zones of the ith generating unit [14].
Thus, these are the constraints taken into consideration in the ELD problem, and solutions have been acquired using different versions of PSO.
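The constraints above can be checked with a short Python sketch such as the one below, assuming the usual B-coefficient quadratic loss model for the transmission losses of Eq. (12); the function names and the toy numbers are illustrative, not from the paper.

```python
import numpy as np

def transmission_loss(P, B):
    # Assumed B-coefficient quadratic loss model: PL = P^T B P.
    P = np.asarray(P, dtype=float)
    return float(P @ B @ P)

def power_balance_ok(P, P_D, B, tol=1e-3):
    # Equality constraint of Eq. (12): sum(Pi) = PD + PL.
    return abs(np.sum(P) - (P_D + transmission_loss(P, B))) <= tol

def ramp_limits(P_prev, P_min, P_max, UR, DR):
    # Eq. (15): effective operating limits given the up/down ramp rates.
    return max(P_min, P_prev - DR), min(P_max, P_prev + UR)

def outside_prohibited_zones(P_i, zones):
    # Eqs. (17)-(19): 'zones' is a list of (lower, upper) prohibited intervals.
    return all(not (lo < P_i < up) for lo, up in zones)

print(ramp_limits(P_prev=300.0, P_min=100.0, P_max=500.0, UR=80.0, DR=120.0))
print(outside_prohibited_zones(310.0, zones=[(210, 240), (350, 380)]))   # True
```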
A power system having six generating units is considered. It is used for demonstrating the application of the various modified PSO methods, and results were obtained. Table 1 contains the fuel cost coefficients of the generating units, and Table 2 contains the characteristics of the generating units. Table 3 shows the prohibited zones.
The B-coefficients are used to compute the transmission losses in the given power system.
Different modified techniques of PSO are deployed for the calculation of the total power generation cost and the power generated by each unit. Apart from this, a comparison is drawn among the different modified techniques of PSO. Two indices govern the assessment of the different optimization methods: convergence speed and convergence stability. A better stochastic algorithm is one having better convergence stability and speed. The time taken for the computation of the various algorithms is also compared in this paper.
By considering all the inequality and equality constraints, the ELD problem is solved by the different innovative techniques of PSO. Table 4 shows the results obtained. A demand power of 1200 MW is obtained from the system. The PSO algorithms are carried out with each algorithm being run for 200 iterations and a population size of 300.
The total power generated is 1146.37 MW, out of which 1100 MW is used to meet the demand and 46.37 MW is lost in transmission. The mean total cost is nearly the same for all the modified versions of PSO, as shown in Table 4.
A convergent algorithm is one which, after a definite number of iterations, reaches an optimal region. An algorithm is called divergent when the optimal region is not reached. The slope of the convergence curve determines the convergence speed [15].
The convergence curves of all the versions of PSO are shown in Fig. 3. In the convergence curve, the vertical axis represents the total cost, whereas the horizontal axis denotes the number of iterations for the modified algorithms. Thus, it can be concluded that EPSO performs better than DPSO, CPSO, and LWPSO when each of these algorithms is run for 200 iterations.
6 Conclusion
In this paper, PSO and its modified algorithms have been successfully implemented for solving the ELD problem. PSO is a stochastic algorithm inspired by nature, and the presence of fewer variants gives it a lead over the other nature-inspired evolutionary techniques. Thus, a newer version of PSO has been successfully implemented, and a comparison has been drawn with the existing versions of PSO based on convergence stability, convergence speed, and the total mean cost. An analysis of the stability of convergence for the different versions has been performed. The new method (EPSO) has better convergence stability and convergence speed than the pre-existing models (DPSO, LWPSO, and CPSO). However, the total minimum cost of the new method is almost equal to that of the existing models. The dynamic step in the modification of the velocity of a particle is given by the iteratively weighted term in the velocity equation. At a later stage, the step gets smaller, therefore ensuring local exploration. Thus, an equilibrium has been established between local and global exploration. The novel PSO technique can easily be employed in various applications of power system optimization.
References
1. Alam MN (2018) State-of-the-art economic load dispatch of power systems using particle
swarm optimization. arXiv preprint arXiv:1812.11610
2. Shi Y (2001) Particle swarm optimization: developments, applications and resources. In:
Proceedings of the 2001 congress on evolutionary computation (IEEE Cat. No. 01TH8546),
pp 81–86. IEEE
3. Sharma J, Mahor A (2013) Particle swarm optimization approach for economic load dispatch:
a review. Int J Eng Res Appl 3:013–022
4. Kalayci CB, Gupta SM (2013) A particle swarm optimization algorithm with neighborhood-
based mutation for sequence-dependent disassembly line balancing problem. Int J Adv Manuf
Technol 69:197–209
5. Shen Y, Wang G, Tao C (2011) Particle swarm optimization with novel processing strategy and
its application. Int J Comput Intell Syst 4:100–111
6. Abdullah SLS, Hussin NM, Harun H, Abd Khalid NE (2012) Comparative study of random-
PSO and Linear-PSO algorithms. In: 2012 international conference on computer & information
science (ICCIS), pp 409–413. IEEE
7. He M, Liu M, Jiang X, Wang R, Zhou H (2017) A damping factor based particle swarm
optimization approach. In: 2017 9th international conference on modelling, identification and
control (ICMIC), pp 13–18. IEEE
8. Eberhart RC, Shi Y (2000) Comparing inertia weights and constriction factors in particle swarm
optimization. In: Proceedings of the 2000 congress on evolutionary computation. CEC00 (Cat.
No. 00TH8512), pp 84–88. IEEE
9. Pranava G, Prasad P (2013) Constriction coefficient particle swarm optimization for economic
load dispatch with valve point loading effects. In: 2013 international conference on power,
energy and control (ICPEC), pp 350–354. IEEE
10. Mondal A, Maity D, Banerjee S, Chanda CK (2016) Solving of economic load dispatch problem
with generator constraints using ITLBO technique. In: 2016 IEEE students’ conference on
electrical, electronics and computer science (SCEECS), pp 1–6. IEEE
11. Arce A, Ohishi T, Soares S (2002) Optimal dispatch of generating units of the Itaipú
hydroelectric plant. IEEE Trans Power Syst 17:154–158
12. Dihem A, Salhi A, Naimi D, Bensalem A (2017) Solving smooth and non-smooth economic
dispatch using water cycle algorithm. In: 2017 5th international conference on Electrical
Engineering-Boumerdes (ICEE-B), pp 1–6. IEEE
13. Dasgupta K, Banerjee S, Chanda CK (2016) Economic load dispatch with prohibited zone and
ramp-rate limit constraints—a comparative study. In: 2016 IEEE first international conference
on control, measurement and instrumentation (CMI), pp 26–30. IEEE
14. Hota PK, Sahu NC (2015) Non-convex economic dispatch with prohibited operating zones
through gravitational search algorithm. Int J Electr Comput Eng 5
15. Li X (2004) Better spread and convergence: particle swarm multiobjective optimization using
the maximin fitness function. In: Genetic and evolutionary computation conference, pp 117–
128. Springer, Berlin
16. Ding W, Lin C-T, Prasad M, Cao Z, Wang J (2017) A layered-coevolution-based attribute-
boosted reduction using adaptive quantum-behavior PSO and its consistent segmentation for
neonates brain tissue. IEEE Trans Fuzzy Syst 26:1177–1191
17. Clerc M, Kennedy J (2002) The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans Evol Comput 6(1):58–73
Risk Index-Based Ventilator Prediction
System for COVID-19 Infection
Amit Bhati
Abstract The 2019 outbreak of the coronavirus disease (COVID-19) constitutes a public health crisis of worldwide concern. Ongoing research shows that factors such as immunity, environmental effects, age, heart disease and diabetes are significant contributors to this infection becoming chronic. In this paper, a combined machine learning model and rule-based framework is proposed to offer medical decision support. The proposed system consists of a robust machine learning model utilizing the gradient boosted tree technique to calculate a CRI index for patients suffering from the COVID-19 disease. This index is a measurement of a COVID-19 patient's mortality risk. Based on the CRI index, the system predicts the required number of ventilators in the forthcoming days. The suggested model is trained and evaluated using a real-time dataset of 5440 COVID-19 positive patients obtained from Johns Hopkins University, the World Health Organization, and a dataset of Indian COVID-19 patients obtained from the open government data (OGD) platform of India.
A. Bhati (B)
Institute of Engineering and Technology, Dr. RML Awadh University, Ayodhya 224001, UP, India
e-mail: amitsbhati@gmail.com
1 Introduction
pandemics [3]. Similarly, a situation currently prevails where critically ill COVID-19 patients are struggling all around the globe because of the lack of access to a few of these technologies [4]. Ventilators are one such case and are currently in critically short supply [5, 6]. Ventilators are essential for the treatment of both flu and COVID-19 patients in severe acute respiratory failure [7, 8]. Earlier investigations have indicated that intensive care units (ICUs) will not have adequate resources for providing better treatment to all patients who need ventilator support during the pandemic period [9, 10].
A compelling report of Imperial College London estimates that 30% of hospitalized COVID-19 patients are likely to require ventilator support [11]. As a result, a shortage of ventilators remains unavoidable in several places of the world. Andrew Cuomo, Governor of New York, requested 30,000 ventilator units for the treatment of COVID-19 patients [12]. Even in India, the government has suggested that automobile companies produce low-cost ventilators rather than vehicles in this pandemic situation.
Regarding their functionality, ventilators are incredibly reliable machines comprising sophisticated pumps which control the flow of oxygen and air to the patient's lungs, supporting the lungs while they cannot accomplish their work. As per the World Health Organization (WHO), COVID-19 can overwhelm clinical facilities at the territorial level by causing rapid growth in death rates [13, 14].
2 Background of Study
Since the seriousness of a COVID-19 infection is firmly related to its prognosis, the fundamental techniques to improve outcomes are the early identification of high-risk and critically sick patients. Zhou et al. [15] reported findings from 191 COVID-19 patients during the first days of the spread in Wuhan and followed the patient conditions until their discharge. Their findings reported that the critically affected patients were mostly older than 56 years, with a high proportion of men (62%), and that almost half of the patients had at least one comorbidity (48%) [15]. In another report related to the Wuhan City of China, the mortality rate was 62% among critically sick patients suffering from COVID-19, and 81% of those required ventilators [16].
Since the start of the pandemic, there has been a scramble to utilize and investigate ML and other scientific analytic methods for these reasons. ML methods can precisely anticipate how COVID-19 will affect resource needs such as ventilators, ICU beds, and so forth at the individual patient level and at the clinic level, thereby giving a solid image of future resource utilization and empowering medical services experts to make well-informed decisions about how these scarce resources can be utilized to accomplish the greatest benefit. In this paper, we use the gradient boosted machine learning technique to identify the optimally required number of ventilators for COVID-19 patients.
3.1 Dataset
The proposed research work has examined a dataset containing the clinical records of 5440 COVID-19 patients collected from confirmed sources, for example, Johns Hopkins University, the WHO, and the open government data (OGD) and Government of India sites. These sites have announced the details of COVID-19 cases. In our experimentation, we have considered cases registered during the months of February and March and the first week of April 2020. The patients include both women and men with ages ranging from 21 to 91 years. The dataset comprises 9 features reporting the age, gender, and clinical history of patients suffering from COVID-19.
In Table 1, except for the date of admission, age and gender, all features are of a binary nature, such as high blood pressure, cardiac disease, diabetes, nervous system illness, respiratory disease, pregnancy/childbirth, cancer, and tuberculosis. Table 2 displays the death rate for each specific feature class.
In the data preparation step, the proposed framework fills the missing values in the input dataset with mean values, especially for numerical data. For each feature, we have calculated the correlation coefficient as shown in Table 1.
CRIi = Σ (i = 0 to 10) Fi × Ai (3)
The CRI index obtained from Eq. (3) for the ith patient is not normalized. Using these raw CRI index values can degrade the performance of the entire learning model. Hence, the data quality ought to be improved before training a learning model. Normalization of the CRI can resolve this issue, so in the next step we normalize the CRI as:
CRI(N)i = (CRIi − Min(CRI0−n)) / (Max(CRI0−n) − Min(CRI0−n)) (4)
The normalized CRI index value obtained from Eq. (4) is calculated for every patient record. This processed dataset is now ready for training purposes. For training and validation of the trained model, the dataset is divided into two parts: 70% of the records are used for training and the remaining 30% are used for validating the prediction of the CRI index.
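A minimal Python sketch of Eqs. (3) and (4) is given below; the feature weights and toy patient rows are placeholders for illustration, not the factors actually used in the paper.

```python
import numpy as np

def cri(features, weights):
    # Eq. (3): CRI of one patient as a weighted sum over the feature values.
    return float(np.dot(weights, features))

def normalize_cri(cri_values):
    # Eq. (4): min-max normalization of the raw CRI values across all patients.
    cri_values = np.asarray(cri_values, dtype=float)
    lo, hi = cri_values.min(), cri_values.max()
    return (cri_values - lo) / (hi - lo)

rng = np.random.default_rng(0)
weights = rng.random(9)                        # one weight per feature (assumed)
patients = rng.integers(0, 2, size=(5, 9))     # toy binary feature rows
raw = [cri(p, weights) for p in patients]
print(normalize_cri(raw))                      # values scaled to [0, 1]
```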
In order to train the model for prediction, the training operation is performed using random forest, deep learning, gradient boosted trees, decision tree, and support vector machine approaches. The gradient boosting technique is an ML procedure for regression and classification problems, which delivers a prediction model as a collection of weak prediction models, generally decision trees. That is to say, gradient boosting is commonly utilized with decision trees [17]. Like other boosting techniques, gradient boosting combines weak "learners" into a single strong learner in an iterative fashion.
At each step, G(r+1) attempts to correct the errors of its predecessor Gr.
The output of the GBT model is fed to a ventilator prediction process which uses an adaptive threshold value based on the mortality rate in the region and forecasts the expected number of ventilators required in the near future based on the statistics of the last 10 days together with the predicted CRI index of the patients. The adaptive threshold is computed automatically from the mortality rates in the specific region, as the requirement for ventilators also depends on the immunity of the people living in a particular region. For example, the immunity of people living in India may differ from that of people living in other countries. So, in order to provide a good estimation, the adaptive threshold is utilized.
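A minimal sketch of this training step with scikit-learn is shown below. The placeholder data, the hyperparameters and the name adaptive_threshold are assumptions for illustration; they are not the values or variable names used in the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X_train, y_train = rng.random((100, 9)), rng.random(100)   # placeholder data, 9 features
X_val = rng.random((40, 9))

# Hyperparameters here are illustrative defaults, not values from the paper.
gbt = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1, max_depth=3)
gbt.fit(X_train, y_train)
cri_pred = gbt.predict(X_val)

# An assumed region-specific adaptive threshold converts the predicted CRI
# values into an expected ventilator count for the forthcoming period.
adaptive_threshold = 0.7
ventilators_needed = int((cri_pred >= adaptive_threshold).sum())
print(ventilators_needed)
```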
To check the acceptability of the proposed system, we use the T-test statistical method. In our case, the T-test allows us to compare the mean of the number of required ventilators obtained from our prediction model with the actual number of ventilators used for curing COVID-19 patients. The T-test for a single mean can be given as:
t = (X̄ − µ) / (S / √n) (6)
where X̄ and µ are the sample mean (calculated with the help of the predicted output) and the population mean (actual output), respectively, which can be calculated using Table 3. S represents the standard deviation of the predicted output, and n is the total number of
Table 3 Predicted number of ventilators required versus actual ventilators from testing dataset
Date | No. of patients registered | Actual ventilators used during treatment | Predicted required no. of ventilators | % Accuracy
20-Mar-2020 248 38 32 84.21
04-Apr-2020 3299 331 296 89.42
12-Apr-2020 7790 372 331 88.97
19-Apr-2020 13,888 446 417 93.49
26-Apr-2020 20,483 538 510 94.79
03-May-2020 29,549 612 579 94.60
10-May-2020 43,989 752 703 93.48
17-May-2020 55,875 881 799 90.69
24-May-2020 76,809 971 901 92.79
01-Jun-2020 97,008 1024 956 93.35
09-Jun-2020 133,579 2241 2159 96.34
17-Jun-2020 160,517 2839 2607 91.82
25-Jun-2020 190,156 3512 3374 96.07
03-Jul-2020 236,832 4587 4302 93.78
samples used. The degrees of freedom are (n − 1). The simplified form of Eq. (6) can be specified by Eq. (7).
t = |36.1 − 33.4| / (16.07 / √10) (7)
The degrees of freedom = 9 and tcal = 0.91. Using the one-tailed T-table value with α = 0.01, t9,0.01 = 2.821. Because tcal << t9,0.01, we can say that the proposed system is highly acceptable.
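The acceptance check of Eq. (6) can be sketched in a few lines of Python; the helper names below are assumptions for illustration and only reproduce the test logic, not the paper's exact numbers.

```python
import numpy as np
from scipy import stats

def one_sample_t(sample_mean, pop_mean, s, n):
    # Eq. (6): t = (X_bar - mu) / (S / sqrt(n))
    return (sample_mean - pop_mean) / (s / np.sqrt(n))

def acceptable(sample_mean, pop_mean, s, n, alpha=0.01):
    # The system is accepted when |t_cal| stays below the one-tailed critical
    # value t_(n-1, alpha), as argued above.
    t_cal = abs(one_sample_t(sample_mean, pop_mean, s, n))
    return t_cal < stats.t.ppf(1 - alpha, df=n - 1)

print(stats.t.ppf(1 - 0.01, df=9))   # ~2.821, the critical value quoted above
```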
The effectiveness of the GBT learning model is extensively studied and compared with other learning techniques. The performance of the proposed model is tested over 1632 patient records used for testing from the dataset obtained from the web resources of Johns Hopkins University and the WHO. Root mean square error (RMSE) and absolute error (AE) define how well a machine learning model performs during training. RMSE and AE for all the machine learning models used in our experimentation are calculated using Eqs. (8) and (9), respectively.
RMSE = √( Σ (i = 1 to n) (Yi − Ŷi)² / n ) (8)
AE = Σ (i = 1 to n) |Yi − Ŷi| / n (9)
where Yi is the actual value of the CRI index calculated using Eq. (4) and Ŷi is the CRI index value predicted by the machine learning model for the ith test dataset record.
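Both metrics are straightforward to compute; the following is a minimal sketch using illustrative toy values.

```python
import numpy as np

def rmse(y_true, y_pred):
    # Eq. (8): root mean square error between actual and predicted CRI values.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def absolute_error(y_true, y_pred):
    # Eq. (9): mean absolute error between actual and predicted CRI values.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

y_true, y_pred = [0.2, 0.5, 0.9], [0.25, 0.45, 0.8]     # toy values
print(rmse(y_true, y_pred), absolute_error(y_true, y_pred))
```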
The experimentation is done on an Intel Xeon processor with 32 GB RAM and Nvidia GeForce GTX1080 GPU-supported hardware. All the machine learning models used in the experimentation are trained and tested in Python utilizing the Jupyter tool with the scikit-learn library. In the experimentation, gradient boosted trees are found to be the best model for training on our dataset as they have a low RMSE and AE compared to the counterpart methods, as depicted in Fig. 1. The next set of experiments with the trained model has been done on confirmed Indian COVID-19 patients to find the effectiveness of our trained model. The dataset of Indian COVID-19 patients is made available by open government data, India, with limited patient medical details and their personal identification hidden.
Table 3 depicts the accuracy of the prediction model for the number of COVID-19 patients registered on a particular date, among which some patients actually utilized ventilators in the real scenario. Figure 2 depicts the predicted required number of ventilators versus the actual required ventilators, particularly in the case of COVID-19 in India.
Fig. 1 Performance evaluation of machine learning approach in terms of root mean squared error,
absolute error
Fig. 2 Predicted number of ventilators required and actual ventilators used in Rajasthan, India, for
COVID-19 positive cases
The COVID-19 pandemic has already claimed a large number of lives, and the number is increasing day by day at an exponential rate. As healthcare resources are constrained by the same scarcity limitations that influence every one of us, it has become imperative to prepare intensive care to battle against such sickness. As stated by the Hon'ble Prime Minister of India, in the future more ventilators need to be produced to fight this pandemic. In this paper, we focused on the prediction of ventilator requirements based on the CRI index, which is calculated from the COVID-19 patient's medical history. By finding the CRI index of COVID-19 patients, physicians can pay more attention to their specific treatment. With the proposed model, it can be hoped that widespread adoption by healthcare data science communities will lead to more effective intervention strategies and ultimately help to curtail the worst effects of this pandemic. The average performance of the proposed model could be enhanced by utilizing stacking of models and training on a larger COVID-19 patient dataset.
References
1. World Health Association Coronavirus disease 2019 (COVID-19) situation Report—61. Avail-
able from: https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200322-
sitrep-62-covid-19.pdf. Accessed 22 Mar 2020
2. Chen N, Zhou M, Dong X (2019) Epidemiological and clinical characteristics of 99 cases
of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet (2019).
doi:https://doi.org/10.1016/s0140-6736(20)30211-7
3. Zhang X, Meltzer M, Wortley PM (2006) FluSurge–a tool to estimate demand for hospital
services during the next pandemic influenza. Med Decis Making 26(6):617–623
4. Miller J (2020) Germany Italy rush to buy life-saving ventilators as manufacturers warn of
shortages. Technical report, Reuters
5. Neighmond P (2020) As the pandemic spreads, will there be enough ventilators. NPR
6. Rubinson L (2010) Mechanical ventilators in US acute care hospitals. Disaster Med Public
Health Prep 4(3):199–206
7. Huang HC, Araz OM, Morton DP, Jhonson GP, Damien P, Clement B, Meyers LA (2017)
Stockpiling ventilators for influenza pandemics. Emerg Infect Dis 23(6):914–921
8. Maclaren G, Fisher D, Brodie D (2020) Preparing for the most critically Ill patients with
COVID-19: the potential role of extracorporeal membrane oxygenation. JAMA
9. Smetanin P, Stiff D, Kumar A (2009) Potential intensive care unit ventilator demand/capacity
mismatch due to novel swine-origin H1N1 in Canada. Can J Infect Dis Med Microbiol
20(4):e115–e123
10. Stiff D, Kumar A, Kissoon N, Fowler R (2011) Potential pediatric intensive care unit
demand/capacity mismatch due to novel pH1N1 in Canada. Pediatr Crit Care Med 12(2):e51–
e57
11. Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and
healthcare demand. Available from: https://www.imperial.ac.uk/media/imperial-college/med
icine/sph/ide/gida-fellowships/Imperial-CollegeCOVID19-NPI-modelling-16-03-2020.pdf
Accessed 25 Mar 2020
12. Coronavirus spreading in New York like ‘a bullet train’. Available From: https://www.bbc.
com/news/world-us-canada-52012048 Accessed 25 Mar 2020
13. World Health Organization (2020) Critical preparedness, readiness and response actions for
COVID-19: interim guidance. World Health Organization, 7 March 2020
14. Ramsey L (2020) Hospitals could be overwhelmed with patients and run out of beds and
ventilators as the coronavirus pushes the US healthcare system to its limits. Business Insider
15. Zhou F, Yu T, Du R, Fan G, Liu Y (2020) Clinical course and risk factors for mortality of adult
in patients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet. https://doi.
org/10.1016/S0140-6736(20)30566-3
16. Yang X, Yu Y, Xu J, Shu H (2020) Clinical course and outcomes of critically ill patients
with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational
study. Lancet Respir Med. https://doi.org/10.1016/S2213-600(20)30079-5
17. Son LH, Tripathy HK, Acharya BR (2019) Machine learning on big data: a developmental
approach on societal applications. In: Big data processing using spark in cloud. Studies in big
data. Springer, vol 43, pp 143–165. doi: https://doi.org/10.1007/978-981-13-0550-4
IoT-Based Smart Door Lock
with Sanitizing System
Abstract During a pandemic situation, safety and security play a major role in maintaining a person's health and well-being. A safe and secure environment influences social habits, reduces stress (a feeling of freedom) and increases health protection. When people feel safe, they find it easier to relax, do all the things that comfort them and focus on their work. The ultimate goal of this paper is the complete integration of a sanitizer dispenser into the door lock system to monitor the home using a smartphone with hand hygiene, for advanced safety and security against any anomalies detected in houses, office buildings and various construction sites, or anywhere there is a need for advanced frontline security. The door lock system provides security by allowing the owner to control the building with a Bluetooth-connected, smartphone-controlled system using an Arduino UNO with a developed android application, and provides safety by using a sanitizer dispenser with a PIR sensor that requires the user to clean their hands in order to open or close the door lock. In this method, the users should provide valid login credentials in the application, which are verified with the database over the Internet, and sanitize their hands before entry is permitted, thereby reducing the spread of germs. If the credentials are invalid or if the sanitizer is not used, a buzzer rings, an SMS alert is sent to the owner of the building, and the door is kept locked, which enhances security along with safety and hygiene.
1 Introduction
A smart home automates the entire home for the benefit of the individuals, but for satisfied living, there is a need for safety and security. Some diseases, such as the corona virus disease, have been found to be easily transmissible because people fail to wash their hands [1]. Accordingly, there is a need for a disinfectant sanitizer mounted on the wall near the main door which makes users use it mandatorily before entering any facility, such as professional offices, smart homes and any other location where the site owner desires every person entering or exiting through a doorway to sanitize their hands, or the like, such as any building that features an automated digital door lock system for ensuring the security of the building. All the existing door locking systems are old-fashioned ways of accessing the system with either a traditional key or some means of radio-frequency identification (RFID) chips [2]. As security is considered to be a primary concern, a solution that provides reliable and automated security is needed. This paper describes a security system that can control the home door lock. The safety enhancements in the system should not only improve the robustness of the system but also not complicate the system accessibility; in other words, it should provide ease of access. A Bluetooth module, namely HC-05, interfaced with the Arduino UNO, connects to the Bluetooth of the phone [3]. Each user will have unique login credentials and a type of authentication, i.e. either a password or a PIN, stored in the Firebase database, which is a cloud-hosted database maintained by the owner. Users can access the door lock once the user credentials are verified with the database over the Internet, using smart devices like tablets and mobile phones with the developed application installed, which communicates with the lock via Bluetooth by sending signals. After providing valid login credentials in the application, the user shows a hand below the sanitizer; the motion detector detects the hand and dispenses a pre-set amount of disinfectant, after which the door lock is opened.
The remainder of this paper is organized as follows: Section 1 includes the intro-
duction. Related works have been discussed in Sect. 2, and the proposed system is
discussed in Sect. 3. Section 4 discusses the result of the functional prototype of
the system implemented, and Sect. 5 contains the conclusion. Section 6 includes the
future scope. References are added at the end of the paper.
2 Literature Review
As per the survey, there exist many such systems to control the door. Each system
has a unique feature. The system aims [4] to develop a door security system using
an LDR sensor, ultrasonic sensor, servo motor and laser module connected with the
Arduino and Bluetooth application. Here, the Bluetooth module controls the door
through an application and the Arduino UNO receives and processes data such as the
intensity at a particular place and the distance from all these sensors continuously.
This project focussed more on the ultrasonic sensor and LDR. The main drawback
of the system is that it does not have intrusion detection. In the study [5], a smart
door lock and lighting system using IoT for the smart home is presented. The user
can control the opening and closing of the door and can also control the lighting
using the Internet. One of the demerits here is that a relay is used to lock or unlock the door and to switch the lighting ON/OFF. A relay usually requires a high voltage to operate, and the motors also need a higher voltage and current, which cannot be supplied by the microcontroller.
The paper [6] uses a biometric lock using a fingerprint sensor with the door lock
system. The Arduino Nano is the microcontroller, and a Bluetooth module will set up
a communication between the microcontroller and the smartphone. The fingerprint
scanned in the smartphone is verified with the one stored in the android application
developed in kodular and installed on the smartphone. If it matches, a unique ID
of the lock will be sent by the application via the Bluetooth to move the servo to
the unlock position. If the fingerprint does not match, the servo moves to the lock
position. Even though a security measure is included in the system, it lacks a method of intimation to the owner in case of a mismatch. A Bluetooth-controlled Arduino-based home automation system proposed in [7] consists of a Bluetooth module, relay module, LCD, LM35 temperature sensor and water sensor connected to the Arduino UNO. Once the microcontroller is powered, the water level and temperature
are displayed on the LCD. The motor is turned ON/OFF automatically if the water
crosses the defined levels. The doors, fans and lights are also controlled by the user.
The Arduino performs operations based on the information received from the user
who controls the whole system using an application. The main advantage of the paper
is that it saves electricity and reduces human effort.
Unlike other door lock systems, Bluetooth communication has been used to
transfer signals to control the door lock as it consumes less power and a database is
maintained which can be accessed only by the owner to monitor the home. To include
a measure of safety, the sanitizer dispensing system is attached along with the door
lock system. The most important feature added in the proposed system is that it can
alert the owner of the house with an SMS and the neighbours with a buzzer ring in
case of intrusion.
The main objective of this paper is to enhance the security and safety of the door lock
system with hand sanitizer dispenser. The hardware and software requirements for the
proposed system are shown in Table 1. The mobile device (android application) will
be sending a signal via Bluetooth to the Arduino circuit [8] that acts as a connection
between the smartphone and the servo motor. The Arduino makes decisions based
on the signal received. The use of Bluetooth on smartphones is suitable for the home
environment as it provides ease of access with better security as it covers only a shorter
range than the conventional key. The PIR sensor detects the motion of the user’s hand
when shown below the sanitizer and dispenses a pre-set amount of disinfectant.
The UNO board can be powered from either the Universal Serial Bus (USB) or
an external power supply. The Arduino UNO board is connected to the computer
using the USB cable to program the board and also to power it up. In the integrated
development environment (IDE) of Arduino, under the tools menu, select the board
as Arduino UNO and port as Arduino UNO (COM3). The Arduino sketch is written
in C++ and uploaded to the Arduino UNO from the IDE [9]. The circuit connections
of the devices interfaced with the Arduino are shown in Fig. 1. The sanitizer dispenser is connected to the servo motor interfaced with pin 6 of the Arduino, and the door lock is connected to the servo motor interfaced with pin 9 of the Arduino. This completes
the hardware setup. Next, the android application is developed in Massachusetts
Institute of Technology (MIT) application inventor which is an online platform to
create android applications and it is installed on the smartphone and paired to HC-05
using Bluetooth. Once the Bluetooth module is paired with the phone, the user can start using the application. The application has two types of authentication: the first uses a password and the second uses a PIN. The PIN lock is included to make it easy to use for illiterate or aged people. The application also uses the Firebase real-time database to store the user credentials such as the username, authentication type and password or PIN [10]. The stored information can be updated or modified by the owner to ensure the privacy and security of the data.
Open the android application, then enter the username and choose the type of authentication, i.e. either password lock or PIN lock (pre-defined by the owner in the database). Once the login credentials are provided, the username is verified in the database; if the user credentials are valid, the next screen appears to either enter the password or the PIN according to the authentication type. For the PIN lock, the keypad to enter the PIN is shown, a four-digit PIN (set by the owner) is entered and the Bluetooth devices are paired automatically. For the password lock, click on CONNECT TO BLUETOOTH and select Door Lock, i.e. HC-05, from the list of paired Bluetooth devices that appears on the screen; then the user needs to enter the pre-defined username and password and click on LOGIN. The entered password or PIN is verified by retrieving the user credentials from the database over the Internet. The verification over the Internet is validated by capturing the network packets, i.e. the requests sent by the application to the database and the responses received by the application from the database, using the network packet analysing software Wireshark [11]. The block diagram of the proposed system is shown in Fig. 2.
Once the user is verified, the buttons (LOCK and UNLOCK) to control the door lock are enabled for the user on the screen. If the user clicks the UNLOCK button, the user is notified to disinfect the hands; simultaneously, the application sends a value via the Bluetooth module to the servo motor interfaced with the Arduino, the servo motor rotates with that value, and the lock is opened after a delay of 50 s, allowing the user to sanitize the hands with the sanitizer, which is pumped automatically by a servo motor when the user's hands are detected using the passive infrared (PIR) sensor. Likewise, if the user clicks the LOCK button, the application sends a value to the servo motor interfaced with the Arduino via the Bluetooth module; the servo motor rotates with that value and the door lock is closed.
If the user credentials are incorrect or if the user does not disinfect the hands, the buttons to lock or unlock the door are not enabled, a signal is sent to the buzzer that makes it ring, and an alert SMS is sent from the current user's phone number to the house owner's phone. A warning notification also pops up on the screen for the user. Figure 3 shows the flowchart, the step-by-step approach followed in writing the automated door security program, which enables the execution of commands from the developed android application.
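The decision flow just described can be summarized in a short sketch. The actual controller is an Arduino C++ sketch; the Python simulation below is purely illustrative, and every function and action name in it is hypothetical.

```python
# Hypothetical Python simulation of the decision flow in Fig. 3; the real
# controller is an Arduino C++ sketch, and all names here are illustrative.
def handle_request(credentials_valid, command, hand_detected_by_pir):
    if not credentials_valid:
        # Invalid login: alert the owner and keep the door locked.
        return ["ring_buzzer", "send_sms_alert", "keep_door_locked"]
    if command == "UNLOCK":
        if hand_detected_by_pir:
            # PIR sensed the hand: dispense sanitizer, then unlock the door servo.
            return ["dispense_sanitizer", "rotate_servo_unlock"]
        # Sanitizer skipped: remind the user and stay locked.
        return ["ring_buzzer", "keep_door_locked"]
    if command == "LOCK":
        return ["rotate_servo_lock"]
    return []

print(handle_request(True, "UNLOCK", True))
print(handle_request(False, "UNLOCK", True))
```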
4 Result
The verification over the Internet was validated by capturing the requests and responses sent and received between the application and the database using the Wireshark software during the live testing of the application, as shown in Figs. 8 and 9.
If the password was valid, the control buttons were enabled as shown in Fig. 10. When the UNLOCK button was pressed, the servo motor attached to the door lock rotated with the value received from the application via Bluetooth, and once the hands were disinfected, the door was unlocked.
In PIN authentication, the Bluetooth devices were connected automatically as in Fig. 11. If a valid PIN was entered, the control buttons were enabled as shown in Fig. 12. Similarly, the door lock was unlocked on clicking the UNLOCK button once the hands were sanitized.
The user credentials stored and maintained in the firebase database by the owner
are shown in Fig. 13.
When the UNLOCK button in the user application was pressed after database validation, the user was notified to sanitize the hands as shown in Fig. 14, the corresponding values were sent to the Arduino via Bluetooth and, once the hands were disinfected, the servo motor rotated to unlock the door lock as shown in Fig. 15. Similarly, when the LOCK button was pressed, the door lock was closed as in Fig. 16.
If the sanitizer was not used, the buzzer rings as a reminder for the user of hand
hygiene. If the user credentials were incorrect, the owner received an alert SMS along
with the generation of a buzzer ring, and the user was prompted to try again with a
notification in the application as shown in Fig. 17.
The operations performed by the Arduino based on the signal received from the
application via Bluetooth were displayed in the serial monitor of the Arduino IDE
as in Fig. 18.
5 Conclusion
In this paper, considering safety and security as the main objectives, a digital door lock with a sanitizing system is proposed. This system locks or unlocks the door when the user provides valid login credentials in the installed android application and uses the disinfectant by showing a hand in front of the sanitizer. If invalid credentials are provided in the application, an alarm is generated with an SMS alert and the door remains locked; if the user misses using the disinfectant, the buzzer rings as a reminder, which enhances the safety and security of the proposed method. The system is flexible and simple to install at a low cost, with no overhead like drafting and construction works.
6 Future Scope
Fig. 11 Automatic Bluetooth connection in pin authentication
References
1. Brow G, Raymond CA (2013) Door locking hand sanitizer system. In: Canadian patent applica-
tion. https://patentimages.storage.googleapis.com/54/3f/d1/9f1ecf1009a2f5/CA2776280A1.
pdf
2. Gupte NN, Shelar MR (2013) Smart door locking system. Int J Eng Res Technol 2(11):2214–
2217
3. Agbo David O, Chinaza M, Jotham O (2017) Design and implementation of a door locking
system using android app. Int J Sci Technol Res 6(8):198–203
4. Rathod K, Vatti R, Nandre M, Yenare S (2017) Smart door security using Arduino and Bluetooth application. Int J Curr Eng Sci Res 4(11):73–77
5. Satoskar R, Misrac A (2018) Smart door lock and lighting system using internet of things. Int J Comput Sci Inf Technol 9(5):132–135
6. Patil KA, Vittalkar N, Hiremath P, Murthy MA (2020) Smart door locking system using IoT. Int Res J Eng Technol (IRJET) 7(5):3090–3094
7. Al Mamun A, Hossain MA, Rahman Md.A, Abdullah Md.I, Hossain Md.S (2020) Smart home automation system using Arduino and Android application. J Comput Sci Eng Softw Test 6(2):8–12
8. Sohail S, Prawez S, Raina CK (2018) A digital door lock system for the internet of things with improved security and usability. Int J Adv Res Ideas Innov Technol 4(3):878–880
9. Bhute LK, Singh G, Singh A, Kansary V, Kale PR, Singh S (2017) Automatic door locking system using bluetooth module. Int J Res Appl Sci Eng Technol 5(5):1128–1131
10. Khawas C, Shah P (2018) Application of firebase in Android App development—a study. Int
J Comput Appl 179(46):49–53
11. Das R, Tuna G (2017) Packet tracing and analysis of network cameras with Wireshark. In: 5th
international symposium on digital forensic and security (ISDFS)
Aspect-Based Sentiment Analysis
in Hindi: Comparison of Machine/Deep
Learning Algorithms
Abstract With the evolving digital era, the amount of online data generated, such as product reviews in different languages via various social media platforms, has grown enormously. Analysing this information is very beneficial for many companies, such as online service providers. The task of interpreting and classifying the emotions behind a text (review) using text analysis techniques is known as sentiment analysis (SA). Sometimes, a sentence might have positive as well as negative polarity at the same time, giving rise to conflict situations where SA models might not be able to predict the polarity precisely. This problem can be solved using aspect-based sentiment analysis (ABSA), which identifies fine-grained opinion polarity towards a specific aspect associated with a given target. The aspect category helps us to understand the sentiment analysis problem better. ABSA on the Hindi benchmark dataset, having reviews from multiple web sources, is performed in this work. The proposed model has used two different word embedding algorithms, namely Word2Vec and fastText, for feature generation and various machine learning (ML) and deep learning (DL) models for classification. For the ABSA task, the LSTM model outperformed other ML and DL models with 57.93 and 52.32% accuracy, using features from Word2Vec and fastText, respectively. Mostly, the performance of the classification models with Word2Vec embedding was better than that of the models with fastText embedding.
1 Introduction
The volume of online data has increased tremendously in recent years, giving rise to various new opportunities as well as challenges in the field of research. Social media and e-commerce websites have systems for users' opinions or feedback about a service or a product. This information is valuable for brands to understand the sentiment and views of the customers about the product, and it improves the quality of the service. The prediction, analysis, and classification of the sentiment expressed in a review can be performed with the help of sentiment analysis (SA) and aspect-based sentiment analysis (ABSA).
Sentiment analysis (SA) is a way of finding the polarity based on the sentiment associated with the reviews of an overall text. It is also known as opinion mining or, sometimes, emotion AI. Further, unorganized reviews and comments are found on social media, and going through them manually remains inefficient and costly for analysers. SA allows companies to sift through this mass of data to achieve insights in a more organized form. SA becomes more accurate over time when combined with ML. It is analysed based on polarities, and the most commonly used polarities are positive and negative. SA works well when dealing with a single polarity, whereas in the case of ABSA, the review along with the aspect term adds extra information. Using aspect terms during prediction and classification makes the analysis better, and this aspect-dependent analysis for the prediction and classification of polarity may be termed aspect-based sentiment analysis (ABSA). The following review, “iska performance kafi acha hai lekin iski banavat aur keyboard ki quality ne zarur hame nirash kiya hai” (its performance is quite good, but its build and keyboard quality have certainly disappointed us), has positive and negative polarity, respectively, due to two different aspects (misc and hardware). For this example, ABSA can correctly predict and classify the polarity of the review sentence, which was actually of a conflict type.
A conflict type of polarity arises when the prediction is uncertain. For example, if one review has positive as well as negative polarity, then the model has to learn to predict it as a conflict label. In the above example, the review has positive as well as negative polarity, which may produce an uncertain polarity during classification. The problem can be solved by training the model with a large amount of data or by using some class-specific features while setting threshold limits for classification. Furthermore, a finer classification can be done, but as of now, in this work, it is categorized as a separate label called conflict. The demand for more accurate prediction or classification of the expressed sentiment has increased in the current scenario. Trackers can be set by companies on an influencer's social media account to see over time how their brands feature in the influencer's conversations or news feed and what the followers feel about them. It is also helpful for checking the impact and immediate reactions and for monitoring them carefully. ABSA can be applied to Indian languages like Hindi, Telugu, etc. (morphologically rich languages), but less work has been done in the Hindi language, and many of the existing works have transliterated or translated the data to the English language. ABSA enhances sentiment prediction by giving the extra information of the aspect term, which acts as a keyword for classification and prediction.
This paper has compared the performance of different word embedding models with various classifiers that perform the task of ABSA in Hindi. The polarity was predicted using the review as well as its aspect. Based on the reviews, a polarity label of the conflict type was also explored. For extracting features from the Hindi dataset, pre-trained Hindi word embedding algorithms were used. Word embedding algorithms convert words into dense vectors of much lower dimensionality while keeping the context and semantics in place. The similarity between two words can be identified by the distance between their vectors. Two different word embedding models, namely fastText and Word2Vec, were used in this work. Different classification models such as support vector machines (SVM) [1, 2] and random forest (RF) were used for classification.
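A minimal sketch of this feature-generation and classification pipeline is shown below. The embedding file path, the averaging of token vectors, the concatenation of review and aspect vectors, and the SVM settings are all assumptions for illustration, not the exact configuration used in the paper.

```python
import numpy as np
from gensim.models import KeyedVectors
from sklearn.svm import SVC

# Hypothetical path to pre-trained Hindi Word2Vec vectors (illustrative only).
kv = KeyedVectors.load_word2vec_format("hi_word2vec.txt")

def sentence_vector(tokens, kv):
    # Average the embeddings of in-vocabulary tokens; zeros if none are found.
    vecs = [kv[t] for t in tokens if t in kv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(kv.vector_size)

def absa_features(review_tokens, aspect_tokens, kv):
    # Concatenate review and aspect representations so the classifier also
    # sees which aspect the polarity refers to.
    return np.concatenate([sentence_vector(review_tokens, kv),
                           sentence_vector(aspect_tokens, kv)])

# X = np.vstack([absa_features(r, a, kv) for r, a in review_aspect_pairs])
# clf = SVC(kernel="rbf").fit(X, polarity_labels)   # positive/negative/neutral/conflict
```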
Based on the experiments done in this work, using Word2Vec together with SVM and Word2Vec along with long short-term memory (LSTM) achieved 59.34% accuracy for the SA (review-polarity) task and 57.93% accuracy for the ABSA (review-aspect-polarity) task. For the fastText pre-trained word embedding model, among the ML approaches, SVM performed better by achieving 51.37% accuracy for SA, whereas DL performed better with 52.32% accuracy for ABSA. All the mentioned Word2Vec accuracies were better than the benchmark accuracy of 54.05% for classification addressed in [3].
Section 2 describes a brief overview of the research done in the field of SA and the issues that were addressed. In Sect. 3, the dataset used in this work is discussed. In Sect. 4, an overview of the analysis process is given with the help of a flow diagram, and in Sect. 5 the experimental results as well as the observations obtained in our work are discussed. Finally, Sect. 6 concludes the work with future directions of our research.
2 Literature Review
An overview of the research done in the field of SA and ABSA is summarized as follows. Akhtar et al. [3] created an annotated benchmark Hindi SA dataset consisting of product reviews crawled from different online websites. They also used a conditional random field (CRF) for aspect term extraction, achieving an average F-measure of 41.07%, and the SVM algorithm for SA, with an accuracy of 54.05% [3]. This paper uses the same dataset as [3]. DL models like neural networks were used for classification by only one team in the SemEval-2014 task, and that team ranked below 15 among all the top-ranked teams within all the subtasks; the other top rankers of the task had not used DL for classification, though for other datasets some DL models were considered in [4]. Chen et al. proposed a transfer capsule
network (TransCap) model for ABSA [5]. The concept of the model is dynamic
routing between capsules in a capsule network and transfer learning framework to
transfer knowledge from document level to aspect level sentiment classification. In
this work, experiments were performed on two SemEval datasets demonstrating that
the TransCap model had performed better by a large margin to the state-of-the-art
methods [6]. Wang et.al proposed a DL-based aspect and sentiment prediction model
that outperformed the SVM model trained on the SemEval-2014 dataset, and also
proposed a new method in which a constituency parse tree is used to connect the sentiments
of the reviews with their corresponding aspects [7].
The UNITOR system participated in the SemEval-2014 competition and used SVM
with different kernel methods for the various tasks, as addressed by Castellucci et al. in [8].
SVMhmm was used for aspect term extraction by tackling the problem as a sequential
tagging task, where multiple SVM kernels are linearly combined to generalize several
kinds of linguistic information. A shared task on sentiment analysis of code-mixed data pairs
of Bengali-English (BN–EN) and Hindi-English (HI–EN) was conducted in 2017, and an
overview of the task was provided [9]. The best-performing team used character-level
n-grams and word-level features with an SVM classifier to obtain maximum macro-average
F-scores of 0.569 for the HI–EN and 0.526 for the BN–EN datasets, as addressed by
Patra et al. For DL models, Akhtar et al. proposed a long short-term memory
(LSTM) architecture built on top of bilingual word embeddings for
ABSA. The aim of the work was to reduce the data sparsity effect in resource-poor
languages. The model outperformed other state-of-the-art architectures in
two different setups, namely multilingual and cross-lingual [10]. A comprehensive
overview of DL architectures applied to ABSA was presented in [4]; around 40
methods were summarized and categorized by taking the DL architectures and tasks into
consideration. Santos et al. performed experiments using fastText word embeddings
with different ML and DL frameworks, and the results show that their proposed CNN model
outperformed the other ML and DL approaches [11]. The authors of [12] addressed the gap
between research and existing implementations of many popular algorithms, caused by the
inherent mathematical complexity of the inference algorithms, high computational demands,
and the lack of a "sandbox" environment that would enable practitioners to apply the methods
to real data for their specific problems. Their contribution is to fill the gap between academia
and ready-to-use software packages. Within an existing digital library, DML-CZ, the authors
demonstrated the practicability of their approach on a real-world scenario of computing document
similarities [12].
Table 1 lists the issues related to the state-of-the-art model that used the dataset of
[3], and it also describes the solutions to those problems.
3 Dataset Description
The dataset consists of 5417 reviews, 99 aspect terms, and four polarities [3]. The
polarity labels are positive, negative, neutral, and conflict. The conflict polarity label
helps the classification model work better by assigning uncertain or multiple predicted
polarities to the conflict label. The number of reviews per label is
presented in Table 2. Manual annotation was performed for 1000 reviews whose aspect labels
and category polarities were missing. The dataset was taken as such in the original Hindi
language, without transliterating or translating it.
The overview of the process is given in Fig. 1. Initially, preprocessing of the
dataset was performed: the dataset was cleaned and tokenized, stop words and white space
were removed, and the sequences were zero-padded. These preprocessed and tokenized reviews were
fed to a word embedding algorithm to convert the words into dense vector representations,
taking their context and sequence into account. These numeric representations
were passed to various ML and DL models for classification. The models
were evaluated using the statistical measures given in Sect. 4.2.
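A minimal sketch of this preprocessing step, assuming Keras utilities and a hand-picked stop-word list (both are illustrative choices, not the authors' exact pipeline):

```python
# Clean, tokenize, and zero-pad sample Hindi reviews before embedding.
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

reviews = ["यह फोन बहुत अच्छा है", "बैटरी खराब है"]   # sample reviews (placeholders)
stopwords = {"है", "यह"}                              # assumed stop-word list

# Split on white space and drop stop words (case folding is irrelevant for Devanagari).
tokens = [[w for w in r.split() if w not in stopwords] for r in reviews]

tok = Tokenizer()
tok.fit_on_texts([" ".join(t) for t in tokens])
seqs = tok.texts_to_sequences([" ".join(t) for t in tokens])

# Zero-pad all reviews to a common length before feeding the models.
X = pad_sequences(seqs, maxlen=20, padding="post")
print(X.shape)
```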
The state-of-the-art model for ABSA in Hindi was presented in [3], and the dataset used
there is explained in Sect. 3 of this paper; it implemented ML algorithms with N-gram
and other linguistic features. Here, the performance of various ML algorithms is further
investigated with word embeddings generated using Word2Vec as well as fastText
features. The Word2Vec and fastText algorithms can embed semantic information
in the word vectors, and it is necessary to check how the word embedding features
generated using algorithms like Word2Vec and fastText improve the classification.
The performance of DL algorithms like RNN, LSTM, and GRU for ABSA in Hindi
is also investigated, because these algorithms can capture the sequential relationships
among the words in a sentence and can generate a more meaningful representation of the sentences.

Fig. 1 Overview of the analysis process: Dataset → Preprocessing (tokenization, white space removal, stop word removal) → Word embeddings (Word2Vec and fastText Hindi word embeddings) → Tasks (SA: review as X and labels as Y; ABSA: review and aspect together as X and label as Y) → Classification models (ML: NB, DT, AB, KNN, RF, SVM; DL: RNN, LSTM, GRU) → Prediction → Evaluation metrics (accuracy, precision, recall, F1-score)
Various statistical measures were utilized in order to evaluate the performance of the
models.
Metrics like accuracy, precision, recall, and F1-score can be calculated using the true
positive (TP), true negative (TN), false positive (FP), and false negative (FN) counts.
TP denotes the quantity of positive samples that are correctly classified. TN
denotes the quantity of negative samples that are correctly classified. FP denotes
the quantity of negative samples that are misclassified. FN denotes the quantity of
positive samples that are misclassified.
The accuracy measure is the ratio of correctly classified samples to all the classified
samples.
The precision measure is the ratio of true positives to all the predicted
positives.
The recall measure is the ratio of true positives to the total number of actual positive
samples.
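In formula form (standard definitions, consistent with the description above; the F1-score is the harmonic mean of precision and recall):

$\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$, $\text{Precision} = \dfrac{TP}{TP + FP}$, $\text{Recall} = \dfrac{TP}{TP + FN}$, $\text{F1} = \dfrac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$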
The experiments consisted of four steps. First, the dataset was divided into train
data and test data in the ratio 0.75:0.25. In the second step, the features were
extracted from the word embeddings using the fastText and Word2Vec algorithms. For
the fastText algorithm, the model was trained directly using the Hindi pre-trained
model. For Word2Vec, the Hindi pre-trained model was further trained on the review dataset,
and the reviews were appended to the vocabulary to increase the number of data samples. In the
third step, the features extracted from these models were used by various ML
and DL algorithms for classifying the polarity of the review. The ML models used
for classification [13] were Naive Bayes (NB), decision tree (DT), AdaBoost (AB),
K-nearest neighbours (KNN), RF, and SVM, whereas the DL models used were
recurrent neural network (RNN), LSTM, and gated recurrent unit (GRU). Keras [15]
and Scikit-learn [14] were used for implementing the DL and ML algorithms, respectively.
The parameters for each classification algorithm were fixed by hyperparameter
tuning; the fixed hyperparameters are given in Tables 3 and 4, respectively. The
above-mentioned steps were repeated for the SA (review) and ABSA (review-aspect)
tasks. Finally, the classification results for the polarities were evaluated and compared
between SA and ABSA using the various ML and DL algorithms for both the fastText and
Word2Vec word embedding algorithms.
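As an illustrative sketch of steps 1 and 3 for the SA task (the feature matrix here is random placeholder data; in the actual experiments it would be the Word2Vec review vectors), with the SVM settings of Table 3:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(5417, 300))          # placeholder for Word2Vec review features
y = rng.integers(0, 4, size=5417)         # four polarity classes

# Step 1: 0.75/0.25 train-test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=100)

# Step 3: classify the polarity with the Table 3 SVM hyperparameters.
clf = SVC(C=100, kernel="rbf", degree=3)
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)
print(accuracy_score(y_te, pred), f1_score(y_te, pred, average="weighted"))
```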
In sentiment analysis, the polarities were classified based on the reviews using
the NB, DT, AB, KNN, RF, SVM, RNN, LSTM, and GRU algorithms; the results are tabulated
in Table 5. The table shows a comparison between the ML and DL algorithms using the
Word2Vec and fastText embeddings. SVM with Word2Vec word embeddings achieved
an accuracy of 59.34%, precision of 0.5888, recall of 0.5934, and F1-score of 0.5902,
and SVM with fastText word embeddings acquired an accuracy of 51.37%, precision
of 0.5214, recall of 0.5137, and F1-score of 0.5165; these models outperformed the other
ML models.
Table 3 Hyperparameters for ML algorithms

Algorithm   Parameter        Parameter value
NB          priors           None
            var_smoothing    1e-09
DT          random_state     100
            class_weight     None
            criterion        gini
            max_depth        None
            splitter         best
AB          random_state     100
KNN         n_neighbours     5
            weights          uniform
            metric           minkowski
RF          random_state     100
            class_weight     balanced
            criterion        entropy
            max_depth        10
            n_estimators     50
SVM         C                100
            kernel           rbf
            degree           3
            random_state     None
            class_weight     None
Table 4 Hyperparameters used for the DL models (RNN, LSTM, and GRU)

Parameter                  Parameter value
Units                      64
Batch size                 32
Epochs                     1000
Input dim                  300
Optimizer                  Adam (lr = 0.01)
Recurrent dropout          0.0
Recurrent activation       Sigmoid
Loss function              Categorical cross entropy
Dense layer activation     Linear
Output layer activation    Softmax
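A Keras sketch consistent with the hyperparameters of Table 4 (the number of dense units and the sequence length are assumptions, as the paper does not state them; dummy data is used only to show the calls):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.optimizers import Adam

num_classes, seq_len, emb_dim = 4, 20, 300      # input dim 300 as in Table 4

model = Sequential([
    LSTM(64, input_shape=(seq_len, emb_dim),    # 64 units
         recurrent_activation="sigmoid", recurrent_dropout=0.0),
    Dense(64, activation="linear"),             # dense layer, linear activation
    Dense(num_classes, activation="softmax"),   # output layer, softmax
])
model.compile(optimizer=Adam(learning_rate=0.01),
              loss="categorical_crossentropy", metrics=["accuracy"])

X = np.random.rand(100, seq_len, emb_dim)       # dummy embedded reviews
y = np.eye(num_classes)[np.random.randint(0, num_classes, 100)]
model.fit(X, y, batch_size=32, epochs=2)        # the paper trains for 1000 epochs
```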
In the case of the DL models, LSTM performed better than the other models and achieved
an accuracy of 55.79%, precision of 0.5523, recall of 0.5579, and F1-score of 0.5525
using Word2Vec embeddings. GRU with fastText word embeddings outperformed the other
models and achieved an accuracy of 51.07%, precision of 0.5018, recall of 0.5107, and
F1-score of 0.5047. Among all the experiments, it is observed that SVM with the Word2Vec
word embedding algorithm achieved the best results.
In aspect-based sentiment analysis, both the review and the aspect term are taken for
the classification of polarities using the different ML and DL classification algorithms.
The results of classification using both the fastText and Word2Vec embeddings are shown
in Table 6. KNN with Word2Vec word embeddings acquired 50.41% accuracy, precision of
0.5048, recall of 0.5041, and F1-score of 0.5036, and SVM with fastText embeddings
achieved an accuracy of 51.37%, precision of 0.5214, recall of 0.5137, and F1-score of
0.5165. LSTM with Word2Vec performed well, with 57.93% accuracy, precision of 0.5785,
recall of 0.5646, and F1-score of 0.5594, whereas both GRU and RNN with the fastText
embeddings achieved an accuracy of 52.10%, precision of 0.5133, recall of 0.5210, and
F1-score of 0.5103. Among all the results obtained, LSTM with the Word2Vec word embedding
algorithm achieved the best results (Table 6).
Table 7 shows the time taken by the classifiers with the various word embedding
algorithms to complete both training and testing. The time is reported in seconds.
6 Conclusion
In this work, performance comparisons were made between various ML and DL models
using the fastText and Word2Vec word embedding algorithms for sentiment analysis and
aspect-based sentiment analysis. For the comparison, different classification algorithms
were utilized. For sentiment analysis, SVM combined with Word2Vec gave the best
performance, with an accuracy of 59.34%, in comparison with the other ML and DL
algorithms. In the case of aspect-based sentiment analysis, LSTM combined with
Word2Vec outperformed the rest of the algorithms with an accuracy of 57.93%. An
increase in dataset size may improve the classification accuracy. In future work, for
the classification of polarity, DL models such as convolutional neural networks (CNN),
capsule networks, and transfer capsule networks can be taken into consideration.
References
1. Soman KP, Loganathan R, Ajay V (2009) Machine learning with SVM and other kernel
methods. PHI Learning Pvt. Ltd.
2. Vapnik V (2013) The nature of statistical learning theory. Springer Science & Business Media,
Berlin
3. Akhtar MS, Ekbal A, Bhattacharyya P (2016) Aspect based sentiment analysis in Hindi:
resource creation and evaluation. In: Proceedings of the tenth international conference on
language resources and evaluation (LREC’16)
4. Do HH et al (2019) Deep learning for aspect-based sentiment analysis: a comparative
review. Exp Syst Appl 118:272–299
5. Chen Z, Qian T (2019) Transfer capsule network for aspect level sentiment classification.
In: Proceedings of the 57th annual meeting of the association for computational linguistics
6. Sabour S, Frosst N, Hinton GE (2017) Dynamic routing between capsules. In: Advances in neural
information processing systems
7. Wang B, Liu M (2015) Deep learning for aspect-based sentiment analysis. Stanford University
Report
8. Castellucci G et al (2014) UNITOR: aspect based sentiment analysis with structured learning. In:
Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014)
9. Patra BG, Das D, Das A (2018) Sentiment analysis of code-mixed Indian languages: an overview
of SAIL_Code-Mixed Shared Task@ICON-2017. arXiv preprint arXiv:1803.06745
10. Akhtar MS et al (2018) Solving data sparsity for aspect based sentiment analysis using cross-
linguality and multi-linguality. In: Proceedings of the 2018 conference of the North American
chapter of the association for computational linguistics: human language technologies, vol 1
(Long Papers)
11. Santos I, Nedjah N, de Macedo Mourelle L (2017) Sentiment analysis using convolutional
neural network with fastText embeddings. In: 2017 IEEE Latin American conference on
computational intelligence (LA-CCI). IEEE
12. Rehurek R, Sojka P (2010) Software framework for topic modelling with large corpora. In:
Proceedings of the LREC 2010 workshop on new challenges for NLP frameworks
13. Premjith B et al (2019) Embedding linguistic features in word embedding for preposition sense
disambiguation in English–Malayalam machine translation context. Recent Adv Comput Intell
341–370
14. Pedregosa F et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–
2830
15. Chollet F (2015) Keras documentation. Keras.io
Application of Whale Optimization
Algorithm in DDOS Attack Detection
and Feature Reduction
1 Introduction
Mugunthan in [10]. Neural networks and SVM are explored to classify DDOS
attack traffic by Pandian and Smys [11].
Khundrakpam et al. [12] discussed network packet parameters such as HTTP
GET and POST requests for the detection of DDOS attacks. They used classifiers,
viz. Naïve Bayes, multinomial, multilayer perceptron, and random forest, to identify
the attacks with an accuracy of 93.67%. The intrusion detection system is a key
tool for network protection; for that purpose, a low-cost IDS model that also deals
with sparsely labeled data was proposed in [6]. This semi-supervised IDS uses fuzzy c-means
clustering and is tested with the NSL-KDD benchmark IDS dataset. A few works
that used optimization methods in the networking domain are [13, 14].
Narasimha et al. [15] discussed a recent method for abnormality detection by
using machine learning to protect the network and to identify attack patterns.
They used a Naïve Bayes classifier for finding the DDOS traffic.
For testing and training, the NSL-KDD dataset was used, and PCA was used for
feature extraction, with an accuracy of 92%. Varma et al. [16] detected network
attacks by implementing a rough-set-based filter method with ant colony optimization
(ACO). Table 1 lists a brief comparison of contemporary literature on DDOS attack
mitigation. WOA is employed for feature selection in [17], in combination with
simulated annealing. WOA is used to optimize the size and placement of the capacitors
in radial distribution networks [3]. A binary approach is followed for feature selection
using WOA in [1].
The drawbacks of the existing work on DDOS attack detection
using the CICDDOS2019 dataset are as follows. Since this is the latest dataset, there
are no works that have investigated identifying the highly relevant attributes of
the traffic features. The usefulness of a meta-heuristic search method like WOA in a
wrapper method has not been studied on this dataset, and a performance comparison of
important wrapper classifiers with WOA for feature selection has not been done. There is a
real need to experiment with dynamic global meta-heuristic optimization methods like WOA to
minimize and confirm the highly relevant and sufficient DDOS traffic features using
the CICDDOS2019 dataset. Minimal traffic attributes help to design wire-speed
DDOS detection systems that consume less computing and memory resources. This
paper presents the results of applying WOA [18] for feature reduction on the DDOS
attack dataset.
Recently, Mirjalili and Lewis came up with yet another nature-inspired meta-heuristic
global exploration optimization algorithm, WOA, mimicking the bubble-net
hunting behavior of humpback whales [19]. WOA comprises two hunting
mechanisms: the first is hunting the prey guided by the best (or a random) search
agent, and the second replicates the bubble-net attacking strategy. The
artificial whale optimization is described below.
After discovering the position of the prey, the humpback whales encircle it.
The whale optimization algorithm assumes that the current best candidate solution is
the prey, or is close to it; the remaining search agents then update their positions
toward this best-so-far search agent. The whale positioning is represented as
follows:
follows:
−
→
P(t D
+ 1) = P ∗ (t) − P. (1)
$\vec{B} = \left| \vec{S} \cdot \vec{P}^{*}(t) - \vec{P}(t) \right|$   (2)
Here, $\vec{P}(t+1)$ represents the updated position of the whale, and $\vec{P}^{*}(t)$ represents the position of the best solution obtained so far at iteration t. The coefficient vectors $\vec{p}$ and $\vec{S}$ are expressed as

$\vec{p} = 2 \cdot \vec{l} \cdot \vec{v} - \vec{l}$   (3)

$\vec{S} = 2 \cdot \vec{v}$   (4)
After calculating the distance between the prey and the whale, a spiral equation is created
between the position of the whale and that of the prey. The helix-shaped movement
of the whale is given as follows:
$\vec{P}(t+1) = \vec{B}^{*} \cdot e^{gh} \cdot \cos(2\pi h) + \vec{P}^{*}(t)$   (5)

$\vec{B}^{*} = \left| \vec{P}^{*}(t) - \vec{P}(t) \right|$   (6)
During optimization, this behavior is used to update the location of the whales. There
is a 50% chance of selecting either the encircling mechanism or the spiral model, which
is expressed as

$\vec{P}(t+1) = \begin{cases} \vec{P}^{*}(t) - \vec{p} \cdot \vec{B}, & \text{if } A < 0.5 \\ e^{gh} \cdot \cos(2\pi h) \cdot \vec{B}^{*} + \vec{P}^{*}(t), & \text{if } A \geq 0.5 \end{cases}$   (7)
where A is a random number in the range (0, 1).
Finding the prey is known as the exploration phase. The search agents look for their
prey using a random search depending on the position of each other. The mathematical
expression is as follows:
$\vec{P}(t+1) = \vec{P}_{rpv} - \vec{p} \cdot \vec{B}$   (8)

$\vec{B} = \left| \vec{S} \cdot \vec{P}_{rpv} - \vec{P} \right|$   (9)

where $\vec{P}_{rpv}$ is a random position vector.
The WOA wrapper algorithm is given in Algorithm 1. The fitness function is
defined in Eq. (10), where ∂ is the tuning parameter, acc is the accuracy of the
classifier wrapper, ω is the full feature dimensionality, and L is the dimensionality
of an agent’s solution.
$f(L) = \partial \times acc + (1 - \partial) \times \dfrac{\omega - L}{\omega}$   (10)
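As an illustration only (not the authors' implementation; the weight ∂ = 0.9, the cross-validation, and the random-forest settings are assumed choices), Eq. (10) can be evaluated for a binary whale as follows:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def fitness(mask, X, y, delta=0.9):
    """Eq. (10): reward wrapper accuracy, penalise long feature subsets."""
    idx = np.flatnonzero(mask)                # indices of the selected features
    if idx.size == 0:
        return 0.0                            # an empty subset is useless
    clf = RandomForestClassifier(random_state=0)
    acc = cross_val_score(clf, X[:, idx], y, cv=3).mean()
    omega, L = mask.size, idx.size            # full vs. selected dimensionality
    return delta * acc + (1 - delta) * (omega - L) / omega
```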
The process of feature selection by the whales acting as agents in the WOA is listed in
Algorithm 1. Here, a binary WOA is used, wherein each agent forms a solution. The
population of an agent at any given point of time is a set of 0s and 1s,
where a 0 depicts the absence of a feature and a 1 depicts that the particular feature
is present in the set. The count of 1s and their index values finally constitute the
solution. Initially, all the whale agents start with a population of randomly
selected features from the full set. The fitness measure is computed for every agent's
population, and the best one is recorded. The fitness of a whale's population depends
on the wrapper classifier's classification accuracy (the higher, the better) as well as the length
of the solution (the smaller, the better). In each iteration, the best set of features stands as
the local best, and at the end of all iterations the global best whale's population, i.e.,
the solution with the highest fitness measure, is taken as the final solution. In each iteration,
the swarm of whales moves according to Eqs. (1)-(9).
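A compact sketch of the loop just described (50 whales, 50 iterations): the continuous update follows Eqs. (1)-(7), the exploration step of Eqs. (8)-(9) is omitted for brevity, and the sigmoid binarization is in the spirit of the S-shaped binary WOA of [1]; all of these simplifications are ours, not the authors'. In practice, fitness would be the Eq. (10) wrapper fitness bound to the training data.

```python
import numpy as np

def binary_woa(fitness, n_whales=50, n_features=80, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.random((n_whales, n_features))       # continuous whale positions
    masks = (pos > 0.5).astype(int)                # agents' 0/1 populations
    fits = np.array([fitness(m) for m in masks])
    best_mask, best_fit = masks[fits.argmax()].copy(), fits.max()
    for t in range(iters):
        l = 2 - 2 * t / iters                      # decreases linearly from 2 to 0
        for i in range(n_whales):
            v = rng.random(n_features)
            p, S = 2 * l * v - l, 2 * v            # coefficient vectors, cf. Eqs. (3)-(4)
            A, h = rng.random(), rng.uniform(-1, 1)
            if A < 0.5:                            # encircling, Eqs. (1)-(2)
                B = np.abs(S * best_mask - pos[i])
                pos[i] = best_mask - p * B
            else:                                  # spiral move, Eqs. (5)-(6)
                B_star = np.abs(best_mask - pos[i])
                pos[i] = B_star * np.exp(h) * np.cos(2 * np.pi * h) + best_mask
        masks = (1 / (1 + np.exp(-pos)) > rng.random(pos.shape)).astype(int)
        fits = np.array([fitness(m) for m in masks])
        if fits.max() > best_fit:                  # record the global best whale
            best_mask, best_fit = masks[fits.argmax()].copy(), fits.max()
    return best_mask
```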
The CICDDOS2019 [6] dataset is used in this paper to test the proposed feature reduction
algorithm. This dataset carries malware and the latest common DDOS traffic, represented
as PCAP data. Realistic background traffic was used in the generation of the dataset:
the abstract behavior of 25 users, based on protocols such as HTTP, HTTPS, SSH, FTP,
and email protocols, was collected to construct the dataset. The dataset contains six
categories of DDOS attacks and has a total of about 60 lakh (6 million) samples;
however, 3890 samples are taken randomly in this work.
The dataset includes 80 features. The feature selection over the dataset is implemented
in Python, and the classifiers are implemented with scikit-learn. By applying the proposed
WOA algorithm in the wrapper method, eleven features were selected out of 80 by the random
forest wrapper classifier. The decision tree wrapper resulted in 12 attributes, the Naive Bayes
wrapper produced 16 attributes, and the multilayer perceptron (MLP) wrapper resulted in 19
attributes. Table 2 shows the results of the four different classifiers tested as wrappers with
WOA. Random forest gave the best accuracy of 99.94% after the reduction of features, whereas
the accuracy with the full feature set was recorded as 99.92%. Figure 1 is a graphical representation of
the outputs of the wrappers considered in this work. Table 3 is a comparison with
similar works that used the CICDDOS2019 dataset. All experiments are run with
50 whales and 50 iterations. Due to the randomness and stochastic nature of the
WOA algorithm and the wrapper classifiers, the different wrappers of Table 2 produced
solutions of different lengths. In the process of feature selection with WOA and a wrapper
classifier, the data are cleaned and the noise in the data is eliminated, and
hence an improvement in prediction accuracy can be observed.
The eleven features that are selected by the WOA random forest wrapper are listed
in Table 4.
4 Conclusion
One of the challenging threat vectors to be dealt with by IT security teams is
DDOS attacks. Real-time extraction of network traffic attributes and further analysis
using machine learning techniques is a promising way to deal with the problem.
There exists, however, a problem of the large dimensionality of the network traffic attributes that
must be processed in real time for DDOS attack detection, and efficient attribute selection
methods are required to deal with this dimensionality problem. Meta-heuristic search
methods inspired by nature are of great use when combined with classifiers for evaluation.
This paper proposed the use of WOA, which mimics the hunting behavior of humpback whales,
for selecting an appropriate, shorter-length attribute subset from the CICDDOS2019 DDOS
dataset. The results showcase nearly 100% accuracy with as few as eleven features
out of 80 attributes.
References
1. Hussien AG, Hassanien AE, Houssein EH, Bhattacharyya S, Mohamed A (2018) S-shaped
binary whale optimization algorithm for feature selection. Adv Intell Syst Comput 79–87
2. Yusof AR, Udzir NI, Selamat A, Hamdan H, Abdullah MT (2017) Adaptive feature selection
for denial of services (DoS) attack. In: 2017 IEEE conference on application, information and
network security (AINS), pp 81–84. Miri: IEEE
3. Prakash DB, Lakshminarayana C (2017) Optimal siting of capacitors in radial distribution
network using Whale Optimization Algorithm. Alexandria Eng J 56(4):499–509
4. Deka RK, Bhattacharyya DK, Kalita JK (2019) Active learning to detect DDoS attack using
ranked features. Comput Commun 145:203–222
5. Aamir M, Zaidi SMA (2019) Clustering based semi-supervised machine learning for DDoS
attack classification. J King Saud Univ Comput Inf Sci 1–11 (In Press)
6. Sharafaldin I, Lashkari AH, Hakak S, Ghorbani AA (2019) Developing realistic
distributed denial of service (DDoS) attack dataset and taxonomy. In: 2019 international
carnahan conference on security technology (ICCST), pp 1–8. Chennai: IEEE
7. Hoque N, Kashyap H, Bhattacharyya DK (2017) Real-time DDoS attack detection using FPGA.
Comput Commun 110:48–58
8. Wang Meng, Yiqin Lu, Qin Jiancheng (2019) A dynamic MLP-based DDoS attack detection
method using feature selection and feedback. Comput Secur 88:1–14
9. Asosheh A, Ramezani N (2008) A comprehensive taxonomy of DDoS attacks and defense
mechanism applying in a smart classification. WSEAS Trans Comput 7(4):281–290
10. Mugunthan SR (2019) Soft computing based autonomous low rate DDOS attack detection and
security for cloud computing. J Soc Clin Psychol 1(2):80–90
11. Pasumpon Pandian A, Smys S (2019) DDOS attack detection in telecommunication network
using machine learning 1(1):33–44
12. Johnson Singh K, De T (2015) An approach of DDOS attack detection using classifiers. In:
Emerging research in computing, ınformation, communication and applications. Springer, New
Delhi, pp 429–437
13. Raj JS, Basar A (2019) QoS optimization of energy efficient routing in IoT wireless sensor
networks 1(1):12–23
14. Haoxiang W (2019) Multi-objective optimization algorithm for power management in cognitive
radio networks. UCCT 1(2):97–109
15. Mallikarjunan KN, Bhuvaneshwaran A, Sundarakantham K, Mercy Shalinie S (2017) DDAM:
detecting DDoS attacks using machine learning approach. In: Computational intelligence:
theories, applications and future directions—volume I, advances in intelligent systems and
computing. Springer, Singapore, vol 798, pp 261–273
16. Ravi Kiran Varma P, Valli Kumari V, Srinivas KS (2016) Feature selection using relative fuzzy
entropy and colony optimization applied to real-time intrusion detection system. Proc Comput
Sci 85:503–510
17. Mafarja M, Mirjalili S (2017) Hybrid whale optimization algorithm with simulated annealing
for feature selection. Neurocomputing 260:302–312
18. Mohammed HM, Umar SU, Rashid TA (2019) A systematic and meta-analysis survey of whale
optimization algorithm. Comput Intell Neurosci 1–25
19. Mirjalili S, Lewis A (2016) The whale optimization algorithm. Adv Eng Softw 95:51–67
20. Hoque N, Kashyap H, Bhattacharyya D (2017) Real-time DDoS attack detection using FPGA.
Comput Commun 48–58
21. Elsayed MS, Le-Khac N-A, Dev S, Jurcut AD (2020) DDoSNet: a deep-learning model for
detecting network attacks. In: 21st IEEE international symposium on a world of wireless, mobile
and multimedia networks (IEEE WOWMOM 2020). Cork
Social Media Data Analysis: Twitter
Sentimental Analysis on Kerala Floods
Using R Language
Keywords Kerala Floods · Rescue · Flood relief fund · Donate · Rebuild Kerala ·
Emotions · Save · Lives
1 Introduction
Over the past many years, various new methods and algorithms have emerged, and with the
passage of time, technology has picked up a new and faster pace [1, 2]. This development
has changed the way individuals express their views, feelings, and opinions, and also
the platforms on which they do so. Nowadays, the use of social sites, blogs, and online
forums has come into play, and consequently they generate enormous amounts of data which,
when analyzed, can be helpful for business, innovation, civic issues, and so forth. Data
analysis is gaining immense momentum across the world [3–5]. One of its applications is
sentiment analysis, which is an area currently under research. Sentiment analysis uses the
power of data analysis to extract the feelings expressed through the words used by people
across various social platforms, comments, reviews, and so on.
Every day, a large number of Twitter users express their perspectives and feelings on
specific or unspecific topics as tweets. Analyzing unstructured data is in itself a difficult
undertaking, and extracting useful information from it is a major challenge. For doing so,
there is a need for powerful tools [5, 6] and technologies that can handle a huge number of
tweets and extract sentiment from them. Although there are various ways to do so, in this
paper the R language is used to perform the task. Sentiment analysis is a technique to
investigate whether a written text is in a positive, negative, or neutral state. Generally,
this text consists of the expressions of various users on social media, i.e., Twitter [7–9].
2 Analysis Architecture
Figure 1 shows the steps of the sentiment analysis process and how the
functionality works, as follows:
(a) Authentication of the selected application with Twitter:
We have to connect to the Twitter API using the login credentials of a Twitter
developer application. This authentication is essential to interface R Studio with Twitter
for extracting the tweets. Once the authentication is finished, we continue to the further
steps.
(b) Required packages:
Different R packages are required to be installed to perform the various
analyses of tweets. The selected packages consist of various functions that will
analyze the tweets.
(c) Tweet extraction:
The data is gathered from the tweets on a trending topic using the hash
tag "#".
(d) Pre-processing of data:
This step is done by removing unwanted expressions and words.
(e) Data modeling and transformation:
After pre-processing and cleaning the data, it is formatted into a better structure
to extract the sentiments from the tweets.
(f) Sentiment analysis:
The analysis of sentiments is done here.
(g) Graphical view:
This is the last step, where the conclusions are plotted and visualized through
graphs and a word cloud.
R is a free, open-source programming language essentially used by analysts; with it,
deep data analysis can be carried out to study different statistical data such as
posts, reviews, and so forth. There are several steps to be followed for analyzing the
sentiments. The steps are mentioned below:
Step 1:
Create a Twitter developer account and log in to use the tweets in R Studio.
Take the "API key," "API secret key," "Access token," and "Access token secret" to
perform the handshake with the R console.
Step 2:
For the sentiment analysis, the required packages are to be downloaded. A few packages
used are:
twitteR—this creates the interface to the Twitter Web API.
httr—handles requests and works with URLs.
Step 3:
Once the authentication is done, the tweets can be extracted by means of a hash
tag.
Step 4:
Data cleaning is an important process for every dataset. There
may be inadequate tweets, i.e., unnecessary data, in the Twitter data, so it is necessary
to clean the data to get good results.
Step 5:
The cleaned data is arranged into a data frame and a matrix representation so as to
perform the operations.
Step 6:
From the collection of tokenized words, a set of feelings is extracted and the analysis
is done to observe the sentiments. In this way, the task is done.
Tweets:
The tweets analyzed relate to the very recent natural calamity of the Kerala Floods. Due to
heavy rainfall, Kerala, a state in south India, was largely affected by severe floods. A large
number of people died and about a million people were affected. It was one of the worst
floods of the century, so people expressed their emotions through Twitter by tweeting
about its effect on Kerala.
Tweets Extraction:
Tweet extraction, in straightforward words, means gathering data for analysis; here, it
refers to the collection of tweets. The Twitter search API is queried, and it returns the
tweets that match the given string and writes them into an object. A total of 3000 tweets
were gathered on "#keralafloods".
The information gathered is not pure. It contains hash tags, URLs, abbreviations,
punctuation, stop words, and so forth. To get better results and good
information, the tweets must be cleaned properly. Libraries such as the tm
package and stringr package are used for data cleaning
and mining [10, 11]. The corpus must not contain elements such as retweets, links,
@-mentions, punctuation, and other symbols that do not express any sentiment.
The cleaned corpus is used for the further analysis [3, 4].
Various functions are used to remove unwanted strings from
the tweets, such as
• removePunctuation()—to remove punctuation marks.
• removeNumbers()—to remove numbers, as numbers are not important for analyzing
emotions in sentences.
• tolower()—to convert the whole text to lower case.
The above figure gives a graphical view of the words that are used most frequently
in the tweets [1, 2]. It is observed that the word "kerala" is used many
times and has the highest frequency compared with other words such as "floods" and
"accident".
Nowadays, hashtags are widely used on many social networking sites for trending
news and various tweets or messages. A hashtag is used to pass the idea or message of a
person to everyone who uses that hashtag, and it represents one specific keyword. Some of
the hashtags present in Fig. 3 are #KeralaFloods, #KeralaFloodReliefSJM, #KeralaReliefFund,
and #ReBuildKerala. These hashtags represent people's views and show the deep insights of
the hashtag creators.
Figure 4 shows that the words carrying sentiment are rebuild, happens, livelihoods, lost,
help, tomorrow, munnar, witness, etc. The major emotion for the word "lost" is sadness and
for "help" is trust, and for both words the sentiment is positive. Some words like
"keralafloods" can be taken in both a negative and a positive sense.
5 Sentimental Analysis
This paper mainly focuses on the analysis of various users' emotions about the Kerala
Floods. The experiments consider eight emotions and two sentiments, positive and negative
[1, 2]. To represent these emotions visually, a bar graph is used to show the various
sentiments of the tweets. The positive words show the highest spike, with words like help
and rebuild being used. The next bars in the graph represent words such as save and help,
which were used by different kinds of people trying to save the victims of the floods. From
the syuzhet package, the NRC sentiment method is used; it compares all the tokenized words
with the NRC word-emotion lexicon (EmoLex), which consists of a large number of words with
a variety of emotions. If a word matches an entry among the pre-listed emotions, the count
for the corresponding emotion is incremented by one. By adding up all of these values,
the total emotion and sentiment can be computed.
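The paper performs this step with the syuzhet package in R; purely to illustrate the counting logic described above, here is an analogous sketch in Python with a tiny made-up lexicon (the real NRC EmoLex is far larger):

```python
from collections import Counter

emolex = {                                   # hypothetical mini word-emotion lexicon
    "help":    ["trust", "positive"],
    "lost":    ["sadness", "negative"],
    "rebuild": ["anticipation", "positive"],
    "save":    ["trust", "positive"],
}

def nrc_counts(tokens):
    counts = Counter()
    for w in tokens:
        for emotion in emolex.get(w, []):    # each match increments its emotion by one
            counts[emotion] += 1
    return counts                            # summing these gives total emotion/sentiment

print(nrc_counts("please help to rebuild kerala after lives lost".split()))
```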
In Fig. 5, the bar graph is plotted based on the variety of emotions present in the
sentiment analysis. Positive words are the most common, as many people express trust.
For the negative words, the bars in the graph are low because of the situation.
Limitations:
• The emotions that are expressed through emojis cannot be retrieved by the present
sentiment analysis tools.
• In the sentiment dictionary, various local language words are not defined.
• It is very difficult to find the sentiments of mixed-language words and of
transliterated words.
• The number of words used to compare with the matched content in the given dataset is
limited to 3000.
• Only recent tweets, which are in text format only, are taken into consideration.
Solution:
• Expansion of the word net to various languages, which would make analyzing the
sentiments easier.
• Developing tools or algorithm which can determine the context of humor or
sarcasm can improve analysis further.
6 Conclusion
In this paper, various tweets on #KERALAFLOODS are analyzed, showing that many people
supported Kerala. Sentiment analysis is carried out on the tweets given by various users,
and this shows the users' emotions about #KERALAFLOODS. Positive words show gratitude and
sympathy toward the victims of the floods, reflecting stronger sentiments about the victims.
The bar graph representation shows the positive and negative tweets analyzed by the proposed
system. With this analysis, people may raise funds for the floods. The analysis shows
people's sentiments along with the development of technology, and it helps in carrying out
research in text mining.
References
1. Sharma V, Agarwal A (2016) Suppositions mining and classification of music lyrics utilizing
SentiWordNet. In: symposium on colossal data analysis and networking
2. Keka I, Çiço B (2015) Factual treatment for trend detection and analyzing of electrical load
using programming language R. In: Fourth Mediterranean conference on embedded computing,
2015
3. Katamaneni M, Cheerala A (2014) Wordnet based document clustering. Int J Sci Res (IJSR)
3(5)
4. Madhavi K, Anush Chaitanya K, Percolate M Supremacy user walls by using Pfw. Int J Sci
Eng Adv Technol
5. Turney PD (2002) Thumbs up or thumbs down?: semantic orientation applied to unsuper-
vised classification of reviews. In: proceedings of the 40th annual meeting on association for
computational linguistics, pp 417–424
6. Pak A, Paroubek P (2010) Twitter as a corpus for sentiment analysis and opinion mining. In:
proceedings of the seventh conference on international language resources and evaluation, pp
1320–1326
7. Hussein DMEDM (2016) J King Saud Univ Eng Sci. Available: http://dx.doi.org/10.1016/j.jks
ues.2016.04.002
8. Kowcika A, Guptha A (2013) Sentiment analysis for social media. Int J Adv Res Comput Sci
Softw Eng 3(7):216–221
9. Vinodini G, Chandrashekaran RM (2012) Sentiment analysis and opinion mining: a survey. Int
J Adv Res Comput Sci Softw Eng 2(6):283–294
10. Liang PW, Dai BR (2013) Opinion mining on social media data. In: IEEE 14th international
conference on mobile data management, pp 91–96, ISBN 978-1-494673-6068-5
11. Thet TT, Na IC, Khoo CS, Shakthikumar S (2009) Sentiment analysis of movie reviews on
discussion boards using a linguistic approach. In: proceedings of the 1st international CIKM
workshop on topic-sentiment analysis for mass opinion, pp 81–84
12. Pang B, Lee L (2004) A sentimental education: sentiment analysis using subjectivity
summarization based on minimum cuts. In: Proceedings of the 42nd annual meeting of the
association for computational linguistics (ACL)
Intrusion Detection Using Deep Learning
1 Introduction
data, such that observations or assumptions can be made without the function being
programmed directly. Studies of machine learning are focused in the fields of math,
computer science, and engineering and can provide solutions for many disciplines.
The two primary methods of AI are supervised learning and unsupervised learning,
where the use of a completely labeled dataset enables supervised learning. By
comparison, unsupervised learning occurs by the use of a totally unlabeled dataset
[1]. In supervised learning, the model receives a dataset that includes feature
vectors and labels, which are the corresponding outputs of the feature vectors; the model
thus learns to generate correct outputs for new inputs. Classification
and regression are the most common forms of supervised learning [2].
In unsupervised learning, no supervisor supplies the labels, i.e., the
correct results of the corresponding inputs, to train the models. So the model only has
input values and keeps track of the effects of its operation. In other words, unsupervised
learning is a process that uncovers hidden patterns from input data. Clustering and
dimensionality reduction are two traditional unsupervised learning methods.
The underlying deep learning model architecture consists of an input layer followed by a
number of hidden layers, which in turn feed into the output layer. Convolutional neural
network (CNN) is a deep learning mechanism used primarily in computer image processing and
language processing. Without any preprocessing, a raw image is fed directly to the CNN model;
it then extracts the features through convolution operations.
Recurrent neural network (RNN) is another form of deep learning model
that has made progress in areas such as NLP and text processing.
The long short-term memory (LSTM) structure is a development based on the RNN
structure that enables learning patterns in sequences. Autoencoders are a type of
artificial neural network used in an unsupervised way to learn efficient data codings.
This paper explores these strategies to explain the problems and possible potential
of various methods of deep learning (Fig. 1).
Here, the essential kinds of deep learning architectures, which include autoencoders,
convolutional neural networks, long short-term memory, and recurrent neural
networks, are examined. Of these, LSTM and CNN are two of the major and
most commonly used methodologies.
The document is organized as follows: Section 1 describes the deep learning techniques.
Section 2 reviews the work done in deep learning by presenting a literature
review. Section 3 presents a summary of the examined papers. Section 4 presents
the proposed work. Section 5 presents the results and discussions. Section 6 gives the
conclusion.
LSTM is an artificial recurrent neural network model used in deep learning.
Unlike standard feed-forward neural networks, LSTM has feedback connections. It can
process not only single data points but also whole data sequences. A typical
LSTM unit comprises a cell, an input gate, an output gate, and a
forget gate [5]. The cell remembers values over arbitrary time intervals, and the
three gates regulate the flow of information into and out of the cell.
LSTM networks are used to classify, process, and make predictions based on time-series
data, because there may be lags of unknown duration between important events in a time
series. LSTM applications include robot control, time-series prediction, speech recognition,
rhythm processing, grammar processing, and recognition of human behavior (Fig. 3).
2 Literature Survey
Deep learning is a mainstream research area among scientists. This section presents
a literature review of the work done in this field.
A lot of work has been carried out with the assistance of deep
learning strategies. Chen et al. [6] explored a deep learning approach for assisting
collaborative editing on Q&A sites. The main idea is to help inexperienced
editors edit posts across a wide variety of subjects and to encourage the community to edit
sentences. This demonstrates the practicality of training a deep learning model on
community post edits and then using the trained model to support community
post editing.
Lu et al. [7] examined security aspects by sniffing smartwatch passwords with a deep
learning approach called Snoopy. Snoopy uses a uniform framework to separate the
segments of motion data in which passwords are entered and employs deep
neural networks to infer the actual passwords. The system
can effectively spy on motion data in the background while passwords are being entered.
Without consuming significant power or computational resources, it can successfully
extract password segments from smartwatch motion data in real time.
Pouladzadeh et al. [8] presented an app that uses an image of food taken with the
user's mobile phone to recognize the different food items in a dish and to estimate
its calorie and nutrition content. The user is asked to quickly mark the broad region of
the food by drawing a bounding circle on the food image by touching the screen. The
framework then uses image processing and statistical learning for food item
recognition.
Shone et al. [9] addressed network intrusion detection systems, which play a crucial role
in defending computer systems. The paper presents a new deep learning technique for
intrusion detection, addressing the increasing levels of human interaction required and
the decreasing levels of detection accuracy. As future work, the main avenue for
development is to evaluate and extend the model's ability to handle zero-day attacks, and
then to extend the existing evaluations by using real backbone network traffic to
demonstrate the benefits of the extended model.
Liu et al. [10] presented a mobile application, Third-Eye, that can turn mobile
phones into high-quality PM2.5 monitors, allowing a crowd-sensing approach to
monitoring fine-grained PM2.5 in a city. They use two deep learning models, a
convolutional neural network for images and a long short-term memory network
for weather and air pollution data, to build an end-to-end
PM2.5 inference model training framework. Future work is to build a worldwide
version so that more users can monitor air quality conveniently and protect
their health.
Kang et al. [11] presented the design of an emergency alert framework based on
deep learning. It is deemed appropriate for use with existing infrastructure, for
example, closed-circuit TV and other monitoring equipment. Experiments were conducted
on car accidents and natural disasters such as fire, and effective results were obtained
for emergencies.
Roopak et al. [12] examined different deep learning models for cybersecurity
in Internet of Things (IoT) frameworks. The deep learning models
are evaluated using the CICIDS2017 dataset, which contains benign traffic and the
most up-to-date common attacks, resembling true real-world data for identifying DDoS
attacks. For future work, the use of IDS based on deep learning classification could be
tried on a fog-to-node design using distributed parallel processing.
Chandran et al. [13] introduced a unique use of a deep learning system
for recognizing reported missing children from the photographs of a huge
number of children, with the assistance of face recognition. The general
public can upload images of suspected children to a common portal with
landmarks and remarks. The photograph is automatically compared with the registered
image of the lost child from the repository. A convolutional neural network, which is
a highly effective deep learning technique for image-based tasks, is
used for face recognition.
Yahyaoui et al. [14] introduced a decision support system for diabetes prediction
based on conventional machine learning strategies and deep learning
approaches. For conventional machine learning, they used a support vector machine
and a random forest; for deep learning, they used a fully
convolutional neural network to predict and identify diabetic patients. The dataset
used for this framework was the PIMA Indians Diabetes database. Future
work is to improve the feature extraction step by applying an automatic deep
feature extraction approach and to obtain a better fitting model to improve the
prediction accuracy.
Jaradat et al. [15] introduced a way to detect victims trapped in burning
sites. The work recognizes victims in fire situations using a
convolutional neural network model. The goal is to classify input pictures sent from
the burning site into one of three classes: people, pets, or no victims. Future work
is to assess the risk level associated with each recognized victim and to determine the
best way to reach those in harm's way. By defining a reasonable scoring
standard for the threat level of each detected individual, firefighters can prioritize
their tasks during fire conditions.
Sathesh et al. [16] presented an improved soft computing approach to identify the
intrusions that cause security issues in social networks. The proposed strategy of
the paper employs an enhanced soft computing method that combines fuzzy
logic, decision trees, K-means-EM, and AI for preprocessing, feature reduction,
clustering, and classification, respectively, to build a security approach that is
more effective than conventional algorithms in recognizing misuse in social
networks.
Raj et al. [17] presented an investigation of computational intelligence methodologies,
as they appear to be a reasonable choice for artificial intelligence, overcoming its
errors and drawbacks. Evaluating the different computational methodologies
to find the ideal one for identifying illegitimate access will be a future
direction.
See Table 1.
Table 1 (continued)

Year/paper: 2019 [12]
Algorithm used: Internet of Things + DDoS + deep learning + CNN + LSTM + RNN
Dataset: CICIDS2017
Accuracy: 97.16%
Future work: The usage of IDS based on deep learning classification could be tried for the fog-to-node design using distributed parallel processing; also develop a deep learning model which could work on the unbalanced dataset

Year/paper: 2019 [14]
Algorithm used: Machine learning + deep learning + support vector machines + random forest + CNN
Accuracy: SVM 65.38%, RF 83.67%, CNN 76.81%
Future work: Improve the feature extraction step by applying an automatic deep feature extraction approach

Year/paper: 2020 [15]
Algorithm used: CNN + image processing
Accuracy: One-step CNN 96.3%, two-step cascaded CNN 94.6%
Future work: To evaluate the risk level related to each recognized person and characterize the best way to contact people in danger, assigning a scoring standard for the threat level of each person
4 Proposed Work
In this paper, for the security of IoT systems against cyberattacks, deep learning
models such as CNN, RNN, and LSTM are implemented and compared with machine
learning algorithms. DDoS attacks have affected numerous IoT systems and have
resulted in huge losses, and deep learning models provide higher precision than
machine learning algorithms for attack detection. IDS are an effective
technique for detecting cyberattacks in any network. A fog-to-node computing design
is utilized for the implementation of IoT systems. The dataset contains a training
set in which each record is labeled as benign or attack, and a test set on which the IDS
model is tested with the deep learning models. The dataset utilized for this work
is CICIDS2017, which contains benign traffic and the most up-to-date common attacks, which
resemble true real-world data. The dataset was gathered over five consecutive
days and covers a wide range of cyberattacks along with normal data. As shown in Fig. 4,
first, the data are divided into training and testing parts, where 70% of the data are used
for training and the remaining 30% for testing. In the next step, the data are labeled by
assigning benign as 0 and Web attack as 1. The data are then normalized and used to train
the deep learning models; in the testing part, the data are also normalized, and
in the next step, the IDS model is used to detect attacks in the system.
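A hedged sketch of this preparation (the file and column names are placeholders; the real CICIDS2017 CSVs use several attack-specific label strings):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("cicids2017.csv")                 # placeholder file name
y = (df["Label"] != "BENIGN").astype(int)          # benign -> 0, attack -> 1
X = df.drop(columns=["Label"]).select_dtypes("number").fillna(0)

# 70% of the data for training and the remaining 30% for testing.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

# Normalize features before they are fed to the deep learning models.
scaler = MinMaxScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)
```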
For conducting the proposed work, the most recent DDoS attack dataset,
CICIDS2017, has been utilized. The CICIDS2017 dataset contains exceptionally realistic
network traffic data. It was collected over five consecutive days with distinct
cyberattacks alongside normal data, and it contains the latest network traffic with and
without attacks, which is close to real-world network data. The main objective is to
implement a deep learning method with higher accuracy in cybersecurity and to compare the
accuracy with the existing methods. The proposed deep learning model is an altered
CNN-based deep learning algorithm in which multiple layers of convolution and max
pooling are applied to improve the accuracy. There is a convolution layer followed by a
max-pooling layer, after which a dropout layer is applied. The dropout layers are
incorporated to keep the model from overfitting. The output of the dropout layer is fed to a
flatten layer, which then provides input to a dense layer, followed by a second dropout
layer. The output of this dropout layer is fed to a second dense layer; sigmoid, ReLU, and
softmax activation functions are used. The proposed deep learning model is well suited when
less computation is required, as it has fewer parameters. Here, the accuracy of the model
is 99.10%. The results of the proposed deep learning model are shown in Fig. 5; the
accuracy varies as the number of epochs increases.
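A sketch of the described stack (convolution, max pooling, dropout, flatten, dense, dropout, dense); the filter counts, kernel size, dropout rates, and feature count are assumptions, since the paper does not list them, and a sigmoid output is used here because the labels are binary (benign vs. attack):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Dropout, Flatten, Dense
from tensorflow.keras.metrics import Precision, Recall

n_features = 78                                  # assumed number of flow features

model = Sequential([
    Conv1D(64, 3, activation="relu", input_shape=(n_features, 1)),
    MaxPooling1D(2),
    Dropout(0.25),                               # dropout to limit overfitting
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.25),
    Dense(1, activation="sigmoid"),              # benign (0) vs. attack (1)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", Precision(), Recall()])
model.summary()
```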
Fig. 5 Epochs versus accuracy and epochs versus loss curve of the proposed deep learning model
Table 2 Performance metrics evaluation

Model                          Precision   Recall    Accuracy
CNN model                      94.33%      97.62%    98.32%
Proposed deep learning model   96.54%      98.44%    99.10%
As the number of epochs increases, the testing accuracy varies; it is not steady and
keeps fluctuating. The loss versus epoch curve shows the variation in loss as the
number of epochs increases. Here, loss is the loss experienced by the proposed model,
epoch is the number of training rounds the model goes through, and accuracy is the
precision the model achieves. These parameters are essential for quantifying intrusion
detection accuracy, as they help to determine the productivity and effectiveness of the
model.
The performance metrics evaluation is shown in Table 2. The CNN model, which is the base
model, has a precision of 94.33%, recall of 97.62%, and accuracy of 98.32%, while the
proposed deep learning model has a precision of 96.54%, recall of 98.44%, and accuracy of
99.10%.
For the calculation, standard metrics, namely accuracy, recall, and precision, are
used to compute the values:
$\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$   (1)

$\text{Precision} = \dfrac{TP}{TP + FP}$   (2)

$\text{Recall} = \dfrac{TP}{TP + FN}$   (3)
6 Conclusion
Deep learning is indeed a rapidly growing application of AI. The rapid adoption of deep
learning technology in various fields clearly shows its success and adaptability. This
study gives an idea of the techniques associated with deep learning. Moreover, a
comparative examination of the techniques used for
References
1. Maloof MA (2006) Machine learning and data mining for computer safety: techniques and
applications. Springer, Berlin
2. Alpaydin E (2014) Introduction to machine learning. MIT Press, Cambridge
3. Hinton GE (2009) Deep belief networks. Scholarpedia 4(5):5947. https://www.scholarpedia.org/article/Deep_belief_networks
4. Dan C, Meier U, Masci J, Gambardella LM, Schmidhuber J (2011) Flexible, high-performance
convolutional neural networks for image classification. In: Proceedings of the twenty-second
international joint conference on artificial intelligence 2:1237–1242
5. Sak H, Senior A, Beaufays F (2014) Long short-term memory recurrent neural network
architectures for large scale acoustic modeling
6. Chen C, Xing Z (2016) Mining technology landscape from stack overflow. In: Proceed-
ings of the 10th ACM/IEEE international symposium on empirical software engineering and
measurement. ACM, p 14
7. Harbach M, Luca AD, Egelman S (2016) The anatomy of smartphone unlocking: a field study
of android lockscreens. In: ACM conference on human factors in computing systems, CHI
8. Pouladzadeh P, Kuhad P, Peddi SVB, Yassine A, Shirmohammadi S (2016) Calorie measure-
ment and food classification using deep learning neural network. In: Proceedings of the IEEE
international conference on instrumentation and measurement technology
9. Dong B, Wang X (2016) Comparison deep learning method to traditional methods using
for network intrusion detection. In: Proceedings of 8th IEEE international conference
communication software and networks, pp 581–585
10. Al-Ali AR, Zualkernan I, Aloul F (2010) A mobile GPRS-sensors array for air deterioration
control. IEEE Sens J 10:1666–1671
11. Baek MS, Lee YH, Kim G, Park SR, Lee YT (2013) Development of T-DMB emergency
broadcasting system and trial service with the legacy receivers. IEEE Trans Consum Electron
59:38–44
Intrusion Detection Using Deep Learning 125
12. Chadd A (2018) DDoS attacks: past, present and future. Netw Secur 2018:13–15
13. Satle R, Poojary V, Abraham J, Wakode S (2016) Missing child identification using face
recognition system. Int J Adv Eng New Technol 3(1)
14. Punthakee Z, Goldenberg R, Katz P (2018) Definition, classification, and diagnosis of diabetes,
prediabetes and metabolic syndrome. Can J Diabetes 42:S10–S15
15. Pinales A, Valles D (2018) Autonomous embedded system vehicle design on environmental,
mapping and human detection data acquisition for firefighting situations. In: IEEE 9th
annual information technology, electronics and mobile communication conference (IEMCON),
Vancouver, BC, Canada
16. Sathesh A (2019) Enhanced soft computing approaches for intrusion detection schemes in
social media networks. J Soft Comput Paradigm (JSCP) 1:69–79
17. Raj JS (2019) A comprehensive survey on the computational intelligence techniques and its
applications. J ISMAC 1(3):147–159
Secure Trust-Based Group Key
Generation Algorithm for Heterogeneous
Mobile Wireless Sensor Networks
S. Sabena
Department of Computer Science Engineering, Anna University Regional Campus, Tirunelveli,
Tamil Nadu, India
e-mail: sabenazulficker@gmail.com
C. Sureshkumar
Faculty of Information and Communication Engineering, Anna University, Chennai, Tamil Nadu,
India
e-mail: msa.suresh@gmail.com
L. Sai Ramesh (B)
Department of Information Science and Technology, Anna University, Chennai, Tamil Nadu, India
e-mail: sairamesh.ist@gmail.com
A. Ayyasamy
Department of Computer Engineering, Government Polytechnic College, Nagercoil, Tamil Nadu,
India
e-mail: samy7771@yahoo.co.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 127
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_11
1 Introduction
Group keys are generated for the existing nodes and for nodes that may join the
group later as intermediates. The view of the key distribution is shown in Fig. 2. A system
is formed with a group of N nodes, and there are N-1 autonomous broadcasting groups
available for data transmission. The mobile nodes do not have any predefined shared
key or secret data other than the protocol itself. The broadcasting nodes are not assumed
to have any additional capacity to generate other kinds of keys, and no pre-requisites
are shared at either the physical or the MAC layer. It is assumed that packet losses in
the wireless network, within the group of nodes or at the eavesdroppers, are independent.
There are M attackers with the same capacity as the N nodes, able to listen to any of the
N-1 groups. The location of the eavesdroppers is not known to any node in the network,
and the eavesdroppers can change channels at the same rate as the end users attempting
to generate the shared secret information. Figure 2 demonstrates the group key generation
for the mobile nodes in the heterogeneous mobile wireless sensor network.
3 Proposed Work
Three nodes A, B, and C wish to establish secure transmission among themselves
and produce a secret group key. There is another node E that eavesdrops on their transmissions.
Phase 1 consists of three states, each containing S rounds. In the first state, A and
B broadcast while C receives: Node A broadcasts on network NA for S rounds, and Node
B broadcasts on network NB for S rounds. The values transmitted by A and B in each of
the S rounds are generated randomly. A's broadcast over the S rounds is denoted by
KeyAC.
[Table: aggregation and information held at each node, expressed as the key pairs (KeyAC, KeyAB), (KeyBC, KeyBA), (KeyCB, KeyCA) and the corresponding half-keys KeyAC/2, KeyBC/2, KeyAB/2, KeyCB/2, KeyBA/2 and KeyCA/2.]
The half-keys KeyAC/2, KeyCA/2 and KeyAB/2, KeyBA/2 are used to encrypt and broadcast
SeA to B and C, respectively.
The RM is relatively quicker and more secure than the other security methods considered
for producing the secret key values. To improve security, the secret key calculation using
the BT is carried out as shown in Fig. 3. Before evaluating the group key, the presence
of malicious nodes is identified and the key generation is modified accordingly.
The devices (D1 , D2 , . . . , D8 ) are situated at the child nodes of the BT. Every key node
must know the value of each of its parents' secret keys; it is impossible for a device to
join the BT without knowing the secret key value of its parent node. For example, the
SeK12 of Node14 can be calculated by knowing the secret key values of the parent nodes
SeK7 , SeK3 and SeK1 . Every device keeps its secret key value in an integrated circuit.
Every device generates a session key for secure transmission within the group. The
shared secret key values from the source to the destination are calculated using the
secret key values of the nodes.
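The text does not specify how a device combines its ancestors' secret keys, so the following sketch only illustrates the idea under an assumed construction: the session key is a SHA-256 hash over the root-to-leaf chain of secrets. The node labels and the derive_session_key helper are hypothetical.

```python
import hashlib

# Illustrative secret keys stored along the root-to-leaf path of the binary tree.
ancestor_keys = [b"SeK1", b"SeK3", b"SeK7"]   # root, grandparent, parent
leaf_secret = b"SeK14"                        # secret kept by the leaf device (e.g. Node14)

def derive_session_key(path_keys, leaf_key):
    """Hash the chain of ancestor secrets plus the leaf secret into one session key."""
    digest = hashlib.sha256()
    for key in path_keys + [leaf_key]:
        digest.update(key)
    return digest.hexdigest()

print(derive_session_key(ancestor_keys, leaf_secret))
```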
3.6 Algorithm—STGG
4 Performance Evaluation
EntropyPairwise(i, j) = 2 × Entropy(1) / (M − 1)    (9)
Figure 4 demonstrates the pair-wise keys involved in the encryption process within the
network communication. The experiments are evaluated with four different sets of nodes,
and the entropy level is evaluated against the total number of rounds.
Fig. 4 Entropy
Probfail = 1 − Probsuc

ProbRa(z) = C(Ra, z) × Probsuc^z × Probfail^(Ra − z)    (12)

where Probsuc and Probfail denote the success and failure probabilities of a single round; the expanded form of Eq. (12) involves M, D, Ra and the link-error parameters δTx and δTx,i.
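Equation (12) is a standard binomial form; the sketch below evaluates it for an assumed per-round success probability, which in the paper depends on M, D and the link-error parameters.

```python
from math import comb

def prob_ra(z, rounds, p_suc):
    """Probability that exactly z of `rounds` key-agreement rounds succeed, as in Eq. (12)."""
    p_fail = 1.0 - p_suc
    return comb(rounds, z) * (p_suc ** z) * (p_fail ** (rounds - z))

# Assumed values: 10 rounds and a per-round success probability of 0.9.
print(sum(prob_ra(z, 10, 0.9) for z in range(8, 11)))  # P(at least 8 rounds succeed)
```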
The proposed method STGG is compared with PA-SHWMP [3] and HWMP [9].
Figure 5 shows the comparison of the number of malicious nodes versus the packet
delivery ratio. Figure 6 shows the number of malicious nodes versus the route acquisition
delay, Fig. 7 the number of malicious nodes versus the average end-to-end delay, and
Fig. 8 the number of malicious nodes versus the message overhead. Figure 9 illustrates
the lossy links versus the false-positive rate. The simulation results suggest that the
proposed method STGG performs better than the related methods PA-SHWMP and
HWMP. The entropy is created for the pair-wise group key in the Wi-Fi physical channel,
and the MAC protocol is used for the performance analysis.
5 Conclusion
This paper proposes the STGG algorithm to generate the shared group key. For the
security analysis of the proposed algorithm against related algorithms, entropy-based
security is used as the performance metric, and the false-positive rate is used to analyze
unconnected failed links in the network. Packets are dropped in the network either by
the malicious behavior of an active attacker or because of poor connectivity. The route
acquisition delay is calculated as the time period from the RREQ to the RREP message
between the source node and the destination node.
References
1. Vivek K, Narottam C, Soni S (2010) Clustering algorithms for heterogeneous wireless sensor
network: a survey. Int J Appl Eng Res 1:273–287
2. Chun-Hsien W, Yeh-Ching C (2007) Heterogeneous wireless sensor network deployment and
topology control based on irregular sensor model. Adv Grid Pervasive Comput 4459:78–88
3. Sathiyavathi V, Reshma R, Parvin SS, SaiRamesh L, Ayyasamy A (2019) Dynamic trust
based secure multipath routing for mobile ad-hoc networks. In: Intelligent communication
technologies and virtual mobile networks. Springer, Cham, pp 618–625
4. Vivek M, Catherine R (2004) Homogeneous vs heterogeneous clustered sensor networks: a
comparative study. IEEE Int Conf Commun 6:3646–3651
5. Andreas R, Daniel B (2013) Exploiting platform heterogeneity in wireless sensor networks
by shifting resource-intensive tasks to dedicated processing nodes. In: IEEE international
symposium on a world of wireless, mobile and multimedia networks (WoWMoM), pp 1–9
6. Yu Y, Peng Y, Yu Y, Rao T (2014) A new dynamic hierarchical reputation evaluation scheme
for hybrid wireless mesh networks. Comput Electr Eng 40(2):663–672
7. Selvakumar K, Karuppiah M, SaiRamesh L, Islam SH, Hassan MM, Fortino G, Choo KKR
(2019) Intelligent temporal classification and fuzzy rough set-based feature selection algorithm
for intrusion detection system in WSNs. Inf Sci 497:77–90
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 143
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_12
1 Introduction
Owing to the self-learning nature of machine learning, it is used for sports prediction
models. Typical statistical algorithms are used to predict the result of matches based on
team strength. This decision-making problem uses past data to predict future games.
Such prediction models primarily rely on statistical and simulation-based methods. In
a simulation-based model, a sport-specific simulation engine is used to predict the
outcome of a game by running the algorithm multiple times, while a statistical model is
based on the statistics of a team in different conditions or on the strength of the team [5, 6].
Machine learning, a subdomain of artificial intelligence, is useful for designing
analytical models. Classification and regression in machine learning are used to solve
the prediction problem. The typical machine learning model is shown in Fig. 1. With the
help of a dataset of previous matches, a common pattern can be found in the historical
data. The primary aim of the classification model is to build a model from training data
and then use this model to evaluate other data; it predicts the target variable using
previous data. Prediction of a winner in cricket is a classification problem where one can
predict a class label of win, loss or draw. Support vector machine, Naive Bayes, logistic
regression, cricket outcome predictor (COP), random forest, K-nearest neighbor, C4.5,
bagging, and boosting algorithms are used for the classification of the data [7, 8]. In this
paper, several methods used for team and winner prediction in cricket are studied.
2 Related Work
Bunker and Thabtah [9] have focused on artificial neural networks for sports result
prediction. Their model considers the results of existing matches, performance indicators
for players, and information about the opposing team. They study the existing
methodologies, the data used, the prediction models, and the challenges that occur during
result prediction. Machine learning is used as the learning framework for this prediction
model, and the artificial neural network is a more recent area used in statistical and
operational research for sport prediction. An 'SRP-CRISP-DM' type framework is
proposed using this NN technique. The framework consists of domain understanding,
data understanding, data preparation, feature extraction, model preparation, model
evaluation, and finally model deployment. The method is useful for researchers,
bookmakers, sports fans, and students interested in sports prediction using neural
networks. Match prediction requires high accuracy, and in that case one can use the NN
model rather than traditional statistical or analytical models. Machine learning is
preferred here because it generates a more accurate prediction model using already
defined features and the previous dataset. The 'SRP-CRISP-DM' framework provides a
solution for the most complex problems of sports prediction.
Asif and McHale [10] have proposed a generalized nonlinear forecasting model
(GNLFM) for predicting the number of runs to be scored in an innings of cricket. The
number of wickets and overs left are considered in this model, which is applicable to
any format of cricket; here it is used to predict the total runs in Twenty20 international
cricket. The model calculates the run difference between the two teams while the match
is in progress, and this difference can be used to measure the closeness of the game. It
can also be used to rate teams with the help of the margin of victory between two teams.
The model can be useful for target resetting in interrupted matches and for prediction,
but its primary focus is to determine the top-20 greatest victories and to assign rankings
to teams accordingly. The model works on the principle that, as the innings progresses,
the expected remaining runs and the expected runs on the next ball are non-increasing
functions of the number of wickets lost and non-decreasing functions of the number of
overs left. To obtain an accurate team rating, the margin of victory is considered. A
problem arises when the team batting second wins the match, because the margin of
victory is then expressed in wickets remaining instead of runs; this can be resolved by
projecting the second team's score had it been allowed to continue batting. This is a
mathematical model based on a truncated survival function built on the Weibull
distribution. The current ICC ranking system does not consider the margin of victory for
team rankings. The model reproduces properties of the run-scoring pattern and behaves
accurately. The GNLFM model is not limited to team rating; it can also be applied to
other issues in limited-overs cricket, such as target resetting for interrupted matches, and
a score prediction model can be designed using this framework.
Chakraborty et al. have used the TOPSIS method, an MCDM tool, for the selection of
players in a cricket team. The tool obtains a single response value as a performance
measure from the multiple features considered for decision making. It is popular because
it requires a small number of parameters as input while offering high consistency and
low complexity [11]. The selection is based on the shortest Euclidean distance from a
positive ideal solution and the farthest distance from a negative ideal solution. A positive
ideal solution is obtained when an attribute receives the maximum response from the
database; for a negative ideal solution, the attribute receives the minimum response. Poor
results get balanced by good results if the poor criterion is offset by some other positive
criterion. This method applies to all forms of one-day and T20 cricket teams. The entropy
method is used to assign weights to the players under the corresponding criteria, and the
players are selected from shortlisted data to find possible team compositions. The
authors prefer the augmented entropy method, where the information is available in the
form of a decision and evaluation matrix, and they maintain a relationship between the
criteria, the available information, and the weights. The primary advantage of this
technique is that it finds the weights from the decision matrix itself and does not consider
the views of decision-makers. For the selection of players, it measures the uncertainty of
the random information available in the decision matrix. The combination of the decision
matrix and the weights is used to arrive at the right composition of players.
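To make the TOPSIS steps concrete, here is a minimal NumPy sketch: normalize the decision matrix, weight it, locate the positive and negative ideal solutions, and rank candidates by relative closeness. The criteria, weights, and player scores are invented for illustration and are not taken from [11].

```python
import numpy as np

# Rows: candidate players; columns: criteria (e.g. batting average, strike rate, economy).
decision = np.array([[45.0, 130.0, 7.8],
                     [38.0, 145.0, 8.5],
                     [51.0, 120.0, 7.1]])
weights = np.array([0.40, 0.35, 0.25])       # e.g. obtained from the entropy method
benefit = np.array([True, True, False])      # economy rate: lower is better

norm = decision / np.sqrt((decision ** 2).sum(axis=0))   # vector normalization
weighted = norm * weights

ideal_pos = np.where(benefit, weighted.max(axis=0), weighted.min(axis=0))
ideal_neg = np.where(benefit, weighted.min(axis=0), weighted.max(axis=0))

d_pos = np.linalg.norm(weighted - ideal_pos, axis=1)
d_neg = np.linalg.norm(weighted - ideal_neg, axis=1)
closeness = d_neg / (d_pos + d_neg)          # higher = closer to the ideal player profile

print(np.argsort(-closeness))                # player indices ranked best-first
```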
Jayanth et al. [12] have proposed a method for team recommendation and outcome
prediction in cricket. A supervised learning method with linear, nonlinear, and RBF
kernels is used to predict the outcome of a cricket match. Groups of players are formed
at different levels for both teams before predicting the outcome of the match, and a
player's contribution is measured relative to the players at the same level in that group.
K-means clustering is used to recommend players using past data, and a K-nearest
neighbor classifier with five neighbors is used to find the nearest players. Unstructured
data is extracted from a sports Web site and stored in a database. Team selection is done
from the historical data by measuring the winning contribution of the players. SVM with
linear and nonlinear kernels is used to predict the outcome of the match; the SVM model
is trained with ranking indices of batsmen and bowlers, and the task is treated as a binary
problem with win and loss as the classes in a finite-dimensional space. The player
ranking index is calculated using statistics extracted from the particular tournament.
Since the data in the n-dimensional space is not linearly separable, SVM with a nonlinear
RBF kernel performs better than the linear or polynomial kernels. The dimensionality of
the feature vectors is reduced with principal component analysis (PCA), which converts
the feature set into a new set of variables called principal components. Accuracy,
precision, and recall for SVM with the RBF kernel outperform the other SVM models
for winner prediction. For team recommendation, k-means clustering is used, and similar
players are found with the k-nearest neighbor classifier.
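The PCA-plus-RBF-SVM idea described above can be sketched with scikit-learn as follows; the feature matrix here is random stand-in data, since the actual ranking-index features of [12] are not reproduced in this paper.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Toy stand-in for batsman/bowler ranking-index features and win/loss labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))        # 12 assumed ranking-index features per match
y = rng.integers(0, 2, size=200)      # 1 = win, 0 = loss

model = make_pipeline(StandardScaler(),
                      PCA(n_components=5),
                      SVC(kernel="rbf", C=1.0, gamma="scale"))
print(cross_val_score(model, X, y, cv=5).mean())
```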
Chand et al. [13] have provided a model for assembling a team for a particular
tournament. Stochastic optimization techniques are commonly used for team selection,
but such methods are not suitable when the stakes are high. Here, a multi-objective
integer programming model is proposed for optimal team formation. The authors suggest
a partial team construction model in which a few members are selected while the others
in the squad are kept unchanged. Players are assigned ranks based on their importance.
An integer linear programming (ILP) model is used to select an optimal team, and
multiple ILP modules provide solutions within a multi-objective procedure. A classical
constraint approach is used, where one objective is minimized or maximized to check its
effect on the other objectives. A binary vector is used to represent solutions over the
available set of players. The method guarantees optimality for team selection with
minimum time and is scalable up to five objectives. To make the decision-making process
effective, player rankings are obtained; the current form of the players, not their previous
record, determines whether the team is optimal. Batting and bowling objectives are
maximized or minimized according to the objective formulation. A player's contribution
is calculated by measuring the hypervolume of the team, where a larger difference in
hypervolume indicates a higher contribution. Two-, three-, and five-objective
optimization based on batting and bowling constraints is used for team formation in T20
matches.
Ahmad et al. [14] have provided a method for finding the superior team. The technique
first focuses on the batting and bowling ability of the team and then uses new features to
assess team quality. The mechanism considers the actual performance of batsmen and
bowlers in one-day international matches. For calculating the precedence of batsmen, the
top six batting positions are considered and labelled as batting productivity precedence;
bowling productivity precedence (BoPP) is used for the last six positions as bowlers, and
overlapping players are treated as all-rounders. These two are added together to obtain
the team productivity precedence (TPP) over the other team. The productivity precedence
algorithm (PPA) is used to provide productivity weights for batting, bowling, and team
precedence, and it efficiently derives features of batsmen, bowlers, and the cricket team.
Around eight to nine features are used and aggregated to calculate the weight of each
particular domain. After the PPA is applied to obtain team precedence over other teams,
a bidirectional productivity graph BPG(T, I) is built between two teams, where T
represents the team nodes and I refers to the interactions during the matches between the
teams. The productivity precedence algorithm uses this network structure to find the
important features for team precedence. For a better outcome, a fielding parameter could
be added to the batting and bowling parameters to rank the players in a team.
Khatua and Khatua [15] have proposed a method for winner prediction in the 2015
World Cup using Twitter users. This is not a binary classification problem because
multiple teams are involved. A logistic regression model is applied over 3.5 million
tweets to obtain the relationship between the classification and the tweeting patterns,
using structured information, and it is tested with eight logistic regression models. The
user orientation derived from mixed tweet patterns is used for the formation of a team.
The model is statistically tested with likelihood ratios and independent-variable
coefficients for all eight logistic regression models, and positive and negative tweets are
used to check the positive effect of the model for winner prediction.
Verma and Izadi [16] have proposed a new analytical framework called the cricket
prognostic system (CPS). Advanced machine learning and statistical methods are used
to predict the result, and around thirty dynamic features based on real statistics and
historical data are considered. Three classifiers are used to predict the win of a particular
cricket team and to find the distribution of players for each match simulation: the first
classifier predicts a player's dismissal, the second the remaining runs to score, and the
third the number of extra runs scored by a team. The simulation of a cricket match is
done with the CPS risk model; if a wicket does not fall, then runs and extra runs are
calculated on each ball. Matches that are interrupted or shortened are not considered by
the CPS. The classifier model is designed with hundreds of indicators such as batsman,
bowler, ground, teams, and the current state of the game; the current game-state variables
include the current over, wickets fallen, and the run rate in the second innings.
Consistency of a player, pressure index, and impact of a player are considered as
indicators for CPS, and central tendency and variability are selected as parameters of the
model. A binary logistic regression model is used for the number of wickets, while the
runs scored and extra runs awarded are evaluated with six independent logistic regression
models giving probability scores for 0 to 6 runs; these probabilities are finally normalized
into one value. The study is used to predict winning probabilities and to evaluate players
against their benchmark values, with player comparison based on metrics and strategies.
Singh and Kaur [17] have presented a tool for the evaluation of players using
performance visualization. They identified the key variables required to evaluate
batsmen's and bowlers' performances and also add some extra weights. The HBase tool
is used to evaluate the performance of players based on historical data; it is a distributed,
open-source tool for storing non-relational data that can hold billions of rows and
columns in a table. This work focuses on providing statistical analysis using different
characteristics of players and then uses the players' statistics to predict the winning
probability of the team. The tool is based on machine learning algorithms, with
semi-structured and unstructured data stored in HBase. A K-nearest neighbor algorithm
with four neighbors is used to predict the winner of the match, and the KNN algorithm
is compared with a decision tree, random forest, support vector machine, and logistic
regression to evaluate the accuracy of the model. A multidimensional feature space with
class labels is used for every value of the KNN training set. During classification, k is
set to four, and the label is assigned from the nearest training samples. Euclidean distance
is used as the distance metric, while Hamming distance over discrete variables is used
as the overlap metric for text classification. In this method, ten features are considered
for winner prediction, with a primary focus on the toss and the venue of the match. KNN
fits the data accurately, avoiding overfitting and underfitting. A non-relational database
provides a dynamic approach to this outcome prediction problem.
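The KNN setting described above (k = 4, Euclidean distance, about ten match features) can be sketched with scikit-learn as follows; the features and labels are synthetic placeholders, not the HBase data used in [17].

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Toy stand-in for the ten match features (toss, venue, form, ...) described above.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = rng.integers(0, 2, size=300)      # 1 = team A wins, 0 = team B wins

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)
knn = KNeighborsClassifier(n_neighbors=4, metric="euclidean")   # k = 4 as in the paper
knn.fit(X_tr, y_tr)
print(accuracy_score(y_te, knn.predict(X_te)))
```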
Dey et al. [18] have evaluated network properties for team formation and for deciding
whether a player belongs in a team network or not. Players are considered as nodes of a
network, and the interactions between players are denoted by edges. Intra-country
networks, which inherit all the characteristics of the players' past performance, are
considered for team selection. For the calculation of weights during team selection,
fielding, running between the wickets, and partnerships are considered the more
important parameters for evaluating players. The social network analysis method is used
to check the effectiveness of players, with the network analysis performed on a
bidirectional weighted network built from T20 cricket match data. The clustering
coefficients and centrality measures of network analysis are used to check a player's
efficiency before adding him to the squad. The approach works in three steps: first,
formation of the T20 network; second, identification of the network properties; and
finally, formation of the team based on high centrality and clustering-coefficient
measurements. The players are assigned ranks using node degree distribution and the
clustering coefficient together with centrality. The path length from the centre of the
network specifies a player's rank in the scale-free network. The approach provides
information about player performance and the bonding between teammates, and players
with high centrality values and clustering coefficients can help in the formation of a team.
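A small NetworkX sketch of the centrality and clustering-coefficient measurements mentioned above; the player names and edge weights are invented, and the combined score is only one plausible way to rank players, not the exact formulation of [18].

```python
import networkx as nx

# Hypothetical weighted partnership/interaction edges between players of one team.
edges = [("P1", "P2", 5), ("P1", "P3", 3), ("P2", "P3", 4), ("P3", "P4", 2), ("P4", "P5", 1)]
G = nx.Graph()
G.add_weighted_edges_from(edges)

clustering = nx.clustering(G, weight="weight")   # local clustering coefficient per player
centrality = nx.degree_centrality(G)             # simple centrality measure per player

# Rank players by a combined score of centrality and clustering, as the approach suggests.
score = {p: centrality[p] + clustering[p] for p in G.nodes}
print(sorted(score, key=score.get, reverse=True))
```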
Irvine and Kennedy [19] have provided a method to determine the performance
indicators of players and study their effect on the outcome of a cricket match. Innings
run rate, the total number of wickets taken, and the number of dot balls are used with
magnitude-based inference to assess each performance indicator. The magnitude-based
inference allows selecting bowlers with good wicket-taking capability for attacking
fields, while aggressive batsmen are selected based on a high boundary percentage and
strike rate. The purpose of the study is to evaluate the performance indicators that have
a positive effect on match results; it concludes that the difference between the winning
and losing team lies in the number of wickets taken, dot balls, and innings run rate, and
it also establishes the significance of each performance indicator. The study is carried
out across four environmental conditions, such as sub-continent and eastern conditions.
The run rate in English conditions is better in swinging conditions, but stroke play is
difficult at the start of an innings, whereas in sub-continent conditions the run rate is
initially high but slows down as the innings progresses due to reverse swing. The
performance indicators for batsmen, bowlers, and the team are used separately to
determine the outcome of the match.
Bandulasiri et al. [20] have used the principal component analysis (PCA) method for
studying batting and bowling parameters and for ranking teams. The rank of a team,
number of fifties, partnerships between players, number of spinners, number of fast
bowlers, and number of all-rounders are considered as characteristics for analyzing the
team. Its numerical feasibility adds to the popularity of the technique. The primary
purpose of PCA is to reduce the number of variables: a large number of correlated
variables are converted into linearly uncorrelated variables, called principal components,
in the best possible way. A vector transformation converts the higher-dimensional
variables into a smaller dimension so that a small number of variables is sufficient to
explain the output of the method. Batting, bowling, and decision making are used as the
three factors for calculating the principal components. The partnership between players
has more impact than the other parameters if proper attention is given to it. CBR and
CBA are used to measure the performance of bowlers and batsmen, with small CBA and
CBR values indicating poor performance. In PCA, the adequacy of the dataset is checked
with the KMO value, which indicates how well the data can be used for the analysis.
Daud et al. [21] have evaluated the strength of teams based on the concept of team
rank. The current ranking method for cricket teams considers the number of matches
won or lost by a particular team but does not consider the margin of remaining runs or
remaining wickets when assigning ranks. The concepts of the H-index and PageRank are
proposed to address the weaknesses of the previously used methods. A network of teams
is formed, where each team acts as a node and weighted directed edges are drawn between
teams; a team is awarded more points if it wins against a strong team. A team index
(T-index) is proposed, in a similar spirit, to replace the traditional method. The proposed
method is static; it could be made dynamic through a more detailed analysis.
Bhattacharjee and Saikia [23] have proposed a binary integer programming method
for the formation of a balanced squad. Batting performance is measured with the batting
average, strike rate, and the batsman's contribution to the team total; all values are
normalized, multiplied by weights reflecting their relative importance, and added
together to obtain an optimum value for each batsman. The number of catches taken and
run-outs effected in a series are used to evaluate the quality of fielders, and these values
are likewise normalized and weighted. Bowling average, economy rate, and strike rate
are considered for the evaluation of bowlers, with the factor values normalized and
weighted to calculate bowler performance. Finally, the wicketkeeper is evaluated with
the number of catches, stumpings, and byes conceded in the match, again normalized
and weighted to evaluate the strength of the player. The overall performance
measurement is a linear combination of the batting, fielding, bowling, and wicketkeeping
statistics. The normalization tries to remove the diversity of variables and their units of
measurement and, for simplicity, keeps every value in the range 0 to 1. The performance
factors are classified into positive and negative factors: batting average and catches taken
are positive factors, whereas the number of byes conceded and the economy rate are
negative factors related to a player's ability, so extra care is needed when designing the
normalization formula. A composite index for a particular player is obtained linearly,
with the corresponding weights as multiplication factors. Binary variables are defined to
form the objective function and constraints; if the constraints are changed, the squad
selected from the pool of available players also changes. Team formation is a close
collaboration between cricket statisticians and selectors, and this method tries to simplify
team selection with a few simple criteria.
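A minimal sketch of a binary integer programming squad selection in the spirit of [23], assuming the PuLP library; the composite indices, roles, and constraints are illustrative and not the authors' actual formulation.

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary

# Hypothetical composite indices (normalized, weighted scores) and role tags per player.
players = {"A": (0.82, "bat"), "B": (0.74, "bat"), "C": (0.69, "bowl"),
           "D": (0.77, "bowl"), "E": (0.65, "wk"), "F": (0.71, "bat"), "G": (0.68, "bowl")}

prob = LpProblem("squad_selection", LpMaximize)
x = {p: LpVariable(f"pick_{p}", cat=LpBinary) for p in players}

prob += lpSum(players[p][0] * x[p] for p in players)                  # maximize total index
prob += lpSum(x.values()) == 5                                        # squad size constraint
prob += lpSum(x[p] for p in players if players[p][1] == "bowl") >= 2  # at least two bowlers
prob += lpSum(x[p] for p in players if players[p][1] == "wk") >= 1    # one wicketkeeper

prob.solve()
print([p for p in players if x[p].value() == 1])
```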
Amin and Sharma [24] have used the data envelopment analysis (DEA) method for
cricket team selection. Handling different capabilities with multiple outputs is a feature
of this method for evaluating cricket players. A DEA score is calculated for each player,
and the players are then categorized as efficient or inefficient. The method considers
multiple factors related to player performance, and a linear programming DEA model is
used to obtain an aggregate score for each player, so that the score is obtained objectively
rather than through subjective computation. An aggregation method is needed to evaluate
players with multiple capabilities, and a DEA model with linear programming is
proposed here to evaluate player quality. The method is capable of offering guidance to
players who are not performing well. The typical DEA model supports multiple inputs
and outputs, but there are cases with multiple inputs and no outputs, or vice versa. The
model is solved n times to calculate the scores of the players and then to select players
into the team. The issue with this technique is the evaluation of multiple performances
with an aggregation method and the measurement of player effectiveness using the
available statistics. Hence, the linear programming model with the aggregation method
is proposed to obtain the best DEA score and check the efficiency of the players. Finding
a team is similar to solving an integer programming model, and team efficiency is directly
proportional to the sum of the efficiencies of the players. An optimization-based DEA
model is obtained with an aggregation of multiple performances.
All the methods studied in this paper are summarized in Table 1, which lists, for team
prediction and winner prediction, the authors, the methods used, and the advantages and
disadvantages of each method.
3 Conclusion
The study of several methods for team formation and winner prediction has been
performed in this paper. The findings show that cricket is a game of planning; it needs to
be divided into segments in order to evaluate the effects of the parameters required for
team formation and winner prediction. There is a need to classify players according to
the strength of the opposing team, and also a need to search for more parameters to
increase the effectiveness of the team and to study the individual factors that affect the
winner prediction model. Proper weights need to be assigned when forming a team and
predicting the winner in cricket, and player evaluation needs to consider the relative
ranking of teams. This paper also studies team strength evaluation using the concept of
ranking. A bulk of data is available, from which the quality of players needs to be
evaluated in suitable ways. Machine learning algorithms such as KNN, Naive Bayes,
support vector machine, and logistic regression are used to select players and predict the
winner in cricket. Some authors used neural network methods that rely on graph-based
networks for team formation, while the NSGA-II genetic algorithm and binary
programming models are used to evaluate the strength of a team. The concept of weight
optimization is used to evaluate the strength of players, and normalization methods are
used to normalize the values of players and teams. In this way, a number of algorithms
used for team formation and winner prediction in cricket, based on different
methodologies, have been studied, the research gaps have been identified, and the
methods have been examined with their merits and demerits. An effective model with
maximum accuracy still needs to be created for team formation and winner prediction in
cricket.
Table 1 Summary of methods used for team formation and winner prediction (author(s); method(s) used; advantage; disadvantage)

1. Bunker and Thabtah [9]. Method: SRP-CRISP-DM using ANN for result prediction. Advantage: solution to a complex problem. Disadvantage: working on pinpoint accuracy.
2. Asif and McHale [10]. Method: GNLFM model for winner prediction. Advantage: advanced team rating system. Disadvantage: home advantage feature not used.
3. Chakraborty et al. [11]. Method: TOPSIS method with MCDM tool for team prediction. Advantage: use of decision matrix with weights. Disadvantage: not bias free.
4. Jayanth et al. [12]. Method: K-means clustering for team formation and SVM for winner prediction. Advantage: good result for SVM with RBF kernel. Disadvantage: poor result for linear and poly kernel SVM.
5. Chand et al. [13]. Method: multi-objective integer programming model for team formation. Advantage: optimal team with less time. Disadvantage: result may not be consistent with form of players.
6. Ahmad et al. [14]. Method: productivity precedence algorithm for team formation. Advantage: use of network approach. Disadvantage: fielding parameter not considered.
7. Khatua and Khatua [15]. Method: logistic regression with Twitter data for winner prediction. Advantage: use of mixed tweet patterns. Disadvantage: not a generalized model.
8. Verma and Izadi [16]. Method: cricket prognostic system for winner prediction. Advantage: good accuracy. Disadvantage: more than one model to predict winner.
9. Singh and Kaur [17]. Method: HBase tool with KNN algorithm for team formation. Advantage: use of HBase tool. Disadvantage: more features can be added.
10. Dey et al. [18]. Method: weighted network for team formation. Advantage: network-based model. Disadvantage: complex approach.
11. Irvine and Kennedy [19]. Method: performance indicators with magnitude-based inference for team formation. Advantage: use of magnitude-based inference. Disadvantage: additional parameters for bowlers are needed.
12. Bandulasiri et al. [20]. Method: PCA method for team selection. Advantage: PCA used to reduce variable size of data. Disadvantage: phase-wise analysis is not possible due to data.
13. Daud et al. [21]. Method: T-index and PageRank algorithm for team ranking. Advantage: use of graph for better ranking of players. Disadvantage: concept of temporal dimensions can be added.
14. Mukherjee [22]. Method: social network analysis (SNA) for team formation. Advantage: PageRank concept for player evaluation. Disadvantage: use of static approach.
15. Bhattacharjee and Saikia [23]. Method: binary integer programming method for team formation. Advantage: normalization concept used for weight calculation. Disadvantage: need to add more parameters for evaluation.
16. Amin and Sharma [24]. Method: data envelopment analysis method for team selection. Advantage: objective evaluation is used instead of subjective evaluation. Disadvantage: comparison of players with own team is not done.
References
1. Swartz TB (2017) Research directions in cricket. In: Handbook of statistical methods analysis
sport, pp. 445–460. https://doi.org/10.1201/9781315166070
2. Passi K, Pandey N (2018) Increased prediction accuracy in the game of cricket using machine
learning. Int J Data Min Knowl Manage Process (IJDKP) 8:19–36. https://doi.org/10.5121/
ijdkp.2018.8203
3. Saikia H, Bhattacharjee D, Radhakrishnan UK (2017) A new model for player selection in
cricket. Int J Perform Anal Sport 16:373–388. https://doi.org/10.1080/24748668.2016.118
68893
4. Ahmad H, Daud A, Wang L, Hong H, Dawood H, Yixian Y (2017) Prediction of rising stars
in the game of cricket. IEEE Access 5:4104–4124. https://doi.org/10.1109/ACCESS.2017.268
2162
5. Pathak N, Wadhwa H (2016) Applications of modern classification techniques to predict the
outcome of ODI cricket. Procedia Comput Sci 87:55–60. https://doi.org/10.1016/j.procs.2016.
05.126
6. Asif M, McHale IG (2016) In-play forecasting of win probability in one-day international
cricket: a dynamic logistic regression model. Int J Forecast 32:34–43. https://doi.org/10.1016/
j.ijforecast.2015.02.005
7. Jhanwar MG, Pudi V (2016) Predicting the outcome of ODI cricket matches: a team composi-
tion based approach. In: European conference on machine learning and principles and practice
of knowledge discovery in databases (ECML-PKDD) proceedings, vol 1842, pp 111–126
8. Sankaranarayanan VV, Sattar J, Lakshmanan LVS (2014) Auto-play: a data mining approach
to ODI cricket simulation and prediction. Int Conf Data Min SDM 2:1064–1072. https://doi.
org/10.1137/1.9781611973440.121
9. Bunker RP, Thabtah F (2019) A machine learning framework for sport result prediction. J Appl
Comput Inf 15:27–33. https://doi.org/10.1016/j.aci.2017.09.005
10. Asif M, McHale IG (2019) A generalized non-linear forecasting model for limited overs
international cricket. Int J Forecast 35:634–640. https://doi.org/10.1016/j.ijforecast.2018.
12.003
11. Chakraborty S, Kumar V, Ramakrishnan KR (2019) Selection of the all-time best World XI
Test cricket team using the TOPSIS method. Decis Sci Lett 8:95–108. https://doi.org/10.5267/
j.dsl.2018.4.001
12. Jayanth SB, Anthony A, Abhilasha G, Shaik N, Srinivasa G (2018) A team recommendation
system and outcome prediction for the game of cricket. J Sports Anal 4:263–273. https://doi.
org/10.3233/jsa-170196
13. Chand S, Singh HK, Ray T (2018) Team selection using multi-/many-objective optimization
with integer linear programming. In: 2018 IEEE congress on evolutionary computation CEC
2018—Proceedings, pp 1–8. https://doi.org/10.1109/CEC.2018.8477945
14. Ahmad H, Daud A, Wang L, Ahmad I, Hafeez M, Yang Y (2017) Quantifying team precedence
in the game of cricket. J Cluster Comput 21:523–537. https://doi.org/10.1007/s10586-017-
0919-z
15. Khatua A, Khatua A (2017) Cricket world cup 2015: predicting user’s orientation through
mix tweets on twitter platform. In: Proceedings 2017 IEEE/ACM international conference on
advances in social networks analysis and mining ASONAM 2017, pp 948–951. https://doi.org/
10.1145/3110025.3119398
16. Verma A, Izadi M (2017) Cricket prognostic system: a framework for real-time analysis in
ODI cricket. In: International conference on large scale sports analytics
17. Singh S, Kaur P (2017) IPL visualization and prediction using HBase. Procedia Int Conf Inf
Technol Quant Manage 122:910–915. https://doi.org/10.1016/j.procs.2017.11.454
18. Dey P, Ganguly M, Roy S (2017) Network centrality based team formation: a case study on
T-20 cricket. J Appl Comput Inf 13:161–168. https://doi.org/10.1016/j.aci.2016.11.001
19. Irvine S, Kennedy R (2017) Analysis of performance indicators that most significantly affect
international Twenty20 cricket. Int J Perform Anal Sport 17:350–359. https://doi.org/10.1080/
24748668.2017.1343989
20. Bandulasiri A, Brown T, Wickramasinghe I (2016) Factors affecting the result of matches in
the one day format of cricket. J Oper Res Decis 26:21–32. https://doi.org/10.5277/ord160402
21. Daud A, Muhammad F, Dawood H, Dawood H (2015) Ranking cricket teams. J Inf Proces
Manage 51:62–73. https://doi.org/10.1016/j.ipm.2014.10.010
22. Mukherjee S (2014) Quantifying individual performance in cricket—a network analysis of
batsmen and bowlers. J Phys Stat Mech Appl 393:624–637. https://doi.org/10.1016/j.physa.
2013.09.027
23. Bhattacharjee D, Saikia H (2014) On performance measurement of cricketers and selecting
an optimum balanced team. Int J Perform Anal Sport 14:262–275. https://doi.org/10.1080/247
48668.2014.11868720
24. Amin GR, Sharma SK (2014) Cricket team selection using data envelopment analysis. Eur J
Sport Sci 14:37–41. https://doi.org/10.1080/17461391.2012.705333
Machine Learning-Based Intrusion
Detection System with Recursive Feature
Elimination
Abstract With prevalent technologies like cloud computing, big data and the Internet
of Things (IoT), a huge amount of data is generated every day. Presently, most data are
stored in digital form and transferred to others by means of digital communication media.
Hence, providing security to data and networks is one of the main concerns for everyone.
Several intrusion detection systems (IDS) have been proposed in the last few years, but
accuracy and the false alarm rate are still the most challenging issues for researchers.
Nowadays, intruders design new types of attacks every day, which are difficult to
identify. Recently, machine learning has emerged as a powerful tool for the development
of IDS. This paper discusses three different machine learning approaches, namely
decision tree, random forest and support vector machine (SVM). The KDD-99 dataset is
used to train the models. Owing to the unbalanced data and duplicate features, a recursive
feature elimination technique is used to reduce the number of features. The experimental
results show that the proposed IDS performs well compared to the base model, with an
accuracy of 99.1%.
1 Introduction
A huge amount of data is being generated every second due to the social media, IoT
devices and technology reform [1, 2]. All information is stored in a server or host
machine in digital form and transfers from one machine to another. To provide secu-
rity to user data from intruder or attacker is a challenging task due to the advancement
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 157
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_13
It is a maximum-margin classifier which classifies data based on the support vectors.
All samples are plotted in an N-dimensional space, where N is the number of features,
and the data are separated into classes using a hyperplane. Figure 3 shows the working
of SVM.
The decision tree is a tree-like structure where each internal node represents a question,
an edge represents an answer, and a leaf node represents a class label. The decision tree
uses information gain and entropy to choose the root node: a feature with higher
information gain and lower entropy is chosen as the root node. The entropy used in the
information-gain calculation is given by the following equation.
Entropy = − Σ (i = 1 to n) P(ci) log2(P(ci))    (1)
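A small NumPy sketch of Eq. (1) and of the information gain computed from it for a discrete feature; the example labels are arbitrary.

```python
import numpy as np

def entropy(labels):
    """Entropy of a label vector, as in Eq. (1)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, feature_values):
    """Reduction in entropy obtained by splitting on a discrete feature."""
    gain = entropy(labels)
    for v in np.unique(feature_values):
        mask = feature_values == v
        gain -= mask.mean() * entropy(labels[mask])
    return gain

y = np.array(["normal", "dos", "dos", "normal", "probe", "dos"])
f = np.array(["tcp", "tcp", "udp", "udp", "tcp", "tcp"])
print(entropy(y), information_gain(y, f))
```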
The random forest is an ensemble method in which multiple trees are constructed. As
opposed to the decision tree, where only one tree is built per problem, in the random
forest approach nodes are selected randomly and multiple trees are built. The final
decision is taken by voting, and the sample is assigned to the class with the majority of
votes. Entropy and information gain are used to select the nodes in the different trees.
2 Literature Survey
Anish et al. [9] proposed a machine learning-based IDS. The paper uses SVM and
Naïve Bayes classifiers to train the IDS, with the NSL-KDD dataset used to train the model.
The maximum accuracy claimed by the authors is 97.29% using SVM. The main
limitation of this approach is that duplicate features are not eliminated from the KDD
dataset. As the model is trained mostly on normal transactions, the final developed model
will also classify most transactions as normal.
Rahaman et al. [10] introduced a deep learning-based IDS which can be used for smart
cities. The KDD Cup-99 dataset is used as training data. They design a deep neural
network and apply SVM at the classification layer: the deep neural network extracts
features from the given data and passes them to the SVM for classification. Information
gain and J48 are used to extract features. Since this model uses a deep neural network
for feature extraction and then SVM for classification, it increases the training time and
the chance of overfitting.
Almseidin et al. [11] evaluated the performance of various machine learning-based
IDS. The paper trains IDS using multiple machine learning approaches such as random
forest, decision table, multi-layer perceptron (MLP), Naive Bayes and Bayesian network,
with the KDD Cup dataset used for training and testing. Accuracy, precision and recall
are used to measure the performance of all trained IDS, and the maximum accuracy
claimed is 93.77% for the random forest classifier. The main limitation of this approach
is that no feature selection is applied, which may lead to low accuracy. The paper does
not evaluate the performance of SVM, although SVM gives better efficiency when the
features are normalized.
Kumar et al. [12] summarized different ensemble methods that can be used for IDS
development. As each classifier has its limitations, instead of using a single classifier this
paper suggests using an ensemble method for the development of IDS. Since the output
of one classifier is given to the next classifier, this helps to achieve higher accuracy, but
at the same time it increases training and testing time.
Alrowaily et al. [13] designed an IDS trained with several machine learning
approaches such as random forest, KNN, Naive Bayes, decision tree, MLP and AdaBoost.
In this work, the CICIDS2017 dataset is used for training and testing, and KNN gives
the highest accuracy of 99.46% for the given dataset.
3 Proposed Work
the system is tested for new attacks. In machine learning-based IDS, accuracy mainly
depends on the feature selection from the given dataset. Figure 4 shows the different
steps involved in the development of the IDS. The following steps are included in the
development of a machine learning-based IDS.
(a) Data Collection
(b) Data Preprocessing
(c) Model Training
(d) Model Evaluation
In this work, the KDD Cup-99 dataset, downloaded from Kaggle [14], is used to train
the machine learning-based IDS. The dataset consists of 494,020 rows (instances) and
42 features, and classifies all transactions into 23 different types of attacks.
This is one of the most important and challenging phases of any machine learning
model, as the accuracy of the model depends largely on data preprocessing. It is the
process of converting raw data into a machine-consumable form. The following
preprocessing is done on the KDD dataset before feeding the data to a model:
i. Identify all numeric, categorical and text data
ii. Convert all text data into numbers
iii. Convert all ordinal variables into numbers
iv. Convert all nominal variables into dummy variables.
The KDD dataset has 41 features, of which three, namely protocol_type, service and
flag, are categorical. To convert these features into numbers, a label encoder is applied
first and then one-hot encoding is applied to these features. To avoid the dummy variable
trap, the first column of each encoding is dropped.
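A minimal pandas sketch of this encoding step; pd.get_dummies combines the label-encoding and one-hot steps in one call, and the file name and column names assume the standard KDD Cup-99 layout.

```python
import pandas as pd

# Hypothetical local copy of the KDD Cup-99 dataset with 41 features plus a 'label' column.
df = pd.read_csv("kddcup99.csv")

categorical = ["protocol_type", "service", "flag"]
df = pd.get_dummies(df, columns=categorical, drop_first=True)  # one-hot, drop first dummy
```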
The KDD dataset has 23 different types of attack. Based on the attack properties and
behaviour, the 23 attacks are categorized into 5 groups, namely denial of service (DoS),
remote to local (R2L), user to root (U2R), normal and probe, and a unique number is
assigned to each of the five groups. Table 1 shows the unique number assigned to each
group, and Table 2 depicts the assignment of attacks to groups.
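Since Tables 1 and 2 are not reproduced here, the mapping below is only a partial, illustrative grouping of well-known KDD-99 attack names into the five categories; the numeric codes and the 'label' column name are assumptions and should be checked against the paper's tables.

```python
# Partial attack-to-category mapping (0 = normal, 1 = DoS, 2 = probe, 3 = R2L, 4 = U2R).
attack_group = {
    "normal": 0,
    "back": 1, "land": 1, "neptune": 1, "pod": 1, "smurf": 1, "teardrop": 1,
    "ipsweep": 2, "nmap": 2, "portsweep": 2, "satan": 2,
    "ftp_write": 3, "guess_passwd": 3, "imap": 3, "phf": 3, "warezclient": 3, "warezmaster": 3,
    "buffer_overflow": 4, "loadmodule": 4, "perl": 4, "rootkit": 4,
}
df["label"] = df["label"].str.rstrip(".").map(attack_group)  # KDD labels end with a dot
```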
The KDD data have 41 features, as shown in Fig. 5. After applying one-hot encoding,
the total number of features is 117. During training, it is observed that most of the
features are redundant and can be removed to obtain higher accuracy and minimize
training time. To remove useless and redundant features, the recursive feature elimination
(RFE) approach is used. After applying RFE, the top 13 features are chosen from the 117
features. The table below depicts the selected features on which the model is trained.
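A hedged scikit-learn sketch of the RFE step, continuing the dataframe from the previous sketches; the estimator choice and step size are illustrative, not the exact settings used in the experiments.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X = df.drop(columns=["label"])   # 117 encoded features (column name assumed)
y = df["label"]                  # five-class target from the mapping above

selector = RFE(estimator=RandomForestClassifier(n_estimators=50, random_state=0),
               n_features_to_select=13, step=5)   # keep the top 13 features
selector.fit(X, y)
selected = X.columns[selector.support_]
print(list(selected))
X_selected = X[selected]
```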
To normalize the input data, standard normalization is used, which scales each feature
to zero mean and unit standard deviation. The formula for standard normalization is

Z = (x − μ) / σ    (3)

where x represents the feature value, μ represents the mean, and σ represents the standard
deviation.
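In scikit-learn, Eq. (3) corresponds to StandardScaler; a minimal sketch, applied here to the reduced feature matrix from the RFE sketch above.

```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()                 # implements Z = (x - mu) / sigma per feature
X_std = scaler.fit_transform(X_selected)  # X_selected from the RFE step
```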
After preprocessing, the feature matrix and target vector are passed to the model for
training. In this paper, the model is trained with a decision tree classifier, a random forest
and an SVM classifier. All models are trained only on the selected features. Due to the
removal of all redundant features and the smaller number of features, the proposed model
takes less time to train.
IDS algorithm
1: Load dataset
2: Preprocess the data
– Convert all categorical variable into a number using label encoder and one-hot encoder
– Map all attacks of KDD dataset into five clusters
– Normalize feature matrix
3: Split your data into train and test
4: Call RFE algorithm for the feature selection
5: Train your model for the selected features
6: Evaluate the model performance
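A condensed, hedged end-to-end sketch of the listed steps, assuming X_selected and y come from the preprocessing and RFE steps above; the hyperparameters and split ratio are illustrative, not the exact settings used in the experiments, and the scaler is refit on the training split only, which is the usual practice.

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score, recall_score

# X_selected and y are produced by the preprocessing and RFE sketches above.
X_tr, X_te, y_tr, y_te = train_test_split(X_selected, y, test_size=0.25,
                                          stratify=y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

models = {"decision_tree": DecisionTreeClassifier(random_state=0),
          "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
          "svm": SVC(kernel="rbf")}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(name,
          accuracy_score(y_te, pred),
          precision_score(y_te, pred, average="macro", zero_division=0),
          recall_score(y_te, pred, average="macro", zero_division=0))
```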
The task of any machine learning-based IDS is to find the class of each transaction,
i.e., normal, probe, R2L, U2R, DoS, etc. The performance of each IDS is therefore
evaluated using the following metrics.
The confusion matrix gives a summary of the actual and predicted results. It is used
to analyse the performance of the classifier, which helps to improve model performance.
3.4.2 Accuracy
It represents the overall accuracy of the given model. The formula for calculating
accuracy of the model is given by
Accuracy = (TP + TN) / (TP + FP + TN + FN)
3.4.3 Precision
Precision = TP / (TP + FP)
3.4.4 F1-Measure
F1-measure = (2 × Precision × Recall) / (Precision + Recall)
KDD Cup-99 is one of the most popular datasets used to design intrusion detection
systems (IDS). Although the dataset was prepared in 1999, it is still one of the most
popular choices of IDS developers [14]. It consists of 494,020 rows (instances) and 42
features, and covers 23 different types of attacks, as shown in Fig. 5. The main limitation
of this dataset is that it is unbalanced and has several duplicate entries, so preprocessing
plays a vital role in achieving higher accuracy. As the dataset classifies transactions into
23 different types of attack, these attacks are first divided into five categories. Table 3
shows the different types of KDD attacks and the newly assigned attack categories.
It is clear from Table 3 that the data are highly unbalanced: most transactions will be
classified as normal or DoS because the dataset contains a large number of records for
normal and DoS traffic, so a machine learning algorithm will find the normal and DoS
patterns easily. The dataset therefore needs to be preprocessed before being passed to the
machine learning model.
5 Result Evaluation
To evaluate the performance of the proposed model, it is compared with the existing
approach "Evaluation of machine learning algorithms for intrusion detection system"
[11]. That paper implements several machine learning algorithms such as support vector
machine (SVM), random forest, J48, decision tree and Naïve Bayes, with a maximum
claimed accuracy of 93.77% using random forest. The performance of our proposed
approach has also been evaluated using a decision tree, a random forest and SVM. The
proposed algorithm is implemented on a machine with an i5 (9th generation) processor,
8 GB RAM and a 4 GB Nvidia GTX 1650 graphics card (Tables 4, 5 and 6; Figs. 6, 7, 8
and 9).
6 Conclusion
Due to social media and IoT, a huge amount of data is transferred from one device to
another, and it is increasing day by day. Hence there is a need to develop an effective IDS
that identifies unauthorized access and takes appropriate action. The primary
requirements of any IDS are high accuracy and a low false alarm rate. Recently, machine
learning-based IDS have performed better than traditional IDS. This paper proposes a
machine learning-based IDS which uses a recursive feature elimination technique to
reduce redundant and useless features. The proposed approach is implemented with three
machine learning classifiers, namely decision tree, SVM and random forest. The
experimental results show that the random forest performs well compared to the SVM
and decision tree. One of the main advantages of the random forest is that it misclassifies
fewer attack transactions as normal transactions compared to the decision tree and SVM.
References
9. Anish et al (2019) Machine learning based intrusion detection system. In: Proceedings of the
third international conference on trends in electronics and informatics, pp 916–920
10. Rahaman A et al (2020) Scalable machine learning-based intrusion detection system for IoT-
enabled smart cities. Sustainable Cities and Society
11. Almseidin M et al (2017) Evaluation of machine learning algorithms for intrusion detec-
tion system. In: Proceeding of IEEE 15th international symposium on intelligent systems and
informatics, September 2017, pp 277–282
12. Kumar G et al (2020) MLEsIDSs: machine learning-based ensembles for intrusion detection
systems—a review. J Supercomput 76(2), Feb 2020
13. Alrowaily M et al (2019) Effectiveness of machine learning based intrusion detection systems.
In: Proceeding of international conference on security, privacy and anonymity in computation,
communication and storage, pp 277–288
14. Bay SD (1999) The UCI KDD archive. Department of Information and Computer Science,
University of California, vol 404, pp 405. http://kdd.ics.uci.edu.irvine.ca
An Optical Character Recognition
Technique for Devanagari Script Using
Convolutional Neural Network
and Unicode Encoding
1 Introduction
The first phase is character detection, and the second phase is character recognition. The first phase includes the pre-processing and segmentation of the image, while the second phase includes training a model that can predict the result from the first phase. Using good image pre-processing techniques gives a better output to the following phases. Pre-processing includes noise reduction, skew correction, and converting the given image to binary format. The classifier that predicts characters should be trained in such a way that it generalizes over any font and size.
The Devanagari script is used to scribe languages like Sanskrit, Hindi, Marathi, etc. In the Devanagari script, there is a line called "Sirorekha", or the header line, which connects all the characters in a word. Languages following the Devanagari script have a very large set of consonants, vowels, combinations of consonants with vowels, and combinations of consonants among themselves (the left part will be the pure form of one consonant and the right part will be a full consonant). In the Devanagari script, a character can have an upper modifier in the top strip, a lower modifier in the bottom strip, and a pure form of a consonant and a full consonant in the core strip (see Fig. 1).
This paper presents a methodology to create an OCR framework for Sanskrit characters by using the segmentation algorithm proposed in [1]. This technique separates the upper modifier and lower modifier, and also performs fused character segmentation to separate the pure form and the full consonant. Histograms are used for line, word and character segmentation as proposed in [2]. To build a robust classifier that can accurately predict characters with different sizes, fonts and strokes, a convolutional neural network (CNN) is used instead of traditional classification algorithms like SVM, KNN, ANN, etc. [3]. A CNN is a multi-layered architecture which performs feature extraction without information loss by using several convolutional layers and then feeds the extracted features into a classifier which predicts the output [4]. To train the model, an artificially synthesized dataset consisting of around 1.2 lakh (120,000) images with 85 classes for the core part of a character is developed using different fonts available at [5]. Initially, CNNs were used for object detection, but later they have been used in other domains as well. Each component obtained in the segmentation phase is fed into its respective trained CNN model. Using the Unicode values provided at [6–8], the scanned document is reconstructed into machine-encoded text.
The workflow of the proposed framework is depicted in Fig. 2. An image with Devanagari script (Sanskrit text) is taken as input and first pre-processed. The segmentation algorithm then operates on the pre-processed image. The segmentation phase consists of line, word, character and fused character segmentation, as well as the separation of modifiers. All characters identified in the segmentation phase are fed to the trained CNN, and the predictions are mapped to their respective Unicode values and added together to reconstruct the machine-encoded text.
2 Existing Works
Several works have been carried out in the past to build an OCR for Devanagari scripts. The research work proposed in [1, 9, 10] segments a given image to the character level, separates the upper and lower modifiers, and segments fused characters using the structural properties of the script.
Research work presented in [11] discusses OCR for printed Malayalam text using
singular value decomposition (SVD) for dimensionality reduction and Euclidean
distance measure for character recognition, whereas [12] discusses OCR for Telugu
text using SVD, projection profile and discrete wavelet transform for feature
extraction and K-nearest neighbors and support vector machine for character
recognition.
Advancements in computer technology have urged developers to adopt machine learning and deep learning algorithms that need higher computational capabilities than rule-based or template-based mechanisms but at the same time produce better results [13, 14]. Using such algorithms for tasks like OCR has proved to produce better results and better generalization over a wide range of font styles and sizes. A work by Dineshkumar and Suganthi [15] performs handwritten character recognition using an ANN. Another work by Jawahar [16] performs character recognition for languages like Hindi and Telugu using principal component analysis followed by SVM.
Sankaran and Jawahar [17] proposed a work using the Bidirectional Long Short-Term Memory (BLSTM) approach which can recognize text at the word level. BLSTM
uses previous and present word context to make predictions. The Sanskrit OCR
proposed by Avadesh and Goyal in their work [3] does character-level segmentation
and employs a CNN model that is trained on a dataset consisting of 602 classes which
include fused characters also.
This section explains the segmentation operators used in the algorithm proposed in
[1].
For the binarized image, horizontal projection computes the total number of black pixels in every row, which can be done by calculating a horizontal histogram. Pixel rows with no black pixels are considered to be white spaces. So, the start of a run of black-pixel rows is the top boundary, and the start of the following white space can be considered the bottom boundary (see Fig. 3a).
For the binarized image, vertical projection computes the total number of black pixels in every column, which can be done with a vertical histogram. Pixel columns with no black pixels are considered to be white space. So, the start of a run of black-pixel columns can be considered the left boundary of the text, and the start of the following white-space column can be considered the right boundary (see Fig. 3b).
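A minimal sketch of these two projection operators is given below, assuming the binarized image is a NumPy array in which black (text) pixels are 1 and background pixels are 0; the helper that converts the projection profile into boundaries is an illustrative addition.

import numpy as np

def horizontal_projection(binary_img: np.ndarray) -> np.ndarray:
    """Count black pixels in every row (black pixels assumed to be 1)."""
    return binary_img.sum(axis=1)

def vertical_projection(binary_img: np.ndarray) -> np.ndarray:
    """Count black pixels in every column."""
    return binary_img.sum(axis=0)

def segment_runs(profile: np.ndarray):
    """Return (start, end) pairs of consecutive non-empty rows/columns.

    Runs in the horizontal profile give line boundaries; runs in the
    vertical profile give word or character boundaries.
    """
    boundaries, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # top/left boundary
        elif count == 0 and start is not None:
            boundaries.append((start, i))  # bottom/right boundary
            start = None
    if start is not None:
        boundaries.append((start, len(profile)))
    return boundaries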
To find the position of the vertical bar, find the height of the character from the Sirorekha to the bottom of the image using vertical projection. A column whose black-pixel count is more than 80% of this height is considered the vertical bar.
Unlike horizontal projection, collapsed horizontal projection (CHP) checks only for the occurrence of at least one black pixel in a row. If there is a black pixel in a row, the CHP of that row is set to 1, else it is set to 0.
To find the height and continuity of a character, denote by "R1" the pixel row where CHP is equal to 1 for the first time, by "R2" the first subsequent pixel row where CHP is equal to 0, and by "R3" the first pixel row after R2 where CHP is again equal to 1. Since R1 always exists, there are three possibilities for R2 and R3:
1. Neither R2 nor R3 exists: the character is continuous and its height is (total number of rows − R1).
2. R2 exists but R3 does not: the character is continuous and its height is (R2 − R1).
3. Both R2 and R3 exist: there is a discontinuity and the height is (R2 − R1).
Pen width is the width of the Sirorekha. The rows with the maximum number of black pixels are considered to be the Sirorekha, and the pen width is estimated using the outcome of horizontal projection.
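The following sketch implements the collapsed horizontal projection and the R1/R2/R3 continuity rules described above, again assuming a 0/1 NumPy image; the function names are illustrative.

import numpy as np

def collapsed_horizontal_projection(binary_img: np.ndarray) -> np.ndarray:
    """CHP is 1 for rows containing at least one black pixel, else 0."""
    return (binary_img.sum(axis=1) > 0).astype(int)

def continuity_and_height(binary_img: np.ndarray):
    """Apply the R1/R2/R3 rules: returns (is_continuous, height)."""
    chp = collapsed_horizontal_projection(binary_img)
    rows = np.flatnonzero(chp)
    if rows.size == 0:
        return True, 0
    r1 = rows[0]
    gaps = np.flatnonzero(chp[r1:] == 0)
    if gaps.size == 0:                       # neither R2 nor R3 exists
        return True, len(chp) - r1
    r2 = r1 + gaps[0]
    after_gap = np.flatnonzero(chp[r2:] == 1)
    if after_gap.size == 0:                  # R2 exists, R3 does not
        return True, r2 - r1
    return False, r2 - r1                    # both R2 and R3 exist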
4 Image Pre-processing
In this stage, the image that contains the Devanagari text is prepared for the subsequent steps by using a Gaussian filter, which smoothens the image while preserving its edges. The resulting RGB image is then converted to grayscale
and then to a binary image. Binarization is done using the OTSU global thresholding algorithm. This binary image is used in the further segmentation process.
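A minimal OpenCV sketch of this pre-processing stage is shown below; the Gaussian kernel size is an assumption (the paper does not state it), and the output uses the 0/1 convention of the projection sketches above.

import cv2

def preprocess(image_path: str):
    """Smooth, grayscale and binarize a page image (OTSU thresholding).

    Returns a binary image with text pixels as 1 and background as 0.
    """
    img = cv2.imread(image_path)                       # BGR image
    img = cv2.GaussianBlur(img, (5, 5), 0)             # kernel size is an assumption
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    return (binary // 255).astype('uint8')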
In this phase, each line is identified and separated from the image using the horizontal projection algorithm. Using the top and bottom boundaries obtained from the horizontal projection algorithm, the lines are cropped from the original image, and each line is saved as a separate image.
In this phase, each word is identified and separated from every line obtained in the previous phase using the vertical projection algorithm. Using the left and right boundaries obtained from the vertical projection algorithm, a line can be segmented into words. This is done for every line segmented in the previous phase.
In this phase, each character is identified and separated from every word obtained in the previous phase. For this, the "Sirorekha" is identified first. To find the Sirorekha for each word, find the row with the maximum number of black pixels using the horizontal projection algorithm; this row is considered the Sirorekha. Now remove the Sirorekha and apply the vertical projection algorithm, which separates each character individually. Each character is saved as a separate image. For better segmentation, find the rows with the maximum and second-maximum numbers of black pixels and whiten all pixels in between these rows. To find the Sirorekha, the horizontal projection algorithm counts black pixels in every row only up to half the height of the image. For every segmented character, check for the presence of an upper modifier and a lower modifier and separate them from the core part of the character.
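The sketch below illustrates the Sirorekha removal step just described (scan only the top half, whiten all rows between the two strongest rows); after this, the vertical projection from the earlier sketch yields the individual characters.

import numpy as np

def remove_sirorekha(word_img: np.ndarray) -> np.ndarray:
    """Whiten the header line of a word image before character segmentation."""
    half = word_img[: word_img.shape[0] // 2]
    counts = half.sum(axis=1)
    first = int(np.argmax(counts))          # row with maximum black pixels
    counts2 = counts.copy()
    counts2[first] = -1
    second = int(np.argmax(counts2))        # row with second-maximum black pixels
    top, bottom = sorted((first, second))
    cleaned = word_img.copy()
    cleaned[top:bottom + 1, :] = 0          # whiten the Sirorekha rows
    return cleaned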
This can be done by making a vertical projection from the top of the image up to the Sirorekha. If any black pixels are present, it can be concluded that there is an upper modifier for that character image, and the image is cropped from the top down to the Sirorekha. To check for the presence of a lower modifier, find the height of each character from the Sirorekha and find the maximum height. Using the rules mentioned below, the characters are categorized into three categories.
1. If the character height is more than 80% of the maximum height, classify it into category-1.
2. If the character height is less than 80% and more than 64% of the maximum height, classify it into category-2.
3. If the character height is less than 64% of the maximum height, classify it into category-3.
To check for the possible presence of a lower modifier, find the category with the maximum number of images and compute the average height of the images in this category; the images in this category are assumed not to contain lower modifiers. The calculated average height is used as a threshold: character images from the other categories with height greater than the threshold are sent for lower modifier segmentation, where the lower modifier is separated from the core part of the character. A lower modifier is considered present only if its height is greater than one-fifth of the height of the character. After identifying and separating the lower and upper modifiers, the character image is sent for further character-level segmentation if any characters are left unsegmented due to overlapping pixels caused by the modifiers.
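A compact sketch of this category-based thresholding is given below; it takes the per-character heights (measured from the Sirorekha) and returns the indices of characters that should be sent for lower modifier segmentation. The function name is illustrative.

import numpy as np

def lower_modifier_candidates(heights):
    """Select characters for lower modifier segmentation (80% / 64% rules)."""
    heights = np.asarray(heights, dtype=float)
    max_h = heights.max()
    cats = np.where(heights > 0.8 * max_h, 1,
                    np.where(heights > 0.64 * max_h, 2, 3))
    # The category with the most images is assumed to have no lower modifiers.
    dominant = np.bincount(cats)[1:].argmax() + 1
    threshold = heights[cats == dominant].mean()
    return [i for i, (h, c) in enumerate(zip(heights, cats))
            if c != dominant and h > threshold]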
The final set of characters remaining in the core part after separating the upper and lower modifiers is checked for the presence of fused characters. To check whether a character image contains a fused character, the same method used to check for a lower modifier is employed, but using the width of the character image instead of its height. Similarly, every image is categorized and the category with the maximum number of images is found. The average width of the images in this category is estimated and used as a threshold. All images from the other two categories whose width is more than the threshold are sent for fused character segmentation.
For fused character segmentation, a column needs to be found that separates the pure form from the full form. It is known that in a fused character, the pure form always occurs on the left side and the full consonant on the right side. To find this separating column, the following steps are followed.
1. Find the vertical bar at the rightmost end of the fused character and ignore the whole part to the right of the vertical bar (including the vertical bar pixel column). The vertical bar position is now considered the extreme right of the character image. Now take the column which is pen-width columns to the left of the vertical bar, denoted C1. Find the continuity and height of the character inscribed between C1 and the column before the vertical bar. If the part of the image inscribed between these boundaries is discontinuous and its height (refer to Sect. 3.5 for the continuity and height of a character) is greater than one-third of the complete character height, then finalize that column as C1 and stop the process. Else, move C1 one column to its left and repeat the process.
2. Now find another column C2 from the left end of the character image, where the pure forms of consonants are positioned. Based on the heights of the pure forms of consonants, they are classified into two categories. If the height of the character inscribed between the leftmost column of the image (left_bound_C2) and the column at one-third of the character width (right_bound_C2) is less than or equal to 80% of the consonant height, it is classified into H1 ( , etc.); else it is classified into H2 ( , etc.). For each class, a different method is followed to find the column C2, which is compared with C1 to estimate the final segmenting column for the fused character.
3. If C1 − C2 > pen width, there are more than two characters present in the character image. Segment the image using C1 as the segmentation column and send the remaining image for further segmentation.
6 Dataset Description
The model built for recognition of the core part of a character takes an image of size 32 × 32 × 3 as input. This image is passed through several convolutional layers in such a way that the number of convolution filters (each of size 3 × 3) increases with depth. A max-pooling layer with pool size 2 × 2 is added after every few convolutional layers to reduce the spatial dimension and the number of training parameters. The obtained feature matrix is flattened, and hidden layers with the Rectified Linear Unit activation function are added as shown in Table 1; the output layer of the network has 85 nodes (i.e., the number of classes) with the "SoftMax" activation function. A batch normalization layer is also added to the network before flattening the feature matrix, and a dropout layer is added after the first dense layer to make sure that the model does not overfit. To generalize the model better, a data augmentation technique is used while training; with this technique, more images are generated on the fly by changing the shear range, brightness range, height and width shift ranges, etc.
The model is trained for 20 epochs with a batch size of 128. The optimizer used is Adam with a learning rate of 0.001, and the performance metric used is accuracy. Since the dataset used here is balanced, accuracy is considered the most suitable evaluation metric; it is the ratio of correct predictions to the total number of cases. The accuracy for both the training and validation data is above 90%, as shown in Fig. 4.
8 Unicode Encoding
The characters obtained from the segmentation part are predicted using the trained CNN model. The results obtained are mapped to their corresponding Unicode values and added together (see Fig. 5). Devanagari text uses Unicode values ranging from 0900 to 097F [6–8]. In the segmentation phase, images are named in such a way that the position of the character in a word and the position of the word in a line can be identified.
9 Results
Stepwise results for the segmentation algorithm presented in this paper are shown for a sample test image (Fig. 6).
Top and bottom boundaries are estimated using the algorithm in Sect. 5.1. Using these boundaries, the test image is cropped to separate the lines (see Fig. 7). The left and right boundaries of each word in every line are estimated as in Sect. 5.2. Using these boundaries, each word in every line is cropped out and stored (see Fig. 8). The position of the Sirorekha is estimated for every word and whitened. Figure 9a shows word #9 of Fig. 8 without the Sirorekha.
Fig. 9 a After Sirorekha removal b Separated top modifiers. c Separated lower modifiers. d Core
part characters
The left and right boundaries of every character are estimated, and the characters are cropped from their respective word images. Upper modifiers are also separated during character segmentation, and the resulting characters are checked for lower modifiers, after which further segmentation happens if any characters remain combined because of overlapping modifiers. Figure 9d depicts the final result after separating the modifiers, and Fig. 9b, c depict the separated top and lower modifiers for word #9 of Fig. 8.
Using the technique presented in Sect. 5.6, characters are selected for fused character segmentation. The final segmentation column estimated from C1 and C2 is used to crop the fused character. The left part of the character is the pure form and the right part is the full form. Figure 10 depicts the pure form and full form of the fused character from word #9 of Fig. 8.
The core-part character images are fed into the trained CNN model. The predicted class is mapped to its Unicode value, using which reconstruction of the test image can be done. For example, if images 9_4_1 and 9_5_0 from Fig. 7d are fed into the model, the outputs of the model are "ka" and "ma", respectively. The classifier outputs are mapped to the Unicode values "0915" for "ka" and "092E" for "ma" and are added together to produce the word " ". For the Unicode mapping, a .csv file is used which contains the class names and their respective Unicode values.
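A minimal sketch of this Unicode-addition step is shown below, assuming a two-column CSV of class names and hexadecimal code points (the file layout is an assumption).

import csv

def load_unicode_map(csv_path: str) -> dict:
    """Read a two-column CSV of class name -> Unicode code point (hex)."""
    with open(csv_path, newline='', encoding='utf-8') as f:
        return {row[0]: row[1] for row in csv.reader(f)}

def reconstruct_word(predicted_classes, unicode_map: dict) -> str:
    """Map predicted class names to characters and join them.

    Devanagari code points fall in the range U+0900..U+097F.
    """
    return ''.join(chr(int(unicode_map[c], 16)) for c in predicted_classes)

# Example: reconstruct_word(['ka', 'ma'], {'ka': '0915', 'ma': '092E'})
# returns the two-character Devanagari string 'कम'.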
This paper presents an approach to building a robust OCR for Devanagari text. The algorithms used for line, word, character, lower modifier and fused character segmentation in the segmentation phase are adapted from [1]. A CNN model was trained for core-part recognition using a character dataset synthesized from different Devanagari fonts. A Unicode addition approach to reconstruct the image into machine-encoded text is also exhibited. Future work can include developing datasets for top and lower modifiers and building a CNN model for these data samples. The pre-processing stage can also include a skew correction technique to adjust the orientation of the text, and object detection techniques can be employed to detect text in images containing both Devanagari text and other figures.
References
1. Bansal V, Sinha RMK (2002) Segmentation of touching and fused Devanagari characters.
Pattern Recogn 35:875–893. https://doi.org/10.1016/S0031-3203(01)00081-4
2. Vijay K, Sengar P (2010) Segmentation of printed text in devanagari script and Gurmukhi
script. Int J Comput Appl. https://doi.org/10.5120/749-1058
3. Avadesh M, Goyal N (2018) Optical character recognition for Sanskrit using convolution neural
networks. In: 2018 13th IAPR international workshop on document analysis systems (DAS),
Vienna, 2018, pp 447–452
4. Sultana F, Sufian A, Dutta P (2018) Image classification using CNN
5. https://fonts.google.com/?subset=devanagari
6. Chandrakar R (2004) Unicode as a multilingual standard with reference to Indian languages.
Electr Libr 22:422–424. https://doi.org/10.1108/02640470410561947
7. https://unicode.org/charts/PDF/U0900.pdf
8. Nair J, Sadasivan A (2019) A Roman to Devanagari back-transliteration algorithm based on
Harvard-Kyoto convention. In: 2019 IEEE 5th international conference for convergence in
technology (I2CT), Bombay, India, 2019, pp 1–6. doi: https://doi.org/10.1109/I2CT45611.
2019.9033576
9. Bansal V, Sinha R (2001) A Complete OCR for printed hindi text in devanagari script, pp
800–804. https://doi.org/10.1109/ICDAR.2001.953898.
10. Bag S, Krishna A. (2015) Character segmentation of Hindi unconstrained handwritten words.
https://doi.org/10.1007/978-3-319-26145-4
11. H. P. M. (2014) Optical character recognition for printed Malayalam documents based on
SVD and Euclidean distance measurement. In: International conference on signal and speech
processing. ICSSP 2014
12. Jyothi J, Manjusha K, Kumar MA, Soman KP (2015) Innovative feature sets for machine
learning based Telugu character recognition. Indian J Sci Technol 8(24)
13. Neena A, Geetha M (2018) Image classification using an ensemble-based deep CNN. Adv Intel
Syst Comput 709:445–456
14. Shah P, Bakrola V, Pati S (2018) Optimal approach for image recognition using deep convo-
lutional architecture. In: Sa P, Bakshi S, Hatzilygeroudis I, Sahoo M (eds) Recent findings
in intelligent computing techniques. Advances in intelligent systems and computing, vol 709.
Springer, Singapore
15. Dineshkumar R, Suganthi J (2015) Sanskrit character recognition system using neural network.
Indian J Sci Technol 8:65. https://doi.org/10.17485/ijst/2015/v8i1/52878
16. Jawahar CV, Kumar MNSSK, Kiran SS (2003) A bilingual OCR for Hindi-Telugu documents
and its applications. 1:408–412. https://doi.org/10.1109/IC-DAR.2003.1227699
17. Sankaran N, Jawahar CV (2012) Recognition of printed Devanagari text using BLSTM
neural network. In: Proceedings of the 21st international conference on pattern recognition
(ICPR2012), pp 322–325
A Machine Learning-Based
Multi-feature Extraction Method
for Leather Defect Classification
multi-layer perceptron neural network (MLP). Experimental results show that the
highest classification accuracy (89.75%) is achieved using GLCM along with Hu
moments, HSV colour features and random forest classifier.
1 Introduction
The leather industry is one of the most ancient industries in the world, wherein the by-products of slaughterhouses are utilized and the raw materials are transformed into various types of leather and high-value products. Leather has unique properties like breathability, feel, comfort, durability and elegance. The leather industry holds a prominent place in the Indian economy: it is among the top ten foreign exchange earners for the country, and India is the second-largest producer of footwear and leather garments in the world. From its own raw material sources, about 3 billion ft² of leather is produced annually, as shown in Fig. 1 [1].
The raw materials of the leather industry, hides and skins, suffer from various defects that downgrade the quality of leather products. Leather defects can be classified into ante-mortem and post-mortem defects: defects caused while the animal is alive are ante-mortem, and defects caused after the death of the animal are post-mortem. Other kinds of defects are caused during leather processing. Brand marks, tick marks, pox marks, insect bites, wounds, scratches, growth marks, flay cuts, veininess, wrinkles, fat folds, salt stains, lime blast, chrome or dye patches, drawn grain, open cuts, pinhole damage, etc., are the common leather defects.
Quality inspection is an important step in assessing the useful area of leathers. Manual inspection is currently used in leather processing industries; it requires expert knowledge and is highly subjective, tedious and time-consuming, whereas automatic detection is reliable, consistent and accurate and avoids disputes between buyer and seller. Leather pieces are graded based on the cutting value, and the price varies according to the size and location of surface defects. Grading has to be done carefully since the price depends upon the quality of the leather.
Fig. 1 a Profile of leather product manufacturing sector; b Country-wise share of leather and leather
products
Digital image processing is effectively used to accurately identify defects and classify the quality of the leather. The texture of leather is unique due to its hair-pore arrangements and natural grain indentations, so separating the defect region from the background is a challenging task [2–4]. Leather processing from raw skin to finished leather comprises four major unit operations, pre-tanning, tanning, post-tanning and finishing, as illustrated in Fig. 2a. A few defective leather images and good leather images are shown in Fig. 2b, c, respectively.
2 Related Work
Various works have been carried out in recent years on classifying leather defects from leather surfaces. It is a challenging task due to the inherent texture variations, which vary from piece to piece. Several studies have developed defect classification models based on computer vision algorithms [20]. A lot of variation such as grain surface indentations, colour, texture and brightness exists within the leather substrate, and strong variation within the defective region may lead to inaccuracy. Moreover, the nature of leather defect features, such as shape, size and orientation, and the varied distribution of defects also increase the complexity of the problem. The various types of leather defects and the image sizes used by researchers are summarized in Table 1.
Table 1 Various leather defects and image sizes used for analysis (— indicates not reported in the cited work)

Paper | Total no. of images | Image size | No. of defects | Types of defects
[5] | 140 | 200 × 300 | 5 | Lines, holes, stains, wears and knots
[6] | 387 | 2.5 × 0.5 m² | 8 | Scars, mite nests, warts, open fissures, healed scars, holes, pinholes and fat folds
[7] | 30 | — | 8 | Scars, mite nests, warts, open fissures, healed scars, holes, pinholes and fat folds
[8] | — | — | 8 | Background, no-defect, hot-iron marks, ticks, open cuts, closed cuts, scabies and botfly larvae
[9] | 2000 | 64 × 64 | 4 | Brand marks (made from a hot iron), tick marks, cuts and scabies
[10] | 80 | 2048 × 2048 | 12 | —
[11] | — | 256 × 256 | — | Barbed wires, shingles
[12] | 15 | 600 × 450 | — | Tick marks, brand marks from hot iron, cuts and scabies
[4] | 700 | 256 × 256 | 20 | Open defects, closed cuts, ticks, brand marks, thorn marks, scratches, bacterial damage, mechanical damage, fungal attack, pox marks, growth marks, insect bites, lime blast, wrinkles, pipiness, chrome patch, grain cracks, dye patch, fatty spew, finish peel
In this study, an attempt has been made to extract GLCM, HOG, Hu, HSV and multi-variate features. The extracted features were given as training input and classified using classifiers such as LR, LDA, KNN, CART, RF, SVM and MLP. The overall flow structure of the proposed leather defect classification system is given in Fig. 3.
Leather images are captured using the image acquisition system shown in Fig. 4. The system uses a Sony industrial CCD colour camera with a USB 3.0 interface mounted on a moving carriage, and the images are captured as the camera is moved horizontally from left to right. The captured images have a resolution of 1200 × 1600 pixels. The dataset used in the proposed work includes 577 defective and 100 non-defective images, with an 80:20 split into training and testing sets.
Leather has large variations in its grey-level intensity due to its inherent texture, so a Wiener filter is used to preprocess the images. The Wiener filter estimates the target signal from the observed process by linear time-invariant filtering, assuming a known stationary signal and additive noise. The preprocessing is based on minimizing the mean square error between the estimated signal and the original signal W(ω). Lim and Oppenheim defined the Wiener filter [13] as

S(ω) = Wx(ω) / (Wx(ω) + Wy(ω))    (1)

where Wx(ω) and Wy(ω) represent the noise-free signal and the noisy background signal, which are assumed stationary and uncorrelated. After computing the transfer function, the preprocessed image can be computed as

W(ω) = X(ω)S(ω)    (2)
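In practice, a local (adaptive) Wiener filter is commonly applied directly to the image; the sketch below uses SciPy's implementation as a hedged stand-in for the spectral formulation above. The window size is an assumption, as the paper does not state it.

import numpy as np
from scipy.signal import wiener

def denoise_leather(image: np.ndarray, window: int = 5) -> np.ndarray:
    """Suppress grain-texture noise with a local Wiener filter.

    scipy.signal.wiener estimates the local mean and variance in a
    window x window neighbourhood and attenuates pixels whose local
    variance is close to the estimated noise power.
    """
    return wiener(image.astype(float), (window, window))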
In this process, a feature set is found that can accurately distinguish between defective and non-defective regions. Texture-based features refer to the local intensity variations from pixel to pixel in a region of a digital image. Local features of an image identify the relationship between the spatial distribution and the pixel grey values. Based on the pixel level, statistical methods are categorized as shown in Fig. 5.
(Fig. 5 categorizes statistical texture methods into first-order, second-order and higher-order statistics.)
The grey-level co-occurrence matrix (GLCM) is derived from pairs of adjacent neighbouring pixels of the image. The function f(i, j | d, θ) describes the likelihood of a pair of grey levels occurring at distance d in direction θ, so the GLCM is calculated based on two terms: the neighbour pixel displacement d and the pixel orientation θ. The GLCM can reveal certain properties about the spatial distribution of the grey levels in the texture image; for example, if most of the entries in the GLCM are concentrated along the diagonal, the texture is coarse. Haralick et al. [14] originally analysed 14 parameters, including autocorrelation, contrast, correlation, cluster prominence, cluster shade, dissimilarity, energy (uniformity), entropy, homogeneity, maximum probability, sum of squares (variance), difference variance, information measure of correlation, inverse difference normalized (INN) and inverse difference moment normalized.
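A hedged scikit-image sketch of GLCM feature extraction is shown below. Note that scikit-image exposes only a subset of the Haralick measures directly; the distances and angles chosen here are common defaults and are assumptions, since the paper does not report them.

import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_img: np.ndarray,
                  distances=(1,), angles=(0, np.pi/4, np.pi/2, 3*np.pi/4)):
    """Compute a subset of GLCM statistics (gray_img must be uint8)."""
    glcm = graycomatrix(gray_img, distances=distances, angles=angles,
                        levels=256, symmetric=True, normed=True)
    props = ('contrast', 'dissimilarity', 'homogeneity',
             'energy', 'correlation')
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])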
The HOG feature descriptor uses the distribution (histograms) of gradient directions (oriented gradients) as the major characteristics. The gradients (x and y derivatives) of an image are useful as they carry significant information around corners and edges, where the gradient magnitudes are large. HOG calculates the gradient images by filtering the greyscale image with the filter kernels Dx = [−1 0 1] and Dy = [−1 0 1]ᵀ. The gradient magnitude and orientation are found using

g = √(gx² + gy²)    (3)

θ = arctan(gy / gx)    (4)

where gx and gy are the gradient components, and g and θ are the gradient magnitude and orientation.
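A minimal sketch of HOG extraction and of Eqs. (3)–(4) is given below; the cell, block and orientation-bin settings are the usual defaults and are assumptions, as the paper does not specify them.

import numpy as np
from skimage.feature import hog

def hog_features(gray_img: np.ndarray) -> np.ndarray:
    """Histogram of oriented gradients descriptor for one leather image."""
    return hog(gray_img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

def gradient_magnitude_orientation(gray_img: np.ndarray):
    """Eqs. (3)-(4): magnitude and orientation from central-difference gradients."""
    gx = np.gradient(gray_img.astype(float), axis=1)
    gy = np.gradient(gray_img.astype(float), axis=0)
    return np.hypot(gx, gy), np.arctan2(gy, gx)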
Machine learning algorithms are used to extract class information from a large set of data. Two types of classification can be done using machine learning techniques: (a) supervised and (b) unsupervised classification. Supervised classification uses user-provided information associated with each class during the classification process, whereas unsupervised classification finds the classes without human intervention.
In linear regression, the output is modelled as

x = θ1 + θ2 · Y    (5)

where θ1 and θ2 are the intercept and the coefficient of Y. The cost function of linear regression is the error between the predicted (P) and true (T) values, and gradient descent is used to reduce the cost function and update θ1 and θ2:

C = (1/n) Σᵢ₌₁ⁿ (Pᵢ − Tᵢ)²    (6)
LDA is similar to the analysis of variance (ANOVA) [15]; it tries to express the dependent variable as a combination of other features, using continuous independent variables and a categorical dependent variable. It can also be used for dimensionality reduction. LDA handles data with unequal within-class frequencies, and its performance can be evaluated using randomly generated test data [16]. LDA can be used for classifying images, speech, etc. Class-dependent and class-independent transformations are the two approaches of LDA: the class-dependent transformation maximizes the ratio of between-class variance to within-class variance, while the class-independent transformation maximizes the ratio of overall variance to within-class variance. The mean of each dataset and of the entire dataset are computed as given in Eq. 7, where β refers to the mean value and C refers to the class probabilities of the data:

βx = C1β1 + C2β2    (7)
K-nearest neighbour (KNN) is one of the oldest and simplest methods for classification, and it achieves competitive results in most domains when trained with proper knowledge. KNN classifies unlabelled data by the majority label among its k nearest neighbours. When prior information is not available, distance metrics such as the Euclidean distance are used to measure the similarities and dissimilarities between data points; however, the distance metric for kNN has to be chosen based on the problem statement [5–7]. The distance can be computed as in Eq. 12 and the cost function as in Eq. 13.
D(xᵢ, xⱼ) = ‖L(xᵢ − xⱼ)‖²    (12)

ε(L) = Σᵢⱼ ηᵢⱼ ‖L(xᵢ − xⱼ)‖² + c Σᵢⱼₗ ηᵢⱼ (1 − yᵢₗ) [1 + ‖L(xᵢ − xⱼ)‖² − ‖L(xᵢ − xₗ)‖²]    (13)
Classification and regression tree (CART) is a predictive model that analyses how an outcome value can be predicted from the other given data. It is a decision tree model in which each fork is a split on a predictor variable and each node carries a prediction for the outcome variable.
Breiman proposed the random forest (RF) classifier based on multiple decision trees. Every tree can be considered a single classifier, and the trees are combined to produce the final classification. Random forest splits each node randomly by means of selected features. The error is estimated on the out-of-bag portion of every tree, and the importance of each feature variable is computed by repeating this computation after permuting that feature. Splitting stops when the standard deviation of the differences for a feature variable equals 0 [17]. The node impurity in the random forest classifier is measured using the Gini index, where Gini(T) is defined as

Gini(T) = 1 − Σⱼ₌₁ⁿ Pⱼ²    (14)
SVMs were initially developed for classification and later extended to regression. An SVM is a binary classifier, producing either a positive or a negative result; it was later improved by combining multiple binary classifiers into a multi-class classifier. In addition, the SVM can handle nonlinear cases by mapping the input features to a higher-dimensional feature space, where a maximal-margin hyperplane makes a linear classification equivalent to a nonlinear one in the original input space. Better accuracy can be achieved by choosing suitable kernel parameters and hyperparameters; the kernel plays a major role in optimizing the data and identifying the best model [18].
A multilayer feed-forward neural network consists of an input layer, one or more hidden layers and an output layer. During the feed-forward phase, each input node receives and transmits the input signal to each of the hidden nodes. Each hidden node calculates its activation function and transmits its signal to the output nodes, and each output unit calculates the activation function for a given input pattern as the network response. During backpropagation, each output node compares its activation with the desired response to produce an error signal for the specific input pattern; this is repeated for all samples in each training epoch. The error signal at each output unit is distributed back to all units in the previous layer, and the weights and biases are updated to minimize the error in each training epoch. The total squared error of the output is minimized by the gradient descent method known as backpropagation [19]. The error function, the gradient function and the weight update are given in Eqs. 15–17, where J is the Jacobian matrix:

H = JᵀJ    (15)

g = Jᵀe    (16)

ω(n + 1) = ω(n) − [JᵀJ + μI]⁻¹ Jᵀe    (17)
In this research work, multi-feature and multi-classifier analyses were used for leather defect classification. The leather database consists of good (non-defective) leather images and defective leather images. The acquired image size was 1200 × 1600, resized to 400 × 600 without losing the major field of view. Out of 577 defective and 100 non-defective crust leather images, 542 sample images were used for training and the remaining 135 samples for testing. The models were trained on a computer with a 2.80 GHz Intel Core i7 CPU using Python SciPy.
Leather images were initially preprocessed using the Wiener filter to smooth the intrinsic background noise, and from the preprocessed data, GLCM and HOG features were extracted; Hu moments and HSV colour features were also extracted. The extracted GLCM and HOG features were classified using the machine learning algorithms LR, LDA, KNN, CART, RF, SVM and MLP. The parameters of the classifiers were tuned by trial and error. Table 2 shows the mean and standard deviation of the classification accuracy results for the GLCM (Fea1) and HOG (Fea2) feature extraction methods using the above seven classifiers with tenfold cross-validation.
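As a hedged sketch of this evaluation protocol, the snippet below runs tenfold cross-validation of the seven classifiers on one feature matrix with scikit-learn. Hyperparameters are library defaults (the paper tuned them by trial and error), and LR is treated here as logistic regression, which is the usual choice for classification.

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

def evaluate_classifiers(X, y, folds: int = 10):
    """Tenfold cross-validation accuracy of the seven classifiers."""
    models = {
        'LR': LogisticRegression(max_iter=1000),
        'LDA': LinearDiscriminantAnalysis(),
        'KNN': KNeighborsClassifier(),
        'CART': DecisionTreeClassifier(),
        'RF': RandomForestClassifier(n_estimators=100),
        'SVM': SVC(),
        'MLP': MLPClassifier(max_iter=500),
    }
    return {name: cross_val_score(model, X, y, cv=folds, scoring='accuracy')
            for name, model in models.items()}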
Using the GLCM feature extraction method, the highest classification accuracy
was obtained for random forest, followed by KNN, SVM, CART, MLP, LDA and
LR. Similarly, RF obtained the highest classification accuracy using the HOG feature
extraction method followed by MLP, SVM, LR, CART, LDA and KNN. It can be
observed that the random forest outperforms the other classifiers for both the feature
extraction methods.
Subsequently, GLCM features were combined with Hu moments and HSV colour features (Fea3), and finally HOG, GLCM, Hu and HSV features were combined (Fea4); these multi-features were given as input to the seven classifiers. Table 3 shows the mean and standard deviation of the classification accuracy results for the multi-feature sets. As can be seen from Table 3, random forest was found to be the best classifier for the multi-feature vectors as well.
Figure 6 illustrates the distribution of the classification accuracy data for the seven classifiers trained using the four feature extraction methods. Leather being a natural material with inherent texture variations, non-defective images with a bold grain or other prominent variations were often misclassified as defective. These misclassifications can be seen as outliers in Fig. 6 for all four feature extraction methods trained with the seven classifiers. Random forest achieved the highest classification accuracy (89.75%) with the Fea3 (GLCM + Hu + HSV) feature vector (Fig. 7c). Nevertheless, the classification success rate of the GLCM (Fea1) feature vector was also remarkable (88.83%). The experimental results show that GLCM texture features trained with the random forest classifier can be successfully used for leather defect classification. Furthermore, the multiple feature set fusing GLCM features with invariant Hu moments and HSV colour features, trained with random forest, improves the classification accuracy.
4 Conclusion
Acknowledgements The authors gratefully acknowledge the Ministry of Electronics and Informa-
tion Technology (MeitY), Government of India for funding this research and Director, CSIR-CLRI
for his support during the project (A/2020/LPT/GAP1811).
References
3. Jawahar M, Vani K (2019) Machine vision inspection system for detection of leather surface defects. J Am Leather Chemists Assoc 114(1)
4. Jawahar M, Chandra Babu NK, Vani K (2014) Leather texture classification using wavelet
feature extraction technique. In: 2014 IEEE international conference on computational
intelligence and computing research. IEEE
5. Kwak C, Ventura JA, Tofang-Sazi K (2001) Automated defect inspection and classification of
leather fabric. Intell Data Anal 5(4):355–370
6. He F, Wang W, Chen Z (2006) Automatic defects detection based on adaptive wavelet packets for leather manufacture. In: Technology and innovation conference, 2006. ITIC 2006. International, pp 2024–2027. IET
7. Pölzleitner W, Niel A (1994) Automatic inspection of leather surfaces. In: Proceedings,
Machine vision applications, architectures, and systems integration III, vol 2347
8. Amorim WP, Pistori H, Jacinto MAC, Sudeste EP. A comparative analysis of attribute reduction algorithms applied to wet-blue leather defects classification
9. Pistori H, Amorim WP, Martins PS, Pereira MC, Pereira MA, Jacinto MAC (2006) Defect detection in raw hide and wet blue leather. In: CompIMAGE, pp 355–360
10. Yeh C, Perng DB (2001) Establishing a demerit count reference standard for the classification
and grading of leather hides. Int J Adv Manuf 18:731–738
11. Peters S, Koenig A (2007) A hybrid texture analysis system based on non-linear & oriented kernels, particle swarm optimization, and kNN vs. support vector machines. In: 7th international conference on hybrid intelligent systems, 2007. HIS 2007, pp 326–331. IEEE
12. Viana R, Rodrigues RB, Alvarez MA, Pistori H (2007) SVM with stochastic parameter selection for bovine leather defect classification. In: Pacific-Rim symposium on image and video technology, pp 600–612. Springer, Berlin
13. Bahoura M, Rouat J (2001) Wavelet speech enhancement based on the teager energy operator.
IEEE Signal Process Lett 8(1):10–12
14. Haralick RM, Shanmugam K, Dinstein IH (1973) Textural features for image classification.
IEEE Trans Syst Man Cybern 6:610–621
15. Balakrishnama S, Ganapathiraju A (1998) Linear discriminant analysis—a brief tutorial. Inst Signal Inf Process 18
16. Utpal B, Dev Choudhury R (2020) Smartphone image based digital chlorophyll meter to estimate the value of citrus leaves chlorophyll using Linear Regression, LMBP-ANN and SCGBP-ANN. J King Saud Univ Comput Inf Sci
17. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
18. Sharon JJ, Jani Anbarasi L, Edwin Raj B (2018) DPSO-FCM based segmentation and classification of DCM and HCM heart diseases. In: 2018 Fifth HCT information technology trends (ITT). IEEE
19. Haykin S (1994) Neural networks: a comprehensive foundation. Prentice Hall PTR
20. Sahin EK, Colkesen I, Kavzoglu T (2020) A comparative assessment of canonical correlation forest, random forest, rotation forest and logistic regression methods for landslide susceptibility mapping. Geocarto Int 35(4):341–363
Multiple Sclerosis Disorder Detection
Through Faster Region-Based
Convolutional Neural Networks
Abstract Multiple sclerosis is a leading brain disorder that highly affects the normal functions of the human body. Due to this disorder, the protective coverings of neuron cells get damaged, which disrupts the information flow inside the brain and to other body parts. Early detection of multiple sclerosis helps healthcare practitioners to suggest a suitable treatment for the disease. The detection of multiple sclerosis is a challenging task; many approaches have been proposed by researchers and academicians for accurately detecting brain lesions, yet precisely detecting them remains a big challenge. Due to recent innovations in the fields of image processing and computer vision, healthcare practitioners are using advanced diagnosis systems for the prediction of disorders and diseases. The magnetic resonance imaging approach is used by neurosurgeons and neurophysicians for the detection of various brain lesions, and computer vision approaches are playing a major role in the automatic detection of various disorders. In this research paper, a faster region-based convolutional neural network approach based on computer vision and deep learning, using transfer learning, is proposed for the detection of multiple sclerosis as a brain disorder. The proposed approach detects the damaged area inside the brain with high precision and accuracy; the proposed model detects multiple sclerosis brain lesions with 99.9% accuracy. Three DAGNetworks are used for training: Alexnet, Resnet18 and Resnet50. Compared to the Alexnet and Resnet18 deep networks, the Resnet50 pre-trained network performed best, with higher detection accuracy.
1 Introduction
Human beings have been fighting against various diseases since the beginnings of human civilization, and many types of healthcare systems have been developed and improved over time as per human requirements. Various types of brain-related diseases have been found and investigated by healthcare scientists. Multiple sclerosis is a neurological brain disorder that causes disability in men and women of every age [1]. The symptoms of multiple sclerosis were first defined by Jean-Martin Charcot, a French professor of anatomical pathology [2]. This disorder highly affects parts of the central nervous system, including the main parts of the brain such as the spinal cord, cerebrum, cerebellum and optic nerves [3]. A recent study by the National Multiple Sclerosis Society estimates that more than one million people are living with multiple sclerosis brain disorders in the USA [4], and the Society has also found that more than 2.3 million people are living with multiple sclerosis across the world. Researchers have found that the ratio of women suffering from multiple sclerosis is higher than that of men [5].
The main cause of multiple sclerosis is damage to the myelin sheath, an insulating cover around the nerves [6]. Multiple sclerosis lesions mostly affect the white matter or gray matter inside the brain [7]. Magnetic resonance images (MRIs) have become the most important source for disease diagnosis. Various MRI modalities, such as axial, coronal and sagittal views, are used by healthcare practitioners for reference. With MRI, medical experts can detect brain disorders and control the progression of the disease with proper treatment. Magnetic resonance images clearly show disease activity and active lesions; neurologists compare the scanned images based on the distribution of white and dark areas to find the damaged and healthy tissues [8]. MRI scans are very useful for detecting various brain tumors, traumatic brain injuries, Alzheimer's disease [9, 10], Parkinson's disease, brain strokes, dementia, brain infections and multiple sclerosis brain lesions [8]. The commonly used MRI imaging sequences are T1-weighted, T2-weighted and fluid-attenuated inversion recovery (FLAIR). The contrast and brightness of the images are controlled through the echo time (TE) and repetition time (TR). Both T1-weighted and T2-weighted MRI images are used by neurological experts for the diagnosis of the disease [8]. In this research paper, T2-weighted MRI images of multiple sclerosis are downloaded from [11–14].
The recent advancements in the fields of artificial intelligence and machine learning have opened the door for healthcare experts to use automatic disease diagnosis tools and systems to find out the nature and effects of various diseases on human beings [15]. The deep learning approach, a subfield of machine learning, is playing a major role in the domain of medical image analytics [16, 17]; through deep learning, large volumes of medical imaging records can be explored and analyzed. The high computing power of graphics processing units (GPUs) manufactured by various leading companies is also playing a dramatic role in the field of machine learning, and GPUs are used as the main hardware for the implementation of deep learning algorithms. The object detection through deep
2 Related Work
Academicians and researchers have proposed various object recognition and object detection approaches using convolutional neural networks [25, 26] with transfer learning. Pre-trained neural networks, designed and trained on large image datasets, are becoming the most suitable networks for classification and object detection tasks, since training a large network from scratch is a very expensive and tedious task. Through the literature review, it is found that various approaches to object detection have been proposed by academicians.
Shaoqing Ren et al. proposed an approach for real-time object detection with region proposal networks. They designed a region proposal network that shares the convolutional features with the detection network. The region proposal network is a fully convolutional network that predicts object bounds and objectness scores at each position. The authors also explored object detection with pre-trained networks [23].
Ross Girshick proposed Fast R-CNN, a region-based convolutional neural network for object detection. The author observed that, compared to image classification, object detection is a more challenging task that requires more complex methods, and proposed an algorithm based on single-stage training that jointly learns to classify object proposals and refine their spatial locations; the VGG16 deep network was trained using this method [24].
R. Ezhilarasi and P. Varalakshmi proposed a model for brain tumor detection using the faster R-CNN approach. The AlexNet pre-trained network was used as the base model for classifying various tumors, together with a region proposal network, through the faster R-CNN approach. Transfer learning was used during the training of the network, and the faster R-CNN was used to detect brain tumors by drawing a bounding box around the tumor area together with the tumor type [27].
Ercan Avsar and Kerem Salcin proposed an approach for the classification and detection of brain tumors from MRI images through faster R-CNN. The authors applied the faster R-CNN approach to brain MRI images to detect and locate the tumor area, and stated that the approach used for detection and classification is more efficient and accurate than the simple R-CNN and fast R-CNN methods. They achieved a 91.66% classification accuracy [28].
3 Research Methodology
This part of the research paper explains the research methodology used to detect multiple sclerosis brain lesions. The multiple sclerosis detection approach is carried out systematically, from data collection to pre-processing of the image datasets and finally implementing the proposed approach. The most important step of the research is the selection of a suitable dataset for implementing the model, as the performance of the model largely depends on the quality of the image datasets. After the selection of suitable datasets and pre-processing of the MRI images, the selection of the convolutional neural network architecture is an essential step. The research methodology adopted for the proposed research is based on labeling the datasets with ground truth values, which is a very important part of the methodology. After labeling the image dataset, four pre-trained DAGNetworks are used to train the model and extract features from the image datasets. The region proposal network is trained with the features extracted from the pre-trained networks, and different training parameters are selected to train the proposed model.
The multiple sclerosis MRI images are collected online from [11, 13, 14, 29] and contain T2-weighted MRI images of 38 patients in TIFF and BMP file formats. These images are a collection of first and second examinations, taken at the very beginning and after a 6–12 month interval. The total number of MRI images is 718, and some images of the FLAIR modality are also included. The dataset is prepared with the help of images downloaded online as well as images collected from the above-cited sources.
The image datasets originally collected from the sources were of different formats. All the images are converted to the PNG file format with size 512 × 512. After conversion to PNG, an image datastore is created using MATLAB image processing tools, and all the MRI images are labeled using the MATLAB image labeler. A ground truth datastore is created with the label "multiple sclerosis." A few sample images are shown in Fig. 1.
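The authors performed this data preparation (and the ground-truth labeling) in MATLAB; as a hedged Python equivalent of the format-conversion and resizing step only, a minimal sketch is shown below.

from pathlib import Path
from PIL import Image

def convert_to_png(src_dir: str, dst_dir: str, size=(512, 512)) -> None:
    """Convert TIFF/BMP MRI slices to 512 x 512 PNG files."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in Path(src_dir).glob('*'):
        if path.suffix.lower() in {'.tif', '.tiff', '.bmp'}:
            img = Image.open(path).resize(size)   # first frame for multi-page TIFFs
            img.save(out / (path.stem + '.png'))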
Deep learning is a very powerful machine learning approach that automatically extracts image features through learnable filter weights [21]. The faster region-based convolutional neural network approach proposed by Ren et al. is widely used for object detection and classification [23]. The faster R-CNN approach is
based on a pre-trained convolutional neural network and a region proposal network (RPN) [23]. In this research work, the Alexnet, Resnet18, Googlenet and Resnet50 pre-trained deep networks are used as the base models with the faster R-CNN object detection approach, and the base models are trained with ground-truth-labeled images. AlexNet is a pre-trained network with 25 layers, having five convolutional layers and three fully connected layers. The second pre-trained DAGNetwork used for training is Resnet18, a deep network with 71 layers whose output layer is named "ClassificationLayer_predictions" [21]. The third pre-trained DAGNetwork used for training is Resnet50. All the above-mentioned pre-trained networks are trained with ground truth labels to extract features from the labeled images, after which two subnetworks are used. The region proposal network (RPN) is a subnetwork used after the feature extraction process; it is trained to generate object proposals [21], which are the areas inside the image where the object of interest may exist. The next subnetwork is trained to predict the actual class of each object proposal [21]. The region proposal network is a kind of convolutional neural network consisting of convolutional layers and a proposal layer.
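The authors implement this pipeline in MATLAB with a faster R-CNN object detector training function; the sketch below is a hedged PyTorch analogue (not the authors' implementation), using the ResNet50-FPN faster R-CNN available in recent torchvision releases and replacing its detection head for a single "multiple sclerosis" class plus background.

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_ms_detector(num_classes: int = 2):
    """Faster R-CNN with a ResNet50-FPN backbone, adapted to detect one
    lesion class plus background; a stand-in for the MATLAB pipeline."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights='DEFAULT')
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

# Training-step sketch: images is a list of 3xHxW tensors, targets a list of
# dicts with 'boxes' (N x 4) and 'labels' (N,) from the ground truth store.
# losses = model(images, targets); sum(losses.values()).backward()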
The bounding boxes with detection probabilities are drawn within the image, showing the region of interest (ROI), through the evaluation of an object detector trained using the region proposal network. The performance of the model is measured through the accuracy of detecting the infected part within the image by drawing a bounding box around it. The rectified linear unit (ReLU) activation function, max(0, w · x), where w is the learnable weight and x is the input values in the form of the image matrix, is applied after each convolution operation to retain only positive values; it is the most commonly used activation function in deep learning algorithms. The other performance measurements for accurately identifying the information are the precision and recall values obtained in each testing step. Precision indicates the positive predictive value and refers to the positively predicted values divided by the sum of true positive and false positive predictions. It can be written as

Precision = True Positive / (True Positive + False Positive)    (2)
Recall represents the positively predicted values as a fraction of the sum of true positive and false negative predictions. It can be written using Formula 3 as given below:

Recall = True Positive / (True Positive + False Negative)    (3)
All the steps described in Sect. 4 are repeated to train the AlexNet, Resnet18, Googlenet and Resnet50 pre-trained deep networks.
The Resnet18 pre-trained network is trained using the train faster R-CNN object
detection function. The training is completed in 4 h, 3 min, and 12 s. Through the
training process, a training table is generated which consists of training epochs,
iteration required for training, time elapsed, mini-batch loss, mini-batch accuracy,
mini-batch root mean squared errors, region proposal network mini-batch accuracy,
region proposal network mini-batch root mean squared errors, and base learning rate.
Figure 8 displays a graph plotted between precision and recall. The precision and recall are calculated from the true positive, false positive, and false negative detection counts using Formulas (2) and (3). The Resnet18 network has higher precision and accuracy as compared to Alexnet.
Figure 9 depicts the mini-batch accuracy and mini-batch loss during the training period of the Resnet18 deep network.
Figure 10 depicts the mini-batch root mean squared error and the region proposal network mini-batch root mean squared error; both decrease over the course of training.
The region proposal network mini-batch accuracy is observed to be higher than the mini-batch accuracy during the training period.
The multiple sclerosis detector trained with the Resnet18 pre-trained network is tested on 116 images of multiple sclerosis; a few of these 116 images are shown in Fig. 12, with the detection accuracy printed inside each image in the form of a detection probability.
The Resnet50 pre-trained network is trained using the train faster R-CNN object
detection function on the same hardware platform as used for training the Alexnet
and Resnet18. In this proposed research, the DAGNetwork [21] is retrained on the
grayscale images. Through the training process, a training table is generated which
consists of the same fields as the table created through the Resnet18 training process
[21]. Figure 13 depicts the graph plotted between precision and recall.
Figure 14 displays the mini-batch accuracy and mini-batch loss of the pre-trained deep network.
Figure 15 depicts the mini-batch RMSE and region proposal network mini-batch
RMSE.
There is a very small variation in the values of mini-batch accuracy and RPN
mini-batch accuracy as can be seen from Fig. 16.
The disorder detection accuracy of the multiple sclerosis detector trained with Resnet50 is comparatively higher than that of the Alexnet and Resnet18 pre-trained deep networks. Thirty-six images with the detection probability printed inside them are shown in Fig. 17.
The values of precision and recall of all three networks are generated during the
training process of the pre-trained networks. The values of precision and recall are
generated through each epoch and the average of all values is calculated. A table for
precision and recall is created during the training period of each pre-trained network.
5 Conclusion
This research paper explored transfer learning approaches using pre-trained deep networks to provide more accurate results for the detection of brain disorder. The pre-trained DAGNetworks are trained on grayscale images, and higher brain disorder detection accuracy is achieved by comparing the performance of three deep networks. The performance of all three pre-trained networks is compared based on precision, recall, and detection accuracy with bounding boxes. The Resnet50 deep network has a higher precision value as compared to the Alexnet and Resnet18 networks. A detection accuracy of 99.9% is achieved for multiple sclerosis brain disorder detection. The model can be used to detect the brain disorder in real-life MRI images of multiple sclerosis. As compared to other models proposed by researchers, our model for brain disorder detection has higher detection accuracy. As a future scope, the research work in the domain of medical image processing applications can be explored.

Table 1 Performance comparisons

                     Alexnet    Resnet18   Resnet50
Average precision    0.928418   0.966239   0.977778
Average recall       0.47114    0.480835   0.482276
References
19. Krizhevsky A, Sutskever I, Hinton G (2012) ImageNet classification with deep convolutional neural networks. In: Proceedings of advances in neural information processing systems 25, pp 1090–1098
20. Ram S, Gupta S, Agrawal B (2018) Devanagari character recognition model using deep convo-
lution neural networks. J. Stat. Manag. Syst. 21(4):593–599. https://doi.org/10.1080/09720510.
2018.1471264
21. MATLAB R2020a, The MathWorks, Inc., Natick, Massachusetts, United States
22. Ettinger GJ, Grimson WEL, Lozano-Perez T, Wells WM, White SJ, Kikinis R (1994) Automatic
registration for multiple sclerosis change detection. In: Proceedings of IEEE workshop on
biomedical image analysis, Seattle, WA, USA, pp 297–306. https://doi.org/10.1109/BIA.1994.
315885
23. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with
region proposal networks. IEEE Trans Patt Anal Mach Intell 39(6):1137–1149
24. Girshick R (2015) Fast R-CNN. In: Proceedings of the 2015 IEEE international conference on computer vision, Santiago, Chile, Dec 2015, pp 1440–1448
25. Solanki D, Ram S. Object detection and classification through deep learning approaches
26. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object
detection and semantic segmentation. In: Proceedings of the 2014 IEEE conference on computer
vision and pattern recognition, Columbus, OH, June 2014, pp 580–587
27. Ezhilarasi R, Varalakshmi P (2018) Tumor detection in the brain using faster R-CNN. In:
Proceedings of the second international conference on I-SMAC
28. Avsar E, Salcin K (2019) Detection and classification of brain tumors from MRI images using faster R-CNN. Tehnički Glasnik 13(4):337–342
29. Loizou CP, Kyriacou EC, Seimenis I, Pantziaris M, Petroudi S, Karaolis M, Pattichis CS (2013)
Brain white matter lesion classification in multiple sclerosis subjects for the prognosis of future
disability. Intell Decis Technol J (IDT) 7:3–10
Retaining Named Entities for Headline
Generation
Bhavesh Singh, Amit Marathe, Ali Abbas Rizvi, and Abhijit R. Joshi
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 221
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_17
1 Introduction
Due to the exponential rise in the amount of data available on the Web, it can be very difficult for an individual to collect relevant information. Reading through this enormous volume of data not only takes a lot of effort but also consumes a lot of valuable time. A challenging task is to obtain as much of the relevant information as possible in the least amount of time. A need therefore arises for tools that can quickly brief or summarize the important points of a large document, so that human effort and time are saved.
Majorly, there are two approaches followed for text summarization, namely
extractive and abstractive summarization. The former uses the content verbatim and
reforms it into a shorter version, while the latter uses an internal language repre-
sentation to generate more human-like summaries, paraphrasing the content of the
original text.
Natural language processing (NLP) is a huge domain, and one of its branches extends to text processing. One of the main applications under text processing is text summarization. A large amount of research in this field uses a sequence-to-sequence model, a deep neural network technique, which gives high output accuracy on well-trained data.
As technology and science have advanced to such a great extent, and with the sudden boom in the fields of AI and ML, automation of work has been undertaken in several domains; lately, this has been adopted by a few news companies for news summarization. However, in such applications of text summarization, a major problem that arises is retaining named entities. News headlines, report summaries and similar use cases have an important requirement of not losing important facts and figures during summarization.
Inshorts [1], the company behind an app that offers the latest news stories, has recently developed a new algorithm, Rapid60, that can automatically summarize a full-length news article into a 60-word news brief, also creating a headline and a card image.
This paper explores three models of text summarization. The first is a vanilla encoder–decoder model with an LSTM layer as the recurrent unit for long-term dependencies [2]. This model is useful in figuring out the impact of every word of the input article on the headline, also called attention [3]. In this model, the encoder can be seen as a mechanism that captures the text of an article, which is then decoded to generate the headline.
The second model, the pointer generator, has a similar architecture with an added mechanism that points at words directly from the context. As the named entities are crucial to the credibility of the news, this system points the words that affect the semantics significantly from the article directly to the headline, to better preserve the meaning of the article [4].
The transformer, a completely different model, uses the concept of self-attention
and parallelization rather than recurrence for finding dependencies and boosting
speed. This model has achieved good results in many NLP tasks, and its n-headed
self-attention is a desirable feature for the summarization task at hand [5].
In this paper, the focus is on evaluating the accuracy of the headlines generated using the above-mentioned text summarization models. In addition, a new method is proposed for retaining named entities, which is useful in generating accurate headlines [6].
The rest of the paper is organized as follows. Section 2 throws light on the related work carried out on text summarization. Section 3 explores the proposed approach along with the design and implementation details of the system. Section 4 presents an analysis of the results obtained from the system, evaluated using ROUGE metrics. The paper ends with a conclusion and directions for future work.
2 Literature Survey
Apart from being dependent on an optimal function, text summarization also relies on a sentence similarity measure to a certain extent, which can significantly improve the efficiency of abstractive summarization techniques. Masum et al. modified existing models and algorithms to generate headlines [7]. In addition, some further processing, like forming classifications for named entities, has been carried out by them to improve system accuracy and minimize possible problems [7].
Hanunggul and Suyanto presented a comparison between two types of attention: global and local. They observed that a larger number of words pertaining to the original summary were produced by the global attention-based model, while the local attention-based model produced more sets of words from the original summary. The reason behind such an outcome is that subsets of the input words are considered instead of the entire input in the local attention implementation [8].
In 2014, Sutskever et al. made the first attempt to summarize text using recurrent networks based on an encoder–decoder architecture [2]. Later, the model designed by Luong et al. [3], applied to the famous CNN/Daily Mail dataset, was a stepping stone for abstractive summarization. The next model to give promising results was the pointer generator [4], a hybrid model that not only points words directly into the summary but also generates new ones from the vocabulary.
Masum et al. developed an efficient way of summarizing text using sequence-to-sequence RNNs [9]. They proposed a method for successfully reducing the training loss that occurs while training the model. The steps involved in their methodology include data preprocessing, counting the vocabulary size, adding word embeddings and passing them to the encoder–decoder layer with LSTM. One limitation of their work is that it does not provide good results for large text inputs.
A paradigm shift in natural language processing occurred with the introduction
of the transformer [5]. It uses self-attention as a way to find dependencies instead of
recurrence. This design gave state-of-the-art results on a plethora of NLP tasks and
was efficient due to the parallelization in the architecture.
3 Methodology
In this section, the approach followed in the development of the system is presented in detail, providing a clearer view of the proposed system through diagrammatic representations and explanations.
3.1 Dataset Preparation

Two datasets were identified after scrutiny, namely the CNN daily mail news dataset and
the Amazon food review dataset. The main challenge in this model was summarizing
the dataset into a short, crisp headline, which also retained most of the important
information. It was found that the Amazon food review dataset was not fit for the
model because the labels contained only 2–3 words which did not meet the criteria
of a headline. Some marked inconsistencies were observed in the CNN daily mail
dataset where the data present was in the form of summaries of the article instead of
headlines, and thus, both datasets were discarded. Eventually, a dataset was created
using Web scraping where multiple articles and their headlines were acquired from
Inshorts [1]. The acquired news is not restricted to any single domain, rather it
consisted of a profusion of domains, viz. sports, business, politics, fashion, health,
etc. The final dataset holds 64,094 news articles along with their headlines, which
were preprocessed and split into training, validation and test sets, respectively.
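As a small illustration of this split (the exact ratios are not stated in the paper; an 80/10/10 split is assumed here), a scikit-learn sketch could look as follows.

from sklearn.model_selection import train_test_split

def split_dataset(articles, headlines, seed=42):
    """Split parallel lists of articles and headlines into train/val/test.
    The 80/10/10 ratio is an assumption for illustration only."""
    x_train, x_rest, y_train, y_rest = train_test_split(
        articles, headlines, test_size=0.2, random_state=seed)
    x_val, x_test, y_val, y_test = train_test_split(
        x_rest, y_rest, test_size=0.5, random_state=seed)
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)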
Cleaning the Dataset: After acquiring the dataset, it is cleaned to get the best possible
accuracy on the model. The Web scraped dataset contained various ambiguous char-
acters, and thus for cleaning, these were processed with the use of regular expres-
sions. The dataset had frequent occurrences of "…" and new lines, which were replaced by a simple space. It is common practice to drop punctuation before training such models. Instead, for training this model, punctuation marks are treated as individual words, as they play a vital role in the semantics and impact of the headline. Thus, the pretrained word embeddings file contains the embeddings of the punctuation marks as well, which helps improve the quality of the language of the generated headline. Punctuation marks and symbols like (".", "!", "?", ":", ";", "*", ",", "|", "/") are kept, and every word/character is left space separated.
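A minimal sketch of this cleaning step is given below; the exact regular expressions used by the authors are not published, so the patterns here are assumptions.

import re

KEEP = r'\.\!\?\:\;\*\,\|\/'   # punctuation retained as individual "words"

def clean_text(text: str) -> str:
    text = re.sub(r'\.{3}|…', ' ', text)           # drop the frequent "..." runs
    text = re.sub(r'[\r\n]+', ' ', text)           # new lines -> single space
    text = re.sub(rf'([{KEEP}])', r' \1 ', text)   # space-separate kept punctuation
    text = re.sub(rf'[^\w\s{KEEP}]', ' ', text)    # remove other ambiguous characters
    return re.sub(r'\s+', ' ', text).strip()

print(clean_text("Prices rose...\nby 5%!"))        # -> "Prices rose by 5 !"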
Word Embeddings: The word embeddings used for the model are acquired from the Stanford NLP Web site. The famous Global Vectors for Word Representation, or GloVe [10], embeddings, each of dimensionality 300, are used. These vectors are
trained using word co-occurrence statistics on the Common Crawl corpus. The word
embeddings of more than 94% of the words of the dataset’s vocabulary are present in
the glove.840B.300d.zip file (This number is achieved only after the steps to handle
the named entities are taken, as mentioned in the next paragraph). Random vectors of
size 300 are assigned to the words whose embeddings are not present. The embedding
layer of the model is given these pretrained vectors and is set to further train them to
figure out the latent meanings of these words with randomly assigned embeddings
and further refine the ones that already existed.
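A sketch of how the pretrained vectors might be loaded and the missing ones randomly initialised is shown below (the file name follows the text; the loading code itself is an assumption).

import numpy as np

def build_embedding_matrix(vocab, glove_path="glove.840B.300d.txt", dim=300):
    """Look up a GloVe vector for every word in `vocab`; words that are not
    found get a random 300-d vector, which the embedding layer refines later."""
    glove = {}
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) == dim + 1:              # skip malformed lines
                glove[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    matrix = np.zeros((len(vocab), dim), dtype=np.float32)
    for idx, word in enumerate(vocab):
        matrix[idx] = glove.get(word, np.random.uniform(-0.1, 0.1, dim))
    return matrix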
Identifying the named entity: An important issue that arises when using abstractive
text summarization, especially dealing with news datasets, is that the named entities
are abundant and not well retained during summarization which can compromise the
credibility of the news. This happens because named entities occur in large numbers
and comprise the majority of the vocabulary size. With each one being assigned a
random word embedding, there are simply not enough occurrences of that named entity
for the model to learn the embedding. To tackle this problem, all the named entities
are categorized into 19 disparate classes as shown in Table 1, and each one is assigned
a token which would replace the named entity in the sentence. Using the open-source
spaCy library, for advanced natural language processing [11], various named entities
Table 1 Description of named entities [11]

Named entity     Description
PERSON           People, including fictional
NORP             Nationalities or religious or political groups
FAC              Buildings, airports, highways, bridges, etc.
ORG              Companies, agencies, institutions, etc.
GPE              Countries, cities, states
LOC              Non-GPE locations, mountain ranges, water bodies
PRODUCT          Objects, vehicles, foods, etc.
EVENT            Named hurricanes, wars, sports events, etc.
WORK OF ART      Titles of books, songs, etc.
LAW              Named documents made into laws
LANGUAGE         Any named languages
DATE             Absolute or relative dates or periods
TIME             Times smaller than a day
PERCENT          Percentage, including "%"
MONEY            Monetary values, including unit
QUANTITY         Measurements, as of weight or distance
ORDINAL          "First", "second", etc.
CARDINAL         Numerals that do not fall under another type
INTERNET         Web sites and hashtags
in the dataset are identified and replaced by the category tokens [12]. The ones which are not identified are found manually and processed using regular expressions. With the entire set of named entities reduced to 19 categories, the vocabulary size and the number of unknown embeddings reduced radically. This ensured that the tokens assigned to the categories would occur frequently and that their embeddings captured the latent meanings of the tokens. A dictionary is assigned to every preprocessed article, with the keys being the detected categories of the named entities and the corresponding values being lists of the named entities of that category.
After all these preprocessing steps, a clean dataset is ready, which would help in
obtaining a more accurate model to generate the best possible patterns and results for
a given input. The only thing left to do now is to replace the tokens in the generated
headline with the named entities of the article of the same category [13].
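A minimal sketch of this masking step with spaCy [11] is given below. The pipeline name is an assumption, and spaCy itself has no INTERNET entity type, so that category would have to be handled with the regular expressions mentioned above.

import spacy
from collections import defaultdict

nlp = spacy.load("en_core_web_sm")   # any English spaCy pipeline with NER

def mask_named_entities(text: str):
    """Replace each detected named entity with its category token and keep a
    dictionary of the originals so they can be restored in the headline."""
    doc = nlp(text)
    entities = defaultdict(list)
    pieces, last = [], 0
    for ent in doc.ents:
        entities[ent.label_].append(ent.text)
        pieces.append(text[last:ent.start_char])
        pieces.append(f"<{ent.label_}>")
        last = ent.end_char
    pieces.append(text[last:])
    return "".join(pieces), dict(entities)

masked, ents = mask_named_entities("The Health Ministry reported 941 new cases in India.")
# masked would look roughly like: "The <ORG> reported <CARDINAL> new cases in <GPE>."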
Table 2 shows an article and its headline, before and after replacing the named
entities. The named entities of the article are identified by spaCy. The second row
shows these named entities categorized depending on their type. The category tokens are simply the category name enclosed in angle brackets, "<" and ">". This token then replaces the named entity in the article and the headline. The named entities of the article that also occur in the headline are highlighted in italics.
Tokenization: The final preprocessing step is to create tensor objects out of the pairs of articles and headlines. The "<SOS>" (start-of-sentence) and "<EOS>" (end-of-sentence) tokens are added at the beginning and the end of each article and headline, respectively. All articles and headlines are truncated to a length of 80 and 25 tokens, respectively (punctuation included). An additional "<PAD>" (padding) token is appended to the articles and headlines until they meet the desired lengths of 80 and 25, respectively. When an unknown word appears during testing, the "<OOV>" (out-of-vocabulary) token is used, although the preprocessing with spaCy mitigates the use of the OOV token. Every word and every token is assigned a number in the vocabulary, which is then used to create the tensors that can be taken by the embedding layer.
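A sketch of the encoding step under these conventions might look as follows (the vocabulary dictionary is assumed to be built from the cleaned training set).

SOS, EOS, PAD, OOV = "<SOS>", "<EOS>", "<PAD>", "<OOV>"

def encode(text: str, word2idx: dict, max_len: int):
    """Add <SOS>/<EOS>, truncate to max_len tokens, pad with <PAD> and map
    unknown words to the <OOV> index."""
    tokens = [SOS] + text.split()[: max_len - 2] + [EOS]
    tokens += [PAD] * (max_len - len(tokens))
    return [word2idx.get(t, word2idx[OOV]) for t in tokens]

# Articles use max_len = 80 and headlines max_len = 25, as described above.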
3.2 Architecture
The mechanisms of the three approaches, (1) the Seq2Seq with attention model, (2) the transformer and (3) the pointer generator model, are described in this section.
Sequence to sequence with Attention: It is a classic example of an encoder–decoder
model [2] with the encoder responsible for creating context vector representations
from the news articles that are provided and the decoder for generating a headline.
The headline is generated word by word, as the model calculates the attention given
to the encoder representations at every instant.
The Encoder: It consists of a trainable embedding layer, for which the GloVe [10] 300-dimensional word embeddings are used. The recurrent layer is chosen to be an LSTM.
Table 2 An article and its headline before and after the preprocessing steps, with the corresponding named entity dictionary

Original text
Article: The total number of coronavirus cases in India has risen to 12,759 after over 5000 cases were reported in last five days. Meanwhile, the coronavirus death toll in India has risen to 420, while 1515 Covid-19 patients have been cured, discharged or migrated. Earlier today, the Health Ministry revealed that 941 new cases and 37 deaths were reported on Wednesday
Headline: India reports more than 5000 coronavirus cases in 5 days, total cases rise to 12,759

Recognition and classification of named entities
Article: GPE—[India]; CARDINAL—[12,759, 5000, 420, 1515, 37]; DATE—[5 days, Wednesday]; TIME—[Earlier today]; ORG—[the Health Ministry]
Headline: GPE—[India]; CARDINAL—[5000, 12,759]; DATE—[5 days]

After replacing the tokens
Article: The total number of coronavirus cases in <GPE> has risen to <CARDINAL> after over <CARDINAL> cases were reported in last <DATE>. Meanwhile, the coronavirus death toll in <GPE> has risen to <CARDINAL>, while <CARDINAL> Covid-19 patients have been cured, discharged or migrated. <TIME>, <ORG> revealed that <CARDINAL> new cases and <CARDINAL> deaths were reported on <DATE>
Headline: <GPE> reports <CARDINAL> coronavirus cases in <DATE>, total cases rise to <CARDINAL>
Equation (1) [3] gives the actual representation of the significance. The matrices $U_{att}$ and $W_{att}$ are used to bring the vectors $s_{t-1}$ and $h_j$, respectively, to the same dimension. $V_{att}^{T}$ is a matrix that leaves us with a scalar $e_{jt}$. Softmax is applied to the values of $e_{jt}$, which gives us the alphas. The output vectors at every instant are multiplied by their corresponding scaling factor and are added to form one vector.

$e_{jt} = V_{att}^{T}\, \tanh\!\left(U_{att}\, s_{t-1} + W_{att}\, h_j\right)$ (1)

The final dense layer is of the size of the vocabulary, and cross-entropy loss is the loss function. During training, there is a 50% chance of the predicted word being sent in as the next input to implement teacher forcing. This results in the model not being completely reliant on the proper input while training and helps it function better in testing scenarios.
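A compact PyTorch rendering of the additive attention score of Eq. (1) could be sketched as follows; layer sizes and names are assumptions.

import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """e_jt = V_att^T tanh(U_att s_{t-1} + W_att h_j), followed by a softmax over j."""
    def __init__(self, enc_dim, dec_dim, att_dim):
        super().__init__()
        self.U = nn.Linear(dec_dim, att_dim, bias=False)
        self.W = nn.Linear(enc_dim, att_dim, bias=False)
        self.V = nn.Linear(att_dim, 1, bias=False)

    def forward(self, s_prev, enc_outputs):
        # s_prev: (batch, dec_dim); enc_outputs: (batch, src_len, enc_dim)
        scores = self.V(torch.tanh(self.U(s_prev).unsqueeze(1) + self.W(enc_outputs)))
        alphas = torch.softmax(scores.squeeze(-1), dim=1)        # attention weights
        context = torch.bmm(alphas.unsqueeze(1), enc_outputs)    # weighted sum of h_j
        return context.squeeze(1), alphas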
The Transformer: The model (Fig. 3) chosen is an exact rendition of the one from the paper "Attention is all you need" [5]. The article is positionally encoded after every word is assigned its 300-dimensional embedding. This is important because the transformer does not rely on recurrence; hence, a notion of order is required for the model to understand sequence data.
Each side of the transformer consists of six encoders and decoders, having multi-headed attention with eight heads for better focus over the article. For calculating
self-attention, a set of three matrices are multiplied with every input word producing
three vectors, i.e., query, key and value. The query and key vectors of every input
word are used to calculate constants that scale the value vector. This determines the
impact of every other word on the current word. This represents a single head of
attention, and n such sets of matrices are used to find n-headed attention.
The transformed vectors are added to the original ones and are followed by normalization. This occurs in each of the six encoders, which leads to a context-rich vector that is fed to every decoder in the decoder stack. The output of the final decoder in the stack is fed to a dense layer of the size of the vocabulary to predict the next word using the categorical cross-entropy loss.
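A minimal sketch of this configuration with PyTorch's built-in transformer layers is shown below. The six layers and eight heads follow the text; because PyTorch requires the model width to be divisible by the head count, the 300-d GloVe embeddings are assumed here to be projected to 512 dimensions first, and the feed-forward width and vocabulary size are likewise assumptions.

import torch.nn as nn

d_model, n_heads, n_layers = 512, 8, 6        # 8-headed attention, 6 encoders/decoders

input_proj = nn.Linear(300, d_model)          # assumed projection of 300-d embeddings
encoder_layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=1024, batch_first=True)
decoder_layer = nn.TransformerDecoderLayer(d_model, n_heads,
                                           dim_feedforward=1024, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=n_layers)
vocab_proj = nn.Linear(d_model, 30000)        # dense layer of (assumed) vocabulary size

# memory = encoder(input_proj(src_emb))                      # context-rich vectors
# logits = vocab_proj(decoder(input_proj(tgt_emb), memory))  # next-word prediction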
Pointer Generator: The paper [4] presents a new architecture for abstractive text
summarization that augments the standard sequence-to-sequence attentional model.
In this method, a hybrid pointer generator network is used that can not only
copy words from the source text via pointing, which aids accurate reproduction
of information, but also produces novel words through the generator. Further, the
generation probability $p_{gen} \in [0, 1]$ for time-step $t$ is calculated from the context vector $h_t^{*}$, the decoder state $s_t$ and the decoder input $x_t$ using Eq. (2) [4]:

$p_{gen} = \sigma\!\left(w_{h^{*}}^{T} h_t^{*} + w_s^{T} s_t + w_x^{T} x_t + b_{ptr}\right)$ (2)

where the vectors $w_{h^{*}}$, $w_s$, $w_x$ and the scalar $b_{ptr}$ are learnable parameters and $\sigma$ is the sigmoid function.
Now, this value of $p_{gen}$ is used to determine whether the words should be picked from the article directly or from the original vocabulary distribution. One of the main advantages of the pointer generator model is its ability to produce out-of-vocabulary words; by contrast, other text summarization models are restricted to their pre-set vocabulary.
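A sketch of how this value might be used to mix the two distributions is given below (tensor names and shapes are assumptions; see [4] for the complete model).

import torch

def final_distribution(p_gen, vocab_dist, attn_dist, src_ids, vocab_size):
    """P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum_j a_j [src word j == w].
    p_gen: (batch, 1); vocab_dist: (batch, vocab_size);
    attn_dist: (batch, src_len); src_ids: (batch, src_len) word indices."""
    generated = p_gen * vocab_dist
    copied = torch.zeros(vocab_dist.size(0), vocab_size, device=vocab_dist.device)
    copied.scatter_add_(1, src_ids, (1.0 - p_gen) * attn_dist)
    return generated + copied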
4 Result
Table 4 Example 1

Original article: Delhi-based diabetes management app BeatO has raised over 11 crore in a pre-Series A funding round led by Orios Venture Partners. The funding round also saw participation from existing investors Blume Ventures and Leo Capital. Founded in 2015 by Gautam Chopra, Yash Sehgal and Abhishek Kumar, BeatO offers diabetes management programmes to users via a smartphone app
Original headline: Diabetes management app BeatO raises 11 crore led by Orios
Transformer headline: Diabetes management app BeatO raises 11 crore led by Orios
Pointer generator headline: Diabetes management app BeatO raises 11 crore in series by
Seq2Seq with attention headline: Diabetes management app
Table 5 Example 2

Original article: The TMC is leading in Kharagpur Sadar and Karimpur seats in the West Bengal Assembly by poll. Meanwhile, BJP is leading in the Kaliaganj seat. The Kaliaganj by poll was necessitated following the death of sitting Congress MLA Pramatha Nath Roy, while Kharagpur Sadar and Karimpur seats had fallen vacant after the sitting MLAs were elected as MPs in the LS polls
Original headline: TMC leading in 2 of 3 seats in West Bengal by poll
Transformer headline: TMC leading in 2 seats in West Bengal by poll
Pointer generator headline: TMC leading in 2 TMC seats in West Bengal assembly
Seq2Seq with attention headline: TMC TMC company seats in polls
Table 6 Comparison of results on the basis of ROUGE metrics

Architecture             ROUGE-1   ROUGE-2   ROUGE-L
Transformer              0.335     0.162     0.521
Pointer generator        0.369     0.157     0.493
Seq2Seq with attention   0.216     0.091     0.225
The ROUGE scores measure the overlap of n-grams (unigram, bigram, etc.) between the generated headlines and the original headlines over the entire test set.
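For reference, a minimal (unstemmed) ROUGE-1 computation between a generated and an original headline could be sketched as follows; published ROUGE toolkits add stemming and other refinements.

from collections import Counter

def rouge_1(generated: str, reference: str) -> dict:
    """Unigram overlap: recall, precision and F1 between two headlines."""
    gen, ref = Counter(generated.lower().split()), Counter(reference.lower().split())
    overlap = sum((gen & ref).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(gen.values()), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    return {"recall": recall, "precision": precision, "f1": f1}

print(rouge_1("TMC leading in 2 seats in West Bengal by poll",
              "TMC leading in 2 of 3 seats in West Bengal by poll"))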
From Table 6, it can be inferred that the pointer generator performs better than the transformer and Seq2Seq models on the ROUGE-1 metric, as the pointer generator has a special mechanism for pointing at single words directly from the article. On the other metrics, however, the transformer model outperforms the rest in preserving the semantics and the named entities over the entire headline. The basic Seq2Seq model not only lacks a mechanism to point important words directly to the output, but also has no extensive self-attention architecture like the transformer. Hence, its ROUGE scores are low for both short- and long-term dependencies.
5 Conclusion

This paper has presented an approach for adapting existing text summarization models to generate crisp headlines by taking news articles as input. It is observed, however, that the Seq2Seq with attention and pointer generator models have a problem of repetitions occurring during headline generation. The pointer model has been found to perform well under most circumstances, and the transformer model has been seen to give the best results of all three. A new technique for retaining important named entities has been presented here, which produces more natural and meaningful headlines. The proposed system would be a stepping stone toward automating the process of foolproof headline generation, to be used in the latest automated AI-based news platforms like Inshorts. The same system can also be modified and trained for similar use cases, such as legal document analysis, stock market prediction based on news, or summarization of customer feedback on products, where retaining named entities is essential.
References
1. Inshorts.com (2020) Breaking news headlines: Read All news updates in English—Inshorts.
Available at: https://inshorts.com/en/read. Accessed 4 August 2020
2. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks
3. Luong M-T, Pham H, Manning CD (2015) Effective approaches to attention-based neural
machine translation
4. See A, Liu PJ, Manning CD (2017) Get to the point: summarization with pointer-generator
networks
5. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I
(2017) Attention is all you need
6. ETtech.com (2020) Inshorts debuts ai-based news summarization on its app—
Ettech. Available at https://tech.economictimes.indiatimes.com/news/startups/inshorts-debuts-
aibased-news-summarization-on-its-app/64531038. Accessed 4 Aug 2020
7. Masum KM, Abujar S, Tusher RTH, Faisal F, Hossain SA (2019) Sentence similarity measure-
ment for Bengali abstractive text summarization. In: 2019 10th international conference on
computing, communication and networking technologies (ICCCNT), Kanpur, India, 2019, pp
1–5. https://doi.org/10.1109/ICCCNT45670.2019.8944571
8. Hanunggul PM, Suyanto S (2019) The impact of local attention in LSTM for abstractive text
summarization. In: 2019 international seminar on research of information technology and
intelligent systems (ISRITI), Yogyakarta, Indonesia, 2019, pp 54–57. https://doi.org/10.1109/
ISRITI48646.2019.9034616
9. Mohammad Masum K, Abujar S, Islam Talukder MA, Azad Rabby AKMS, Hossain SA (2019)
Abstractive method of text summarization with sequence to sequence RNNs. In: 2019 10th inter-
national conference on computing, communication and networking technologies (ICCCNT),
Kanpur, India, 2019, pp 1–5. https://doi.org/10.1109/ICCCNT45670.2019.8944620
10. Pennington J, Socher R, Manning C (2014) Glove: global vectors for word representation.
EMNLP 14:1532–1543. https://doi.org/10.3115/v1/D141162
11. Spacy.io (2020) Industrial-strength natural language processing. Available at: https://spacy.io/.
Accessed 4 August 2020
12. Partalidou E, Spyromitros-Xioufis E, Doropoulos S, Vologiannidis S, Diamantaras KI (2019)
Design and implementation of an open source Greek POS Tagger and Entity Recognizer
using spaCy. In: 2019 IEEE/WIC/ACM international conference on web intelligence (WI),
Thessaloniki, Greece, 2019, pp 337–341
13. Li J, Sun A, Han J, Li C (2018) A survey on deep learning for named entity recognition. IEEE
Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2020.2981314
14. Janjanam P, Reddy CP (2019) Text summarization: an essential study. In: 2019 international
conference on computational intelligence in data science (ICCIDS), Chennai, India, 2019, pp
1–6. https://doi.org/10.1109/ICCIDS.2019.8862030
15. Partalidou E, Spyromitros-Xioufis E, Doropoulos S, Vologiannidis S, Diamantaras KI (2019)
Design and implementation of an open source Greek POS Tagger and entity recognizer
using spaCy. In: 2019 IEEE/WIC/ACM international conference on web intelligence (WI),
Thessaloniki, Greece, pp 337–341
16. Modi S, Oza R (2018) Review on abstractive text summarization techniques (ATST) for single
and multi-documents. In: 2018 international conference on computing, power and communi-
cation technologies (GUCON), Greater Noida, Uttar Pradesh, India, pp 1173–1176. https://
doi.org/10.1109/GUCON.2018.8674894
Information Hiding Using Quantum
Image Processing State of Art Review
Abstract The bottleneck of the digital image processing field narrows down to memory consumption and processing speed problems, which can be resolved by performing image processing in the quantum state. In this Internet era, all information is exchanged or transferred through the Web of things, which necessitates maintaining the security of the transmitted data. A variety of techniques are available to perform secret communication. A quantum steganography scheme is introduced to conceal a quantum secret message or image within a quantum cover image. For embedding secret data into the quantum cover, many algorithms, such as LSB qubits and QUALPI, are available. This paper discusses secret data transmission using quantum image steganography.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 235
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_18
The representation of quantum bits is shown in Fig. 1. The concept of quantum computation was first proposed by Richard Feynman in the year 1982. Shor's quantum integer factoring methodology, proposed in the year 1994, and the search algorithm proposed by L.K. Grover in the year 1996 (named after him) developed a new and possible way of computation. By the dawn of the late 1990s, the development of quantum computing had become a hot topic in information science and technology. Quantum information
hiding, a part of quantum computing, is divided into two parts, namely quantum
watermarking and quantum image steganography. Due to advancement in infor-
mation technology with tons of data being transferred through the Internet, it is
essential to have secure communication between the end users. This led to the rapid
development of quantum multimedia technology and quantum image steganography.
Quantum image steganography deals with information hiding inside an image in such
a way that the information is completely masked and the eavesdropper will never
know about its existence. Many quantum image steganography models have since been developed. In [1], the authors proposed the qubit lattice model in the year 2003; further, in the year 2010, they improved this model and proposed an entangled representation to store statistical information [2]. In [3], the authors proposed a real ket model for quantum images in the year 2005. A new flexible representation for quantum images (FRQI) model was proposed in the year 2011 [4]; for the first time, a method that took both position and intensity into consideration was proposed. In the year 2014, Yi Zhang et al. proposed a methodology called the novel enhanced quantum representation (NEQR) [5]. This method is similar to FRQI except that the FRQI method considers only one qubit sequence, whereas NEQR considers a superposition of all the qubits. To enhance the performance of quantum steganography, a novel method called the quantum log-polar image representation (QUALPI) was proposed by Yi Zhang et al. in the year 2013 [6]. The advantages of quantum steganography over conventional methods are briefly described in Table 1.
Fig. 1 Representation of quantum bits
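As a small illustration of the qubit representation in Fig. 1, the state |ψ⟩ = α|0⟩ + β|1⟩ can be simulated classically with numpy; this sketch is illustrative only and is not part of the reviewed schemes.

import numpy as np

def qubit(alpha: complex, beta: complex) -> np.ndarray:
    """Return the normalised state vector alpha|0> + beta|1>."""
    state = np.array([alpha, beta], dtype=complex)
    return state / np.linalg.norm(state)

psi = qubit(1, 1)                 # equal superposition (|0> + |1>) / sqrt(2)
probs = np.abs(psi) ** 2          # measurement probabilities [0.5, 0.5]
print(psi, probs)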
2 Literature Review
The paper [7] proposed a new matrix-coded quantum steganography algorithm that makes use of a quantum color image. Here, a covert and quantum-secure communication is established by taking advantage of the good invisibility and higher embedding efficiency of matrix coding. Two embedding methods were applied in this paper. The first is single-pixel-embedded (SPE) coding, where three least significant qubits (LSQbs) of a single quantum carrier image pixel were embedded with two qubits of the secret message. The second is multiple-pixels-embedded (MPsE) coding, where three least significant qubits (LSQbs) of different pixels of the carrier quantum image were embedded with two qubits of the secret message. The PSNR values of the embedding methods used here were found to be higher than those of other methods referenced in the paper. Combining PSNR and histogram analysis, it is shown that this protocol achieves very good imperceptibility. The protocol is also shown to have good security against noise in the quantum channel
and various attacks. The embedding efficiency and capacity of single-pixel-embedded coding are shown to be 2.67 and 2^{2n+1}, respectively, and those of MPsE coding are shown to be 2.67 and 2^{2n+1}/3. In [8], the authors proposed three quantum color image steganography algorithms which involve the least significant bit (LSB) technique.
Algorithm one utilized a generic LSB technique: information bits of the secret data were substituted in place of the LSB values of the pixel intensities, using a single image channel to hide the secret information. Algorithm two made use of an LSB XORing technique and also utilized a single image channel to cover the secret data. Algorithm three made use of two channels of the cover image to cover the color image hiding the secret quantum data. As the number of channels increased, the capacity of the third algorithm also increased. An image key was used in all three algorithms in the processes of embedding and extracting the secret data. The evaluation parameters considered here were invisibility, robustness and capacity. The PSNR values observed for the first algorithm were around 56 dB, for the second around 59 dB, and for the third around 52 dB. The quality of the stego image obtained using the second algorithm was better than that of the other two. As the third algorithm made use of two channels of the cover image to cover the secret data, its capacity was enhanced: the capacity of the third algorithm was 2 bits/pixel, whereas that of the other two algorithms was 1 bit/pixel. In the work proposed in [9], initially, a quantum carrier image was prepared using the NEQR model by
employing two entangled qubit sequences to accommodate the grayscale intensity as well as the position of each pixel. In EMD embedding, a group of N pixels is formed, and every secret digit of the hidden message, belonging to a (2N + 1)-ary notation system, is embedded into that group. During the embedding of a secret digit, at most a single pixel of the cover image is modified, or the cover image pixels remain as such; if a cover pixel value is to be modified, it is either incremented or decremented by a unit value. This implies that for N cover pixels, (2N + 1) different transformations need to be performed to acquire the (2N + 1) possible values of a secret digit. The advantage of EMD embedding is that it provides good image quality, with a PSNR exceeding 52 dB. The algorithm achieves high embedding efficiency, security and imperceptibility of the secret information; however, as N becomes larger, the embedding rate reduces. In the paper [10], initially, a (2^n × 2^n)-sized cover image and a (2^{n−1} × 2^{n−1})-sized watermark image were modeled
by a “Novel Quantum Representation of Colour Digital Images Model (NCQI)”.
The watermark was scrambled into an unordered form through an image preprocessing technique that simultaneously changes the positions of the pixels and the color pixel information based on the "Arnold transformation". The (2^{n−1} × 2^{n−1})-sized scrambled watermark image, with a gray intensity range of 24 qubits, was expanded to a (2^n × 2^n)-sized image with a gray intensity range of 6 qubits using the "nearest-neighbour interpolation" method. This watermark image was embedded onto the carrier by an LSB steganography scheme, by substituting the least significant bits of the pixels of the three channels of the cover image, i.e., red, green and blue. In the meantime, a (2^n × 2^n)-sized key image with 3 qubits of information was also created to retrieve the actual watermark image. The extraction process is just the inverse of the embedding process. The PSNR value for the algorithm exceeds
54 dB, which indicates that the imperceptibility of the cover image is not affected
by the embedding of a watermark. The proposed scheme, thus, provides good visual
quality, robustness, steganography capacity and lower computational complexity.
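For intuition, the classical counterpart of the LSB substitution used in the schemes above can be sketched as follows; the quantum protocols operate on qubit sequences and quantum circuits, so this is only the classical analogue.

import numpy as np

def lsb_embed(cover: np.ndarray, secret_bits: np.ndarray) -> np.ndarray:
    """Classical analogue of LSB steganography: overwrite the least significant
    bit of each 8-bit cover pixel with one secret bit."""
    flat = cover.flatten().astype(np.uint8)
    n = min(flat.size, secret_bits.size)
    bits = secret_bits[:n].astype(np.uint8) & 1
    flat[:n] = (flat[:n] & 0xFE) | bits
    return flat.reshape(cover.shape)

def lsb_extract(stego: np.ndarray, n_bits: int) -> np.ndarray:
    """Recover the first n_bits secret bits from the stego image."""
    return stego.flatten()[:n_bits] & 1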
This work [11] proposes three strategies which involve the design of new geometric
transformation performed on quantum images. The proposed design focused on
affected regions in an image, separability and smooth transformations by representing
an image in a quantum computer. The first strategy considered transformations that
considered parts of a quantum image. More controls were added to show information
about parts present in a quantum image. The second method took the separability
present in classical operation to transformation in the quantum state. By making use
of the flexible representation for quantum image (FRQI) model, it was feasible to
examine and define separable and geometric transformations. The third method aimed at transformations that take place smoothly; multi-level controls, as used by the cyclic shift transformations, were the primary technique for obtaining smooth transformations. The methods proposed in the paper provided top-level tools for expanding
the number of transformations required for building practical applications dealing
with image processing in a quantum computer. It is also shown that the design of a
quantum circuit with a lesser complexity for inconsistent geometric transformation
is feasible. In [6], the authors proposed FRQI, a method in which images are mapped onto a quantum form, in a normalized state which captures information about colors and positions. The quantum image compression algorithm starts with the color groups: from each color group, Boolean min-terms are factored, and by combining all the min-terms, a min-term expression is created. In the next step, the min-terms are minimized, and at the final step, a minimized Boolean expression is obtained as the output. The paper evaluates three variants depending upon the unitary transformation on FRQI, dealing only with colors, colors with their positions, and a union of both color and pixel points. Considering the application of the QIC algorithm on a single-digit binary image, the compression ratio varies from 68.75 to 90.63%, and for a gray image, the value varies from 6.67 to 31.62%. The paper [12] focuses on estimating the similarity between
quantum images based on probabilistic measurements. The similarities between the
two images were determined by the possibility of amplitude distribution from the
quantum measurement process. The methodology utilized in this paper for repre-
senting the quantum state is FRQI. The obtained quantum image was then passed
on through a Hadamard gate to recombine both the states, and then, it is followed
by quantum measurement operation. The result of the measurement was dependent
on the differences in the two quantum images present in the strip. The probability of
getting a 0 or 1 was dependent on the pixel differences among the two quantum images
in the strip, and this was determined through quantum measurements. Comparing
a 256 × 256 grayscale original image with the same size watermarked image, the
similarity value was found to be 0.990. When the original image was compared with a same-sized darkened image, the similarity was found to be 0.850. Hence, two images are more similar when the similarity value is nearer to one. The protocol used in [4] enhances the existing FRQI model
by representing the grayscale information as a qubit sequence. This method starts by converting the intensity values of all the pixels into a ket vector. Then, a tensor product of each position and its intensity is taken to form a single qubit sequence, thereby successfully converting a traditional image into a quantum image. At the receiver, the inverse operation, called quantum measurement, is done to retrieve the classical image. The computational time of NEQR is found to be very low, and its compression ratio is also found to be better than that of FRQI.
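To make the NEQR encoding concrete, the superposition (1/2^n) Σ |f(y, x)⟩ ⊗ |y x⟩ for a tiny 2^n × 2^n grayscale image can be simulated classically; the sketch below is illustrative only, with q qubits for the gray value and 2n qubits for the position.

import numpy as np

def neqr_state(image: np.ndarray, q: int = 8) -> np.ndarray:
    """Classically simulate the NEQR state of a 2^n x 2^n grayscale image:
    amplitude 1/2^n on each basis state |gray_value>|row, col>."""
    size = image.shape[0]                     # size = 2^n
    n = int(np.log2(size))
    state = np.zeros(2 ** (q + 2 * n), dtype=complex)
    for y in range(size):
        for x in range(size):
            index = (int(image[y, x]) << (2 * n)) | (y << n) | x
            state[index] = 1.0 / size         # amplitude 1/2^n
    return state                              # unit norm by construction

img = np.array([[0, 255], [128, 64]], dtype=np.uint8)   # a 2 x 2 test image
psi = neqr_state(img)
print(np.isclose(np.linalg.norm(psi), 1.0))              # True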
The paper [13] deals with a methodology for hiding a grayscale image inside a cover image. An (n/2 × n/2)-sized secret grayscale image with a gray intensity of 8 bits was expanded into an (n × n)-sized image with a gray value of 2 bits. This secret gray image and an (n × n)-sized cover image were represented using the NEQR model, which stores the color information and position of every pixel in the image. The obtained secret image, in quantum form, was scrambled using the "Arnold cat map" before starting the process of embedding. Later, the scrambled quantum secret image was embedded onto the cover image in quantum form using two "least significant qubits (LSQb)". The extraction process requires the steganographic image alone to extract the embedded secret image. This scheme achieves a high capacity, i.e., 2 bits per pixel, which is significantly higher compared to other schemes in the field of quantum steganography. The security of this scheme is enhanced since the method involves scrambling the image before embedding. The PSNR achieved by this scheme amounts to around 43 dB, which is higher when compared with Moiré pattern-based quantum image steganography, but lower when compared with other LSB techniques. In another proposed work, initially, the image to be encrypted
was mapped onto a NEQR model which stores pixel values and pixel positions in an
entangled qubit sequence [5]. A chaotic map called logistic map was used to generate
chaotic random sequences. The process of encrypting the carrier image includes three stages, namely intra-bit permutation, inter-bit permutation and chaotic diffusion. The intra-bit and inter-bit permutations were operated on the
bit planes. The intra bit permutation was accomplished by sorting a chaotic random
sequence, which modified the position of the bits, while the pixel weight remained
the same. As the percentage of bit 0 and bit 1 was roughly the same in each and
every bit plane, all the bits were uniformly distributed due to the permutation oper-
ations. The inter bit permutation was operated between different bit planes, which,
simultaneously, modified the grayscale information as well as the information of the
pixel. This was achieved by choosing two-bit planes and performing Qubit XOR
operations on them. Finally, a chaotic diffusion procedure was put forth to retrieve
the encrypted image, which was facilitated using an XORing of the quantum image. The chaotic random sequence generated from the logistic map determined the controlled-NOT gates, which were significant in realizing the XOR operations. The parameters of the logistic map were found to be sensitive enough to make the keyspace sufficiently large; the larger the keyspace, the more difficult it is to perform a brute-force attack. This methodology not only altered the grayscale intensities and the positions of the pixels but also made the bit distribution progressively more uniform. According to the simulation output, the proposed technique was found to be more proficient than its classical equivalent. The security accomplished is confirmed by statistical analysis, key sensitivity analysis and keyspace investigation. When compared with classical image cipher techniques, the mathematical entanglement of the proposed approach was found to be lower. The PSNR value for a grayscale image of size 256 × 256 was found to be 8.3956 dB, as opposed to 8.7988 dB for an image cipher algorithm implemented using nonlinear chaotic maps and transformations. The paper [14] introduces a novel, keyless and secure steganography method for quantum images dealing
with Moiré pattern. Here, the proposed methodology consists of two steps. Initially,
they carried out the embedding operation where a secret image was embedded onto
a preliminary Moiré grating of the original cover image which resulted in Moiré
pattern. Here, the preliminary Moiré grating was modified in accordance with the
secret image to result in a final Moiré pattern. The workflow of the embedding oper-
ation consisted of three steps. First, a preliminary Moiré grating was under consid-
eration, and the user had the flexibility in choosing the same. Second, a deformation
operation was performed to generate a Moiré pattern by making use of the prelimi-
nary grating and the image which was needed to be hidden. Finally, denoising was
performed which transformed the obtained Moiré pattern to a steganographic image.
The second phase of the methodology dealt with the extraction of the secret image
by making use of the preliminary grating and an obtained Moiré pattern. Evaluation
parameters considered here were visual effects and robustness. PSNR was used to quantify the steganography scheme's accuracy: even though the PSNR value was observed to be around 30 dB, no noticeable change was found between the cover image and the stego image. To evaluate the robustness of the proposed scheme, salt-and-pepper noise with various densities was added to the stego image; the extracted secret image was easily identifiable and robust against the addition of salt-and-pepper noise. The stego image was also subjected to a cropping attack, and the secret image extracted from the cropped stego image contained a few non-adjacent parallel black lines. Even with these parallel black lines, the meaning and content of the hidden image could still be recognized conveniently.
The paper [15] proposes a new quantum image steganography method which intro-
duced a quantum image representation called QUALPI that makes use of log-polar
images in preparing the quantum image model. This was followed by quantum
image expansion where an atlas consisting of various quantum image copies are
superimposed. The expanded quantum image was subjected to the embedding of
the secret information. This was done by choosing one particular image copy out
of the atlas followed by embedding the secret information onto the chosen image
copy. At the receiver, Grover’s search algorithm, an algorithm that aimed at reducing
the time complexity of searching a record present in an unsorted database, was
utilized in the retrieval of secret information. This work included three perfor-
mance parameters, namely imperceptibility, capacity and security. The secret information is embedded onto one of the many image copies, and with a smaller angle of image expansion, a more complex atlas is obtained, showing better imperceptibility and thus greater security against eavesdroppers. The paper [16] introduced a novel
representation for a quantum image named quantum log-polar image (QUALPI)
which involved processing and storing of a sampled image in log-polar coordinates.
QUALPI involved the following preparation procedure. Initially, it dealt with the conversion of a classical image to an image sampled in log-polar coordinates. For an image of size 2^m × 2^n with 2^q grayscale values, a register consisting of (m + n + q) qubits in the quantum state was defined to store the image information as a qubit sequence, also referred to as a ket. Later, an empty ket was initialized by setting all the grayscale intensities to zero, followed by setting all the pixels to their appropriate intensities. This constituted the final image representation in the quantum state, named QUALPI. The time complexity involved in storing a 2^m × 2^n log-polar image having 2^q grayscale values was O(q(m + n) · 2^{m+n}). Common geometric
transformations such as rotational transformations and symmetric transformations
were performed conveniently with the help of QUALPI when compared with other
representations, for example, NEQR and FRQI.
In the paper [17], a new technique for constructing substitution boxes (S-boxes) based on the nonlinear properties of quantum walks was presented. Quantum walks are universal quantum computational models used for designing quantum algorithms. The performance of this method was evaluated using S-box evaluation criteria. Also, a novel method for image steganography was constructed using the S-boxes. The proposed method consists of a mechanism involving traditional data hiding and quantum walks. This technique is shown to be secure for the embedded data, and it is also seen that a secret message of any type can be used with this technique.
During the extraction process, only the preliminary values for the S-boxes genera-
tion and the steganographic image are found to be required. This method has a greater embedding capacity and good clarity with greater security. In the work [18], a new quantum steganography protocol was proposed using "pixel value differencing" (PVD), which satisfactorily adheres to the edge effects of the image and the characteristics of the human visual system. The whole process was divided into three parts, namely
quantization of two-pixel blocks based on the difference in their grayscale values,
data embedding and extraction. Here, the cover image was embedded with the oper-
ator’s information and secret image based on pixel value differencing. Based on
the pixel value difference level, information about the operator with different qubit
numbers was embedded. The difference in pixel values was not a concern while
embedding a secret image. Secret image and information about the operator were
embedded by swapping the pixel difference values belonging to the two-pixel blocks
of the cover image with similar ones where embedded data qubits are included. Secret
information traceability is realized by extracting information about the operator. The
extraction process is seen to be completely blind. There were two parameters taken
into account while checking for the invisibility of the secret image. During histogram
analysis, it is seen that the histograms of steganographic images are very similar to
the original ones. Considering “Peak Signal-to-Noise Ratio” (PSNR), it is seen that
the algorithm proposed obtains good clarity. It is also seen that the scheme allows for good embedding capacity and is found to be highly robust. The paper [19] discusses
a new protocol which is based on quantum secure direct communication (QSDC).
The protocol is used to build a concealed channel within the classical transmission
channel to transmit hidden information. The protocol discussed in this paper uses
QSDC as its basis. The technique adopts the entanglement swapping of Bell-basis states to embed concealed messages. This protocol contains six steps, which are
crucial for the preparation of large numbers, mode selection by a receiver, control
mode, information transmission, covert message hiding mode and concealed data
retrieving mode. The protocol uses IBF which is the extension and a more secured
method over BF coupled with QSDC. It was seen that the protocol can reliably
deal with the intercept-resend attack and auxiliary particle attack and also a man-in-
the-middle attack. This protocol also shows great imperceptibility. Compared to the
previous steganography protocols based on QSS and QKD, this protocol has four
times more capacity in hidden channels, thereby increasing the overall capacity of the channel. In the work [20], a quantum mechanical algorithm was proposed to perform three main operations: creating a configuration in which the amplitude of the system in any one of the 2^n states is identical, performing a Fourier transformation, and rotating the selected states by the intended angle. The paper presents a method with O(√n) time complexity for identifying a record present in a database with no prior knowledge of the structure in which the database is organized.
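As a small worked example of this square-root behaviour, the number of amplitude-amplification iterations commonly used in Grover's search is ⌊(π/4)√N⌋ for a database of N records with one marked item; the quick check below is illustrative only.

import math

def grover_iterations(n_items: int) -> int:
    """Approximately optimal number of Grover iterations for an unstructured
    search over n_items records with a single marked item."""
    return math.floor((math.pi / 4) * math.sqrt(n_items))

for n in (16, 1024, 1_000_000):
    print(n, grover_iterations(n))    # grows like sqrt(n): 3, 25, 785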
3 Conclusion
References
1. Venegas-Andraca SE, Bose S (2003) Storing, processing, and retrieving an image using
quantum mechanics. Proc SPIE 5101:1085–1090
2. Venegas-Andraca SE, Ball JL (2010) Processing images in entangled quantum systems. Quant
Inf Process 9(1):1–11
3. Latorre JI (2005) Image compression and entanglement, pp 1–4. Available https://arxiv.org/
abs/quant-ph/0510031
4. Zhang Y, Lu K, Gao Y, Wang M (2013) NEQR: a novel enhanced quantum representation of
digital images. Quant Inf Process 12(8):2833–2860
5. Liu X, Xiao D, Xiang Y (2019) Quantum image encryption using intra and inter bit permutation based on logistic map. https://doi.org/10.1109/ACCESS.2018.2889896
6. Le PQ, Dong F, Hirota K (2011) A flexible representation of quantum images for polynomial
preparation, image compression, and processing operations. Quant Inf Process 10(1):63–84
7. Qu Z, Cheng Z, Wang X (2019) Matrix coding-based quantum image steganography algorithm. IEEE Access. https://doi.org/10.1109/access.2019.2894295
8. Heidari S, Pourarian MR, Gheibi R, Naseri M, Houshmand M (2017) Quantum red–green–blue
image steganography. Int. J. Quant. Inf. 15(05):1750039. https://doi.org/10.1142/s02197499
17500393
9. Qu Z, Cheng Z, Liu W, Wang X (2018) A novel quantum image steganography algorithm based
on exploiting modification direction. Multimedia Tools Appl. https://doi.org/10.1007/s11042-
018-6476-5
10. Zhou R-G, Hu W, Fan P, Luo G (2018) Quantum color image watermarking based on Arnold
transformation and LSB steganography. Int. J. Quant. Inf. 16(03):1850021. https://doi.org/10.
1142/s0219749918500211
11. Le P, Iliyasu A, Dong F, Hirota K (2011) Strategies for designing geometric transformations
on quantum images. Theor. Comput. Sci. 412:1406–1418. https://doi.org/10.1016/j.tcs.2010.
11.029
12. Yan F, Le P, Iliyasu A, Sun B, Garcia J, Dong F, Hirota K (2012) Assessing the similarity of quantum images based on probability measurements. In: 2012 IEEE world congress on computational intelligence
13. Zhang T, Abd-El-Atty B, Amin M, Abd El-Latif A (2017) QISLSQb: a quantum image
steganography scheme based on least significant qubit. https://doi.org/10.12783/dtcse/mcsse2
016/10934
14. Jiang N, Wang L (2015) A novel strategy for quantum image steganography based on Moiré
pattern. Int J Theor Phys 54:1021–1032. https://doi.org/10.1007/s10773-014-2294-3
15. Qu Z, Li Z, Xu G, Wu S, Wang X (2019) Quantum image steganography protocol based on
quantum image expansion and grover search algorithm. IEEE Access 7:50849–50857. https://
doi.org/10.1109/access.2019.2909906
16. Zhang Y, Lu K, Gao Y, Xu K (2013) A novel quantum representation for log-polar images.
Quant Inf Process 12(9):3103–3126
17. EL-Latif AA, Abd-El-Atty B, Venegas-Andraca SE (2019) A novel image steganography tech-
nique based on quantum substitution boxes. Opt Laser Technol 116:92–102. https://doi.org/10.
1016/j.optlastec.2019.03.005
18. Luo J, Zhou R-G, Luo G, Li Y, Liu G (2019) Traceable quantum steganography scheme based
on pixel value differencing. Sci Rep 9(1). https://doi.org/10.1038/s41598-019-51598-8
19. Qu Z-G, Chen X-B, Zhou X-J, Niu X-X, Yang Y-X (2010) Novel quantum steganography
with large payload. Opt Commun 283(23):4782–4786. https://doi.org/10.1016/j.optcom.2010.
06.083
20. Grover L (1996) A fast quantum mechanical algorithm for database search. In: Proceedings of the 28th annual ACM symposium on theory of computing, pp 212–219
Smart On-board Vehicle-to-Vehicle
Interaction Using Visible Light
Communication for Enhancing Safety
Driving
Abstract Li-Fi technology has emerged as one of the sound standards of communication in which light sources such as LEDs and photodiodes are used as the data source. This technology is predominantly used in various modes for facilitating any type of data communication. In the field of automobiles, Li-Fi technology is highly essential for achieving vehicle-to-vehicle interaction in a smart environment. This smart communication finds its user-end application even at a traffic light control system. Both the transmitter and receiver sections take advantage of using an LED as the light source due to its fast switching nature, which allows the entire system to be realized at low cost and with greater efficiency. In this paper, an intelligent transport system is proposed using the Li-Fi technique, which is essentially visible light communication, for facilitating secured vehicle-to-vehicle interaction in a dynamic situation. The receiver design is robust and dynamic; it interprets the light waves transmitted from the other vehicle into data with the help of solar panels and amplifiers. The overall data throughput is good, and the system is found to be an appropriate replacement for typical RF communication systems in automobiles.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 247
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_19
1 Introduction
In 2004, Komine and Nakagawa [1] proposed a visible light communication technique that includes an adaptable light-dimming mechanism in vehicles, enabled mainly by the fast modulation of optical sources such as LEDs, together with the visible light communication standard (IEEE 802.15.7) for effective short-range wireless communication. Noof Al Abdulsalam et al. (2015) identified a novel approach for designing a Li-Fi module using normal LEDs for vehicular automation. Turan et al. [2] proposed a novel modulation coding scheme to achieve minimum bit error rate (BER) for low-latency secured communication using VLC. In 2017, Cailean and Dimian [3] addressed the challenges of implementing Li-Fi-based vehicle communication and analysed distance measurement and visible light positioning. Poorna Pushkala et al. [4] projected a solution for radio-frequency congestion in which reliable audio and image data were communicated using Li-Fi without involving microcontrollers and other peripheral devices. In 2018, Jamali et al. [5] proposed a methodology to avoid road accidents due to vehicle collision based on Li-Fi technology. Satheesh Kumar et al. [6] reviewed various advancements in recent automobiles that enhance vehicular communication for human-centred interactions and also deal with emotions conveyed through the driver's actions and gestures. Gerardo Hernandez-Oregon et al. (2019) analysed the performance of V2V and V2I communication by modelling the road infrastructure as a Markov process, which benefits accurate calculation of
data throughput. In 2020, Subha et al. [7] explained an OFDM-based Li-Fi architecture for 5G-and-beyond wireless communication that enhances attocells to improve wireless channel capacity. In 2017, Satheesh Kumar et al. [9] and in 2015, Christable Pravin et al. [8] explained automation techniques which can be incorporated into continuous speech recognition systems. In 2019, Ganeshprabhu et al. [11] discussed a solar-powered robotic vehicle, and in 2019, Sujin et al. [10] explained the impact of public e-health monitoring systems. As a matter of ensuring the safety of the individual, in 2018 Nagaraj et al. [12] proposed an alcohol-impaired vehicle tracking system using a wearable smart helmet. In 2015, Allin Christe et al. [12, 13] implemented a novel 2D wavelet transform approach for image retrieval and segmentation, which paved the way for effective motion capture while driving the vehicle. With the same intention, in 2020, Mazher Iqbal et al. [15] implemented an MWT algorithm for effective analysis of medical images. Similarly, in 2014, Satheesh Kumar et al. [14] proposed rapid expulsion of acoustic soft noise using the RAT algorithm, which was found to be effective for removing soft noise in images as well. To make the system more convenient, region-based scheduling can be practiced with certain sensor networks, as focused on by Karthik et al. [16] in 2019.
The basic functionality of the transmitter section is discussed here. The sensor module integrated with the transmitter section acquires the data from the vehicle being sensed. Due to the dynamic nature of the vehicle movement, the output of the sensing element is generally a fluctuating (AC) voltage. This is converted into a DC voltage level by the sensor module so that it can be read by the microcontroller unit. The microcontroller unit is a processing unit which compares the current data with the previous one and provides the output to the LED driver (Fig. 1).
Once the output reaches the LED driver circuit, the data is ready for transmission via wireless mode. The photodiode detects the light being transmitted and converts it into a current. The LCD displays the output appropriately. This approach will help reduce road accidents at least to some extent. The push buttons are generally used to take care of establishing contact between the different modules. The motor is interfaced with the brake shoe and other primary controlling units of the automobile. The usage of LEDs provides a simple transmitter module for facilitating the speed control of a vehicle for smart transport.
The receiver section has been implemented with the same set-up, where the LED blinking can be detected at frequencies above 1 kHz. The ultrasonic sensor detects the distance between the two vehicles. If the distance goes below the threshold range, which is generally the safe distance level, an appropriate alert is transmitted by the Arduino module. The collected data is processed by a PIC microcontroller unit for reliable control actions. The ultrasonic sensor unit is tuned for the safe distance, and violating that distance increases the chance of a collision with other vehicles. The Arduino Pro Mini module takes care of the other peripheral sensors for further processing of the sensed information (Fig. 2).
All these separate processes help in achieving a rapid response during an unsafe situation. The auxiliary systems help the receiver unit take appropriate actions and reduce the computation and processing burden on the PIC microcontroller. The Bluetooth system helps convey the state information to the driver and helps them handle the situation manually to some extent. If no action is taken within the allotted span of time, automatic actions are triggered by the microcontroller, which helps to avoid an accident.
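The receiver-side decision flow described above can be sketched in a few lines of Python; this is only an illustrative sketch, with assumed threshold and timeout values and stand-in functions for the sensor, alert, and brake interfaces, whereas the actual implementation runs as PIC/Arduino firmware.

    import time
    import random

    # Assumed values; the real thresholds are tuned on the vehicle.
    SAFE_DISTANCE_CM = 150      # minimum safe following distance
    DRIVER_TIMEOUT_S = 2.0      # time given to the driver before automatic action

    def read_distance_cm():
        # Stand-in for the ultrasonic sensor; returns a simulated distance.
        return random.uniform(50, 300)

    def send_alert(distance):
        # Stand-in for the Bluetooth/LCD alert to the driver.
        print(f"ALERT: vehicle ahead at {distance:.0f} cm - slow down")

    def apply_automatic_brake():
        # Stand-in for the microcontroller driving the brake-shoe motor.
        print("No driver response - automatic braking triggered")

    def receiver_loop(cycles=20):
        alert_started = None
        for _ in range(cycles):
            distance = read_distance_cm()
            if distance < SAFE_DISTANCE_CM:
                if alert_started is None:
                    alert_started = time.monotonic()
                    send_alert(distance)
                elif time.monotonic() - alert_started > DRIVER_TIMEOUT_S:
                    apply_automatic_brake()
                    alert_started = None
            else:
                alert_started = None   # safe again, clear any pending alert
            time.sleep(0.1)

    if __name__ == "__main__":
        receiver_loop()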
4 Experimental Results
The entire system is connected in a coherent fashion to handle the major tasks that the V2V system takes on while controlling actions. The user remains the decision-maker and can choose the option that is most appropriate for the situation when accessing the vehicle.
The entire set-up is well connected, and the on-board Bluetooth allows information to be transferred over a short range, preferably inside the vehicle, so that all devices remain connected. The LCD module indicates the information that actually needs to be displayed.
Manual control is essentially required to control the vehicle actions based on the level of comfort. Special buttons can be provided to perform switching actions.
Through VLC, data, video, and sound are transmitted over light fidelity. Visible light communication (VLC) operates by modulating the current to the LEDs at a very high rate, too fast for the naked eye to perceive, so there is no flickering. Although Li-Fi LEDs must be kept on to transmit information, they can be dimmed to below the level of human vision while still radiating enough light for information to be transmitted. The technology is also significant because it relies on the visible spectrum, which is largely unregulated and not allocated to mobile communication. Techniques that allow roaming between different Li-Fi cells, also known as handover, make it possible for Li-Fi connections to switch seamlessly. The light waves cannot penetrate walls, which results in a much shorter range, but they are also more resistant to hacking than Wi-Fi. For Li-Fi, a direct line of sight is not essential; about 70 Mbit/s can be obtained from light that reflects off the walls (Fig. 3).
Li-Fi is a derivative of optical wireless communication, using light emitted from light-emitting diodes (LEDs) as an organized, flexible, and fast link in a manner similar to Wi-Fi. The Li-Fi market was expected to grow at an annual rate of 82% from 2013 to 2018 and to amount to more than 6 billion dollars per year by 2018. However, the market has not developed into a speciality market of this size, and Li-Fi has remained primarily in the domain of research and evaluation (Fig. 4).
Such V2V frameworks are required because humans can commit errors while driving, which can cause accidents, and they are valuable for making the journey effective and secure (Figs. 5, 6 and 7).
Compared with traditional lighting such as fluorescent and incandescent lamps, LED lighting is up to 80% more efficient. About 95% of the energy in LEDs is converted into light, and only 5% is lost as heat.
This is in contrast to fluorescent lights, which convert about 95% of the energy into heat and only 5% into light. LEDs are extremely energy efficient and consume up to 90% less power than incandescent bulbs. Since LEDs use only a small fraction of the energy of an incandescent light, there is a dramatic decline in power costs.
The colour rendering index (CRI) is an estimate of a light's ability to reveal the actual colour of objects when compared with an ideal light source (natural light). A high CRI is generally a desirable characteristic (although, of course, it depends on the required application). LEDs generally have high (good) ratings when it comes to CRI.
Perhaps the best way to appreciate CRI is to look at a direct comparison between LED lighting (with a high CRI) and a traditional lighting arrangement such as sodium vapour lamps (which generally have poor CRI ratings and are at times practically monochromatic). See the accompanying picture to compare and analyse the two cases (Fig. 8).
5 Conclusion
The work targets designing a model for the transfer of data from the vehicle in front to the one behind. This can be controlled remotely by means of an application that provides switch-mode features; the application runs on an Android device. The framework can be used in a wide range of areas and, integrated with various features, can be applied in the following fields.
• Vehicles can be safely rerouted to alternative streets and routes, which will significantly reduce traffic congestion.
• Owing to features such as the early accident alert, which lets vehicles share speed, heading, and location with one another, there will be fewer accidents.
• Because of lower congestion and less time spent in traffic, the pollution caused by the vehicle will be lower as well.
• The technology can be integrated into the vehicle during its original production and frequently provides both audible and visual alerts about possible issues with the vehicle or its surroundings.
• The technology can also be added after the original assembly; aftermarket devices are ordinarily not as fully integrated as those applied during production. V2V aftermarket devices can be installed by dealers or authorized vendors, and other aftermarket devices could be independent, portable units that can be carried by the passenger or driver.
• Some devices are based on road infrastructure items, for example, street signs and traffic signals. Vehicles would be able to receive data from infrastructure devices, which will help prevent accidents and provide environmental benefits; this communication process is called V2I for short. This kind of communication could give a warning when a vehicle runs a red light or a stop sign, travels at excessive speed, enters a reduced-speed zone, enters an area with sudden weather changes, and similar situations.
At present, the application is made for Android smartphones; other OS platforms do not support our application. Looking at the current situation, a cross-platform framework that can be deployed on other platforms such as iOS and Windows can be built.
References
12. Nagaraj J, Poongodi P, Ramane R, Rixon Raj R, Satheesh Kumar S (2018) Alcohol impaired
vehicle tracking system using wearable smart helmet with emergency alert. Int J Pure Appl
Math 118:1314–3395
13. Allin Christe S, Balaji M, Satheesh Kumar S (2015) FPGA implementation of 2-D wavelet transform of image using Xilinx system generator. Int J Appl Eng Res 10:22436–22466
14. Satheesh Kumar S, Prithiv JG, Vanathi PT (2014) Rapid expulsion of acoustic soft noise for
noise free headphones using RAT. Int J Eng Res Technol 3:2278–0181
15. Mazher Iqbal JL, Narayan G, Satheesh Kumar S (2020) Implementation of MWT algorithm
for image compression of medical images on FPGA using block memory. Test Eng Manag
83:12678–12685
16. Karthik V, Karthik S, Satheesh Kumar S, Selvakumar D, Visvesvaran C, Mohammed Arif
A (2019) Region based scheduling algorithm for Pedestrian monitoring at large area build-
ings during evacuation. In: International conference on communication and signal processing
(ICCSP). https://doi.org/10.1109/ICCSP.2019.8697968
A Novel Machine Learning Based
Analytical Technique for Detection
and Diagnosis of Cancer from Medical
Data
Abstract Cancer is the most dreadful disease and has been affecting the human race for decades. Cancer is of different types, which have been affecting people to the utmost saturation and ruining thousands of lives per year. The most common cancers are breast cancer in females and lung cancer in males. Globally, breast cancer has turned into a cyclone of a disease due to which many females, particularly of middle age (30–40 years), are affected. Cancer has proved to be a nationwide disease due to which millions of lives are being lost, as recorded by the National Cancer Registry programme of the Indian Council of Medical Research (ICMR); that is, more than 13,000 individuals lose their life every day. Breast cancer contributes to the majority of deaths of many women, and the maximum percentage, i.e., 60% of the affected women, are declared dead due to breast cancer. In this paper, our main point of implementation is to develop more precise and more accurate techniques for the diagnosis and detection of cancer. Machine learning algorithms have been taken into account for the betterment and advancement of the medical field. Support vector machine, naïve Bayes, KNN, decision tree, etc. have been used for classification, as these are various types of machine learning algorithms.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 259
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_20
Fig. 1 Comparison of mammogram images of normal and abnormal breast
models which can perform a certain task without a human explicitly programming them to do so [2]. For example, if a person X listens to a piece of music A on YouTube and hits the like button, then similar music is suggested whenever he or she listens again. This high pace and smooth experience are provided only due to machine learning (Fig. 1).
Machine learning has evolved the way of treating and diagnosing breast cancer in the most appropriate manner through the utilization of its various algorithms [3]. It has proven to be an essential tool for early detection and for classifying breast cancer based on risk factor extremities. A number of machine learning algorithms, such as neural networks, Bayesian networks, decision trees, and support vector machines, are taken into account on a large scale and have proved to be an asset, as they allow doctors to detect cancer and classify it according to the extremities of its different stages.
2 Related Work
Breast cancer is one of the most dangerous cancers; it has taken lakhs of lives and affected the women population the most. A comparison of different types of algorithms such as support vector machine, ART, naïve Bayes, and K-nearest neighbor is presented in the literature. In [4], the algorithm used is the support vector machine on the Wisconsin breast cancer database, where the KNN technique gives the most accurate results. Other algorithms also performed well in this experiment. SVM is a strong technique, and when used with a Gaussian kernel it is the most appropriate technique for predicting the occurrence and non-occurrence of breast cancer. The SVM used is applicable only when the class variable is binary [4].
In [5], the authors focus on proposing an adaptive ensemble voting method for diagnosing breast cancer. The aim is to do a comparative study and explain how ANN and logistic algorithms, when combined with ensemble machine learning algorithms, provide output with more accurate figures. The dataset used here is the Wisconsin breast cancer database.
3 Methodology
In this research, various machine learning techniques have been surveyed and analyzed for diagnosing medical data, and the following techniques have been found to be very beneficial for it.
3.1 Support Vector Machine
Support vector machine (SVM) is an efficient and accurate machine learning algorithm [1]. The main role of SVM is to help minimize the upper bound on the generalization error by increasing the margin between the separating hyperplane and the dataset. It automatically reports the error, and it is capable of performing linear or nonlinear classification [10]. It helps in the early detection of breast cancer. Since its invention, SVM has played an essential role and has proven to work efficiently (with an efficiency of 98.5%) in the field of medical data; it is capable of classifying up to 90% of amino acids of different compounds due to its high efficiency, and it has the potential of detecting various cancers at their initial stage [11]. When machine learning algorithms are applied in the medical field, they have always come up with high accuracy and help the population to detect the cancer stages as early as possible, saving thousands of lives across the globe.
3.2 Naïve Bayes
Naïve Bayes is also a type of machine learning algorithm and is one of the most effective and simple probabilistic classifiers; it works on the principle of Bayes' theorem with strong independence assumptions [12]. Despite being simple, naïve Bayes has proven to be a very precise method in medical data mining. The data requirements of naïve Bayes are modest, as only a small amount of data is required for the detection and diagnosis of breast cancer [2]. One of the most important facts is that naïve Bayes mainly focuses on providing decisions based on the available data. To provide results to the maximum of their capability, these classifiers take into account all the necessary and important approaches to give out the results more transparently [13].
3.3 PCA
Cancer is one of the most dreaded diseases, due to which millions of lives are being lost with each passing year. Cancerous cells are cells which have lost the property of contact inhibition; i.e., unlike normal cells, they keep dividing mitotically even when in contact with their neighboring cells, giving rise to cancerous or tumor cells. A cancer gene is known as an oncogene, and its cells are called onco-cells. Cancer cells often form lumps or show unregulated growth, which turns out to be dangerous for the whole body. Tumors are mainly of two types: (i) benign tumors and (ii) malignant tumors (Fig. 2).
Among the male population, lung cancer is the most common, whereas in the female population, breast cancer is the most dreaded one, taking millions of lives of females all over the globe. Breast cancer is cancer of the breast in which the alveolar lobes, which are 15–20 in number, become cancerous; in extreme cases, the entire affected breast needs to be removed so that the cancer does not spread further. Breast cancer comprises four stages. Diagnosis is done by biopsy, in which a piece of tissue is cut and cultured in culture media for the detection of cancerous cells.
Alpha interferon is available, which, along with chemotherapy and radiation, is used for treatment. Nowadays, many tablets are being developed using the principles of genetic engineering for curing this disease, but a finalized formulation has not yet been discovered that can be used as a substitute for painful chemotherapy. So, to deal with this, machine learning algorithms are being tried to cope with the uncertainties which arise during the treatment of cancer. Chemotherapy is a treatment procedure in which a certain drug is injected to destroy the cancerous cells.
With the use of machine learning algorithms, it is possible to distinguish the risk factors related to the various stages of breast cancer [15]. These technologies have provided results that make it much easier to describe the four stages of breast cancer depending upon the severity and tumor size. These four stages are shown in Fig. 3.
This stage indicates the tumor size. The survival rate is 100% in this situation; it is the earliest indication of cancer developing in the body, and it does not require much initiative to get cured. It does not comprise any invasive carcinoma.
It is also known as the invasive stage. In this stage, the tumor size is not large, but the cancerous cells have broken into the fatty adipose tissue of the breast. It comprises two sub-stages:
Stage 1A
Stage 1B
The tumor size is 2 cm or smaller; the tumor may not be seen in the breast lobules but is seen in the lymph nodes, with a size of around 2 mm.
In this stage, the tumor grows and starts spreading to associated organs. The tumor size remains small, but it starts growing and its shape becomes like that of a walnut. It comprises two sub-stages:
4.1.4 Stage 2A
The tumor size is 2 cm or smaller, but 1–3 affected lymph nodes can be found under the arm.
4.1.5 Stage 2B
The tumor size is larger than 2 cm but less than 5 cm, and the cancer cells have reached the internal breast, i.e., the mammary gland, and the axillary lymph nodes.
5 Stage Three
In this stage, the size of the tumor is quite prominent; it does not spread to organs and bone but starts to spread to 9–10 more lymph nodes. This stage is very hard to fight for a patient undergoing treatment. It comprises two sub-stages:
5.1 Stage 3A
The tumor size is larger than 5 cm, and the cancer has reached 4–9 axillary lymph nodes.
5.2 Stage 3B
The tumor is 10–20 µm. This is the last stage, and survival is very low because the tumor cells have spread to organs other than the breast; hence, this stage is known as the metastatic stage (Fig. 4).
6 Proposed Methodology
In this review paper, the dataset used has been assembled from different sources of data, retrieving information such as the radius of the tumor, the stage of cancer, and the diagnosis techniques, such as mammogram and PET scan, which have been used to detect cancer at the earliest possible stage; the outcomes are based on these techniques.
Mammogram: This technique is the most crucial step in breast cancer detection, since it reveals the early symptoms and signs of breast cancer; it is a type of X-ray of the breast. In this technique, the breast is placed on a plate which scans it and reveals the cancer symptoms or lumps formed [16] in the breast (Fig. 5).
The proposed framework consists of four steps: data processing, data classification, performance evaluation, and results.
Data processing: This is the step in which the data is processed; it mainly emphasizes handling missing values, noise reduction, and picking relevant data.
Data classification: In this step, the data is categorized based on different machine learning algorithms: support vector machine, naïve Bayes, and decision tree.
Performance evaluation: In this step, the processed data is evaluated; the machine learning algorithms are applied to it, and the data which is more accurate and efficient is selected for further medical diagnosis [18].
The evaluation uses four metrics, i.e., accuracy, precision, recall, and F1-measure.
Accuracy = (TP + TN)/(TP + TN + FP + FN)
Precision = TP/(TP + FP)
Recall = TP/(TP + FN)
F1-measure = (2 × Precision × Recall)/(Precision + Recall)
These instances are defined as TP as true positive, TN as true negative, FP as false
positive and FN as false negative [19].
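As a quick illustration of these formulas, the short sketch below computes the four metrics from confusion-matrix counts; the counts shown are arbitrary example values, not results reported in this paper.

    def evaluate(tp, tn, fp, fn):
        # Metrics defined above, computed from confusion-matrix counts
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        return accuracy, precision, recall, f1

    # Hypothetical counts: 88 true positives, 100 true negatives, 5 false positives, 7 false negatives
    print(evaluate(tp=88, tn=100, fp=5, fn=7))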
Results—This step provides us with the output that can be applied to the analysis
of the whole data.
Dataset
In this research paper, the dataset taken into account is a breast cancer dataset extracted from the UCI machine learning repository. In the given dataset, there are 569 instances, which have been categorized as benign and malignant, and 30 attributes have been used.
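scikit-learn ships a copy of this 569-instance, 30-attribute Wisconsin diagnostic breast cancer dataset, so an experiment along the lines described above can be sketched as follows; the classifier settings are illustrative assumptions, not the exact configuration used in this paper.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC
    from sklearn.naive_bayes import GaussianNB
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score, precision_score, recall_score

    # 569 instances, 30 attributes, labelled benign/malignant
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)

    models = {
        "SVM (Gaussian/RBF kernel)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Naive Bayes": GaussianNB(),
        "Decision tree": DecisionTreeClassifier(random_state=42),
    }

    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        print(f"{name}: accuracy={accuracy_score(y_test, y_pred):.3f}, "
              f"precision={precision_score(y_test, y_pred):.3f}, "
              f"recall={recall_score(y_test, y_pred):.3f}")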
7 Results
This research is based on the use of various machine learning algorithms applied to the required data to extract the required output, i.e., more accurate and precise diagnosis and detection of breast cancer at an early stage (Table 1).
8 Conclusion
This research paper attempts to identify the most appropriate methodology for the diagnosis and detection of breast cancer with the support of machine learning algorithms such as support vector machine, naïve Bayes, decision tree, and PCA. The main point of focus is the prediction of early stages of cancer with the use of the most efficient and precise algorithms. It is hereby concluded that PCA outshines the other machine learning algorithms with an accuracy rate of 98.03%, recall of 97.89%, and precision of 97.74%. Further areas of improvement have also been identified, and PCA could therefore work even better with a 1–2% improvement in its methodology.
References
1. Ray S (2019) A quick review of machine learning algorithms. In: 2019 International conference
on machine learning, big data, cloud and parallel computing (COMITCon), Faridabad, India,
pp 35–39
2. Mello F, Rodrigo P, Antonelli M (2018) Machine learning: a practical approach on the statistical
learning theory
3. Zvarevashe K, Olugbara OO (2018) A framework for sentiment analysis with opinion mining
of hotel reviews. In: 2018 conference on information communications technology and society
(ICTAS), Durban, pp 1–4
4. Bharat A, Pooja N, Reddy RA (2018) Using machine learning algorithms for breast cancer risk prediction and diagnosis. In: Third international conference on circuits, controls, communication and computing (I4C), pp 1–4
5. Khuriwal N, Mishra N (2018) Breast cancer diagnosis using ANN ensemble learning algorithm. In: 2018 IEEMA engineer infinite conference (eTechNxT), pp 1–5
6. Bazazeh D, Shubair R (2016) Comparative study of machine learning algorithms for breast cancer detection. In: 2016 fifth international conference on electronic devices, systems and applications (ICEDSA), pp 1–4
7. Khourdifi Y, Bahaj M (2018) Applying best machine learning algorithms for breast cancer prediction and classification. In: 2018 international conference on electronics, control, optimization and computer science (ICECOCS), pp 1–5
8. Bayrak EA, Kırcı P, Ensari T (2019) Comparison of machine learning methods for breast cancer
diagnosis. In: 2019 Scientific meeting on electrical-electronics & biomedical engineering and
computer science (EBBT), Istanbul, Turkey, pp 1–3
9. Chandy A (2019) A review on IoT-based medical imaging technology for healthcare applications.
J Innov Image Process (JIIP) 1(01):51–60
10. Potdar K, Kinnerkar R (2016) A comparative study of machine learning algorithms applied to predictive
breast cancer data. Int. J. Sci. Res. 5(9):1550–1553
11. Huang C-J, Liu M-C, Chu S-S, Cheng C-L (2004) Application of machine learning techniques
to web-based intelligent learning diagnosis system. In: Fourth international conference on hybrid intelligent systems (HIS'04), Kitakyushu, Japan, pp 242–247. https://doi.org/10.1109/
ICHIS.2004.25
12. Ray S (2019) A quick review of machine learning algorithms. In: 2019 International conference
on machine learning, big data, cloud and parallel computing (COMITCon), Faridabad, India,
pp 35–39. https://doi.org/10.1109/COMITCon.2019.8862451
13. Seref B, Bostanci E (2018) Sentiment analysis using naïve Bayes and complement naïve Bayes classifier algorithms on Hadoop framework. In: 2018 2nd international symposium on multidisciplinary studies and innovative technologies (ISMSIT), Ankara, pp 1–7
14. Li N, Zhao L, Chen A-X, Meng Q-W, Zhang G-F (2009) A new heuristic of the decision tree
induction. In: 2009 International conference on machine learning and cybernetics, Hebei, pp
1659–166
15. Kurniawan R, Yanti N, Ahmad Nazri MZ, Zulvandri (2014) Expert systems for self-diagnosing
of eye diseases using Naïve Bayes. In: 2014 International conference of advanced informatics:
concept, theory and application (ICAICTA), Bandung, pp 113–116
16. Pandian AP (2019) Identification and classification of cancer cells using capsule network with
pathological images. J Artif Intell 1(01):37–44
17. Vijayakumar T (2019) Neural network analysis for tumor investigation and cancer prediction.
J Electron 1(02):89–98
18. Rathor S, Jadon RS (2018) Domain classification of textual conversation using machine learning approach. In: 2018 9th international conference on computing, communication and networking
technologies (ICCCNT), Bangalore, pp 1–7
19. Douangnoulack P, Boonjing V (2018) Building minimal classification rules for breast cancer
diagnosis. In: 2018 10th international conference on knowledge and smart technology (KST),
Chiang Mai, pp 278–281
Instrument Cluster Design for an Electric
Vehicle Based on CAN Communication
Abstract Electric vehicles are the need of the hour due to the prevailing global
conditions like global warming and increasing pollution levels. For a driver, controlling an EV is the same as controlling a conventional IC engine automobile. Similar to the instrument cluster of a conventional vehicle, an EV also has an instrument cluster that acts as an interface between the human and the machine. However, the latter displays more critical parameters that are essential for controlling the EV. This paper deals with
the development of EV instrument cluster that would display vital parameters by
communicating with different ECUs of the vehicle using industrial standard CAN
bus. Speedometer and odometer details are displayed on a touch screen panel which
is designed with a user-friendly interface. Python-based GUI tools are used to design
the interface.
1 Introduction
Global warming has become a serious threat to the existence of human beings. One
of the main reasons for global warming is carbon dioxide (CO2 ) emission through
various man-made sources. One such man-made source is the internal combustion
(IC) engine that powers a variety of automobiles worldwide. Electric vehicles (EVs)
of different types are replacing IC engine vehicles. The different types of EVs are
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 271
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_21
battery electric vehicles (BEV), hybrid electric vehicles (HEV), and plug-in hybrid
electric vehicles (PHEV). BEV is the main focus area as it involves core electrical
and electronic components. EVs also come in four-wheelers and two-wheelers. Two-
wheeler EV is the main area of focus.
The main components of EVs are the drive motor, battery, and many control units
like electronic controller unit (ECU), battery management system (BMS), instrument
cluster which work in harmony to make EV a working engineering marvel. Instrument
cluster is one of the most important parts of a vehicle as it acts as an interface between
the human and the machine. Based on the information given by the instrument cluster,
the driver can take the necessary decisions and actions. So, a proper interface between
the user and the machine is required. On the motor control front, there are various
methods to control the drive motor efficiently. Motor control is the heart of the EV.
Motor control involves choosing the appropriate control strategy, implementing the
same using microcontroller and integrating motor and inverter ensuring proper basic
functionality of an EV.
The instrument cluster is the main focus. In the early days, mechanical instrument
clusters were used. These clusters used gear arrangements to infer speed, and they displayed it using a needle. The accuracy of these clusters was low, and they were prone to damage. These drawbacks gave way to electronic clusters, which are very accurate and far less prone to damage. Instrument cluster design involves
the calculation of the data to be displayed in the instrument cluster, development of
user interface, and programming of the microcontroller for the instrument cluster.
There is a need to establish communication between the two control units, as data is calculated in one unit and displayed in the other. There are many communication protocols.
CAN bus protocol, which is the industry standard communication, is used to estab-
lish communication. Parameters such as distance covered and speed of the vehicle
are displayed. The objective of the project can be divided into three areas—motor
control, instrument cluster design, and communication between the ECUs.
The BLDC motor is preferred over the DC motor because the BLDC motor employs electronic commutation, which avoids the wear and tear of the machine, unlike the mechanical commutation that the DC motor employs. There are various control
methods present for BLDC motor for reducing the torque ripple such as sinusoidal,
trapezoidal, and field-oriented control. The back electromotive force (back-emf)
waveform of a permanent magnet brushless hub motor is nearly sinusoidal, which
is suitable for sinusoidal current control. To minimize the torque ripple, the field-
oriented control based on hall-effect sensors is applied. The authors D. Li and D.
Gu propose the field-oriented control for BLDC motor for four-wheel drive vehicles.
The rotor position is estimated through the interpolation method using the signals of
the hall-effect sensors [1]. A semantic scholar S. Wang has written about the torque
ripple reduction in his paper using modified sinusoidal PWM. To reduce the torque
The overall system is designed to do the following tasks as discussed in the objective:
• BLDC motor control
• Display of vehicle parameters in a display
• CAN bus communication between ECUs
Battery, inverter, and motor (along with sensors) form the part of the system which is
readily present in the EV (Fig. 1). To start the work, the first part is to make the motor
work and is achieved using the motor control unit (MCU). The input to this MCU is the
Hall-effect sensor of the BLDC motor. The output goes as PWM signals to the three-
phase Inverter. The calculations of speed and distance travelled are computed here.
All these ECUs should be communicating in real time to have a real-time dashboard.
There are two ECUs in the design which needs to be communicating with each other.
CAN bus is the standard communication protocol used in vehicles. Therefore, CAN
bus is chosen. The computed values are communicated to the instrument cluster
controller-Raspberry pi. A digital display is interfaced with the pi, which acts as the
dashboard. The required components and its specifications are listed (Table 1).
Fig. 3 Basic UI
Fig. 4 Graphical UI
For speed control of BLDC motor, there are three methods namely trapezoidal
control, sinusoidal control, and field-oriented control. In trapezoidal control, the
stator poles are excited based on a commutation logic which will be described in
detail in the later part of this section. The pulses are chopped into smaller pulses, and based on the speed, the width of the smaller pulses is varied. This method is comparatively easier to implement than the other methods. This method
requires the position of the rotor to be known to excite the stator. This method gives
ripples in torque. This method is called trapezoidal control as the back-emf has
the shape of a trapezium. In sinusoidal control, back-emf is made to resemble the
sinusoids. Thus, the motor coils are excited by three sinusoids, each phase shifted by
120°. In this method, torque ripples are reduced. This method requires the position of the rotor to be known to an accuracy of 1°. Therefore, this method is complex to implement, as many estimations have to be done.
The third method is field-oriented control. The main principle of this method is
that the torque of a motor is at a maximum when the stator and the rotor magnetic
fields are orthogonal to each other. This method tries to maintain both the magnetic
fields perpendicular to each other. This is done by controlling the direct-axis and quadrature-axis currents of the motor. These currents are derived from the line currents using the Clarke and Park transformations. This method provides the best torque results but is very complex to implement. In this work, trapezoidal control is implemented.
An electric vehicle is operated using a three-phase inverter. The output from the inverter is connected to the three input terminals of the motor. In Fig. 2, each motor phase is represented as a series combination of resistance, inductance, and back-emf. The gating pulses for the inverter are given from the microcontroller by sensing the position of the rotor with the help of the Hall-effect sensors. The stator phases (A, B, C) are phase shifted by 120° from one another, and at each phase there is one Hall-effect sensor embedded in the stator. A Hall-effect sensor can sense the presence of the rotor if the rotor is within 90° to the left or 90° to the right of the sensor position, so each sensor conducts for 180°. Suppose the rotor is at phase A; hall B and hall C cannot sense the presence of the rotor, since the rotor is not within 90° of their positions, and only hall A can sense it, so the logic for the Hall sensors will be 100. Each sensor is active for 180° in each cycle. As can be seen, the back-emf generated is trapezoidal in shape, and the phase voltage is varied accordingly by taking the Hall-effect sensor states. There are six different states. Based on the Hall-effect sensor outputs, the
required phase of the BLDC motor is excited to move the motor forward. Thus, the following excitation table is obtained (Table 2).
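Although Table 2 itself is not reproduced here, a typical six-step commutation map for this kind of sensor arrangement can be sketched as follows; the exact Hall-state-to-phase assignment is an assumption and depends on the motor wiring and sensor placement.

    # Hypothetical six-step commutation table: Hall state (hall A, hall B, hall C)
    # -> (high-side phase, low-side phase); the third phase is left floating.
    COMMUTATION = {
        (1, 0, 0): ("A", "B"),
        (1, 1, 0): ("A", "C"),
        (0, 1, 0): ("B", "C"),
        (0, 1, 1): ("B", "A"),
        (0, 0, 1): ("C", "A"),
        (1, 0, 1): ("C", "B"),
    }

    def next_excitation(hall_state):
        """Return which phases to drive for the given Hall-sensor reading."""
        try:
            return COMMUTATION[hall_state]
        except KeyError:
            # (0,0,0) and (1,1,1) are invalid states, e.g. a disconnected sensor
            raise ValueError(f"invalid Hall state: {hall_state}")

    print(next_excitation((1, 0, 0)))   # -> ('A', 'B')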
Model-based development is an embedded software approach in which a model is used to verify the control requirements and the code that runs on the target electronic hardware. When software and hardware implementation requirements are included, code for embedded deployment can be generated automatically, saving time and avoiding manually coded errors. There is no need to write code manually; the controller code is regenerated automatically. Model-based development can result in average cost savings of 25–30% and time savings of 35–40% [7]. Instead of hand-coding a microcontroller to generate the PWM signals for the inverter switching operation, here the STM32F4 microcontroller is programmed through model-based development (Fig. 2).
In closed-loop speed control, the actual speed is fed back and compared with the reference speed. The resulting error can be reduced by tuning the PI controller accordingly. The output of the PI controller is given to the PWM generator, which generates a pulse based on the duty ratio. When this pulse is ANDed with the gating pulses of the inverter, PWM pulses are obtained. By generating PWM signals, the
average voltage that is applied to the motor can be varied. As the voltage applied to
the motor changes, speed also changes. If the average voltage applied to the motor
increases, then speed also increases and vice-versa.
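A discrete-time sketch of this closed speed loop is shown below; the gains, sampling time, and duty-cycle limits are illustrative assumptions, since the actual controller runs as a Simulink/Waijung model on the STM32F4.

    class PIController:
        """Simple discrete PI controller producing a PWM duty ratio (0..1)."""
        def __init__(self, kp, ki, dt, out_min=0.0, out_max=1.0):
            self.kp, self.ki, self.dt = kp, ki, dt
            self.out_min, self.out_max = out_min, out_max
            self.integral = 0.0

        def update(self, ref_speed, actual_speed):
            error = ref_speed - actual_speed
            self.integral += error * self.dt
            duty = self.kp * error + self.ki * self.integral
            # Clamp the duty ratio and apply basic anti-windup
            if duty > self.out_max:
                duty, self.integral = self.out_max, self.integral - error * self.dt
            elif duty < self.out_min:
                duty, self.integral = self.out_min, self.integral - error * self.dt
            return duty

    # Illustrative use: reference from the potentiometer, feedback from the Hall sensors
    pi = PIController(kp=0.01, ki=0.05, dt=0.001)
    print(pi.update(ref_speed=300.0, actual_speed=250.0))  # duty ratio for the PWM generator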
To generate PWM signals, the first step is creating a simulation in MATLAB.
However, MATLAB as such does not support the STM32F4 board. So, it is necessary to install software which allows the STM32F4 to interface with the PC. Waijung is a blockset which interfaces the STM32F4 Discovery board with MATLAB. After installing the Waijung
module blocks in MATLAB, run and build the simulation. Once the simulation
builds, waijung target setup automatically generates the code, and it is dumped into
the board from which gating signals are taken. Hence, without writing code manu-
ally, MATLAB automatically generates the code using waijung blockset. Thus, the
STM32 discovery board was programmed using waijung blockset and MATLAB to
implement the above speed control method.
The motor always needed an initial push to start, so in order to avoid this, various methods were discussed.
The second method addressed the fact that the inverter is triggered based on the position of the rotor. As the rotor position cannot be obtained accurately to 1° precision, the triggering of the inverter always had an uncertainty in exciting the correct phases. So, the
main aim was to generate the Hall-effect sensor output pulses manually with the
same lag and correct time period based on the speed. The main aim of this was to
excite the motor in a way to make it move initially and then give the triggering based
on the original Hall-effect sensors present in the motor. This method was done and
the motor started to run in slow speed just enough for the Hall-effect sensors to give
proper output, and then the Hall-effect sensors took care of the normal running. The
third method was to the fact that there was a lag between Hall-effect sensors input
and triggering pulses. So to improve that, the sampling time of the board as set in
the waijung blockset was reduced even further. This method was also tried, and this
method drastically improved the response of the whole system. The second and third
methods were implemented in the STM32 discovery board using the MATLAB and
waijung blockset.
One of the main computations to be done is speed. RPM data acquired from motor
control unit is converted to speed by considering the diameter of the wheel. The main
approach to get the RPM is based on the frequency of the Hall-effect sensor pulses. The relation between the frequency and the RPM of the motor is obtained, and this relational constant is used to get the RPM from the frequency of the Hall-effect pulses. Then, the RPM is converted to kilometers per hour (kmph) using a formula.
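These two conversions can be sketched as below; the pole-pair count and wheel diameter are assumed example values, since the paper does not list the actual motor and wheel parameters.

    import math

    POLE_PAIRS = 15          # assumed hub-motor pole pairs
    WHEEL_DIAMETER_M = 0.40  # assumed wheel diameter in metres

    def hall_freq_to_rpm(hall_freq_hz):
        # One electrical cycle per pole pair; 60 seconds per minute
        return 60.0 * hall_freq_hz / POLE_PAIRS

    def rpm_to_kmph(rpm):
        # Distance per revolution is the wheel circumference (pi * D metres)
        return rpm * math.pi * WHEEL_DIAMETER_M * 60.0 / 1000.0

    rpm = hall_freq_to_rpm(100.0)
    print(rpm, rpm_to_kmph(rpm))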
Main objective of the project is to create a useful user interface (UI) for the driver
to get the required data from the vehicle. Raspberry Pi is used as the controller for
the instrument cluster. For developing the UI, Python language is used. Software
tool mainly used to design the interface is the Python 3 IDLE. Tkinter is the Python
module used for building the UI. A meter class with a basic structure of gauge is
formed. The class has the definitions for creating the structure of a gauge, moving
the meter needle as per the value, setting the range of the meter and other display
specifications such as height and width. Then, Tkinter is used to create a canvas,
and two objects of the class meter are placed for displaying RPM and speed. It is an
analog type meter which forms the basic structure for displaying of speed and RPM
(Fig. 3).
Here, two gauges were developed separately for speed and RPM. Initial values given in the Python program are represented by the needle.
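The meter class itself is not listed in the paper; a minimal re-creation of the idea (an analog gauge drawn on a Tkinter canvas with a movable needle) might look like the sketch below, where the ranges and example values are assumptions.

    import math
    import tkinter as tk

    class Meter:
        """Minimal analog gauge drawn on a Tkinter canvas (hypothetical re-creation)."""
        def __init__(self, canvas, cx, cy, radius, min_val=0, max_val=120, label=""):
            self.canvas = canvas
            self.cx, self.cy, self.r = cx, cy, radius
            self.min_val, self.max_val = min_val, max_val
            canvas.create_oval(cx - radius, cy - radius, cx + radius, cy + radius, width=2)
            canvas.create_text(cx, cy + radius * 0.6, text=label)
            # Needle drawn as a line from the centre; moved by set()
            self.needle = canvas.create_line(cx, cy, cx, cy - radius * 0.8, width=3, fill="red")
            self.set(min_val)

        def set(self, value):
            value = max(self.min_val, min(self.max_val, value))
            frac = (value - self.min_val) / (self.max_val - self.min_val)
            angle = math.radians(225 - 270 * frac)   # 270-degree sweep
            x = self.cx + self.r * 0.8 * math.cos(angle)
            y = self.cy - self.r * 0.8 * math.sin(angle)
            self.canvas.coords(self.needle, self.cx, self.cy, x, y)

    if __name__ == "__main__":
        root = tk.Tk()
        root.title("Instrument cluster (sketch)")
        canvas = tk.Canvas(root, width=520, height=280, bg="white")
        canvas.pack()
        speed = Meter(canvas, 140, 140, 110, 0, 120, "Speed (km/h)")
        rpm = Meter(canvas, 390, 140, 110, 0, 1500, "RPM")
        speed.set(42)    # example values; in the real cluster these come over CAN
        rpm.set(600)
        root.mainloop()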
The initial UI design was very basic. So, to make the UI more graphical and
look better, various Python packages were searched. An interactive graphing library
called Plotly has many indicators which were similar to the requirement. So, the
Plotly library was used and the below UI with two meters for speed and RPM, and
a display for distance was formed. This module displays in a Web browser (Fig. 4).
Here, two gauges were developed using the Plotly module in the Python program. This presents the speed, RPM, and distance in a more graphical manner than the previous one.
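A sketch of such a Plotly-based dashboard is given below; the ranges and values are placeholders, and in the actual cluster they would be refreshed from the CAN data.

    import plotly.graph_objects as go

    # Hypothetical values; in the cluster they are refreshed from the CAN messages
    speed_kmph, rpm, distance_km = 42.0, 600, 12.8

    fig = go.Figure()
    fig.add_trace(go.Indicator(
        mode="gauge+number", value=speed_kmph,
        title={"text": "Speed (km/h)"},
        gauge={"axis": {"range": [0, 120]}},
        domain={"row": 0, "column": 0}))
    fig.add_trace(go.Indicator(
        mode="gauge+number", value=rpm,
        title={"text": "RPM"},
        gauge={"axis": {"range": [0, 1500]}},
        domain={"row": 0, "column": 1}))
    fig.add_trace(go.Indicator(
        mode="number", value=distance_km,
        title={"text": "Distance (km)"},
        domain={"row": 1, "column": 0}))
    fig.update_layout(grid={"rows": 2, "columns": 2, "pattern": "independent"})
    fig.show()   # opens in the default web browser, as described above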
To communicate data between the motor control unit (MCU) (STM32 discovery
board) and the instrument cluster, the controller area network (CAN) is chosen, as it is the current industry standard. The CAN protocol is robust and frame-based; hence, it allows communication between ECUs without any complex wiring in between. CAN uses a differential signal, which makes it more resistant to noise, so that messages are transmitted with fewer errors.
A controller area network is a serial communication bus designed to allow control units in vehicles to communicate without a host computer. The CAN bus consists of two wires, named CAN high (CAN H) and CAN low (CAN L), and two 120 Ω resistors at the ends for termination. Traffic is eliminated since the messages are transmitted based on priority, and the entire network meets the timing constraints. Each device can decide whether a message is relevant or needs to be filtered. Additional non-transmitting nodes can be added to the network without any modifications to the system. The
messages are of broadcast type, and there is no single master for the bus; hence, it is a multi-master protocol. The speed can be varied from 125 kbps to 1 Mbps.
For communication over CAN, two additional hardware components are required: a CAN controller and a CAN transceiver. At the transmitting end, the CAN controller is used to convert the data into CAN-format messages. These messages are then turned into differential signals by the CAN transceiver. At the receiving end, the CAN transceiver takes the differential signals and converts them back into a CAN message, which the CAN controller then converts back into data. The STM32 board has an inbuilt CAN controller, so only a CAN transceiver is required. However, the Raspberry Pi does not have an inbuilt CAN controller, so one needs to be provided separately along with a transceiver. For communication between the CAN controller and the microcontroller, SPI communication is used.
The STM board and Raspberry Pi were made to communicate via CAN bus.
Raspberry Pi does not have inbuilt CAN controller, so CAN module was used, and
communication between Raspberry Pi and CAN module was established using SPI
communication (Fig. 5).
SPI communication was enabled on the Raspberry Pi with the oscillator frequency of the crystal present on the CAN module, and then CAN communication was brought up at the chosen baud rate. A CAN message was sent from the STM board, programmed
using Simulink (Fig. 6).
Then, the CAN message was received as sent by the STM board in the Raspberry
Pi by setting the baud rate to the value used while programming STM. It was viewed
in the terminal of Raspberry Pi and also in the Python shell. CAN library for Python
was used to get the CAN message and display it.
4 Results
The integration of all three separate parts results in the intended hardware system, the electric vehicle.
As discussed in the previous sections, the trapezoidal speed control was imple-
mented using the STM32 discovery board to control the drive motor as discussed
in the previous chapter using waijung blockset. The Hall-effect sensor data was
received using pins, and triggering pulses were given to inverter switches based on
an algorithm. A potentiometer is being used to control the speed by giving the refer-
ence speed. The inbuilt ADC is used for giving the potentiometer reading to the
controller. Thus speed control is implemented. The parameters to be displayed are
also calculated in the motor control unit as discussed earlier (Fig. 7).
As discussed in the previous sections, Raspberry Pi is used as the instrument
cluster controller unit. Raspberry Pi display is used as the display for the instrument
cluster. Python language is used to create the UI for the cluster. There are many
packages like Tkinter and wxPython for UI creation. Plotly graphs provided a better
design for UI. Plotly graphs have inbuilt gauge meters which are used in the display.
As discussed in the previous sections, communication between control units is
established using CAN bus. CAN bus is established between the discovery board
(MCU) and Raspberry Pi (instrument cluster). Discovery board has inbuilt CAN
controller, so it is connected to the CAN bus using a CAN transceiver module. For
Raspberry Pi, external CAN controller and CAN transceiver modules are used. Thus,
communication is established (Fig. 8).
The calculated values are sent as a message on the CAN bus by the motor control unit. The CAN package in the Python language is used to retrieve the CAN message sent by the motor control unit. The data is separated from the message and then given to the UI for display. The data is split into speed and distance based on the bit positions set on the MCU while sending the data. Then, the data is converted from hexadecimal to decimal and given to the UI as the appropriate variable (Fig. 9).
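A sketch of this receive-and-parse step using the python-can package is given below; the message ID, byte layout, and scaling are hypothetical assumptions, since the actual bit positions depend on how the STM32 firmware packs the frame.

    import can

    # Hypothetical layout: speed (km/h) in bytes 0-1 and distance (0.1 km units)
    # in bytes 2-3 of CAN frame ID 0x100.
    CLUSTER_MSG_ID = 0x100

    def parse_cluster_frame(msg):
        speed_kmph = int.from_bytes(msg.data[0:2], byteorder="big")
        distance_km = int.from_bytes(msg.data[2:4], byteorder="big") / 10.0
        return speed_kmph, distance_km

    if __name__ == "__main__":
        # The "can0" socketcan interface assumes the external CAN controller has
        # already been configured on the Raspberry Pi, as described in Sect. 3.
        bus = can.interface.Bus(channel="can0", bustype="socketcan")
        for msg in bus:   # Bus objects are iterable and yield received messages
            if msg.arbitration_id == CLUSTER_MSG_ID:
                speed, distance = parse_cluster_frame(msg)
                print(f"speed={speed} km/h, distance={distance} km")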
5 Conclusions
An instrument cluster for an electric vehicle was implemented. The implemented motor control is primitive, which brings disadvantages such as torque ripple; a better control algorithm can be implemented. The inverter used for the implementation is very bulky and compromises the core principle of a two-wheeler, which is its compactness; a better single-chip inverter is required to maintain the compactness of the vehicle. The algorithms used for calculating parameters such as speed and distance work; the results appear to be correct, as the RPM, which is the main parameter for calculating speed and distance, is in accordance with the tachometer reading. They can, however, be perfected using better algorithms which give more precision and accuracy. This proposed design proves that high-level programming languages can be used to design user-friendly interfaces instead of sophisticated software tools. The CAN communication established between the ECUs was troublesome because of many issues, such as garbage values, that were faced during its implementation. Many important parameters, such as the SOC of the battery, the distance left until the charge drains, and the economy speed, can be added as future work.
References
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 285
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_22
1 Introduction
In social insects’ colonies, every insect typically performs its tasks autonomously.
However, the tasks performed by individual insects are correlated because the whole
colony is able to solve even the complex problems by cooperation. Without any
kind of central controller (supervisor), these insect colonies can solve numerous
survival-related issues, e.g., selection/pick-up of material, exploring and storing the
food. Although such activities require sophisticated planning, these issues are still resolved by insect colonies devoid of any central controller. Hence, such collective behavior emerging from groups of social insects is termed “swarm intelligence.”
During the last few decades, ever-increasing research on these algorithms suggests
that nature is an extraordinary source of inspiration for developing intelligent systems
and for providing superior solutions for numerous complicated problems.
In this paper, the behavior of real ants is reviewed. Owing to the fact that these ants are capable of exploring the shortest path between their food source and their nest, many ant-based algorithms have been proposed by researchers. Ant colony
optimization (ACO) algorithm is a prominent and successful swarm intelligence
technique. During the past decades, a considerable amount of research has been
conducted to develop the ACO-based algorithms as well as to develop its pragmatic
applications to tackle real-world problems. Inspired by real ants’ behavior and their
indirect communication, the ant colony optimization algorithm was proposed by
Marco Dorigo. Since then, it is gaining tremendous attention to research.
In ACO algorithms, ants (simple agents) collaborate to achieve an amalgamated behavior for the system and thereby develop a “robust” system that can find superior-quality solutions for a variety of problems with large search spaces.
The paper reviews the basis of ACO. In Sect. 1, the behavior of real ant colonies is
depicted. In Sect. 2, the feature selection concepts are presented. In Sect. 3, numerous
existing ACO algorithms are reviewed.
Ants are considered to be “social” insects, and they live in colonies. Ants drop
pheromone on the ground while traveling, which helps ants to explore the shortest
route. Probabilistically, every ant prefers to follow the path that is rich in pheromone
density. However, pheromone decays with time, leading to lower pheromone intensity on
less popular paths. Therefore, the shortest route will have more ants traversing it,
whereas other paths will diminish until all ants pursue the same shortest path, leading
the system to converge to a "single" solution. Over time, the pheromone intensity
decreases automatically. In practice, this pheromone evaporation is required to avoid
rapid convergence of the algorithm towards a sub-optimal region.
Inspired by the behavior of real ants, artificial ants were designed to solve numerous
optimization problems, as they are capable of moving through the problem states and
making decisions at every step.
In ACO, basic rules are defined as:
• Problem is depicted as a graph with “nodes” representing features and “edges”
representing a choice of the subsequent feature.
• η denotes the "heuristic information," i.e., the goodness of a path.
• "Pheromone updating rule," which updates the pheromone level on edges.
• "Probabilistic transition rule," which gives the probability of an ant traversing to
a subsequent node (Table 1).
• ACO algorithms are robust in nature; i.e., they adapt to changing dynamic
applications.
• They have the advantage of distributed computation.
• They give positive feedback, which, in turn, leads to the discovery of good solutions
that may be further used in dynamic applications.
• They allow dynamic re-routing via shortest-path algorithms if any node is broken.
• While analyzing real-dimension networks, ACO algorithms allow network flows to be
calculated faster than traditional static algorithms [33].
The generic ACO algorithm is depicted below [34]; it comprises four major steps,
beginning with initialization. In the last step, the pheromone variables are updated
based on the search experience gathered by the ants.
Begin ACO algorithm
  Initialization;
  while (end criterion is not satisfied) do
    Formulate ant solutions;
    Perform local search;
    Global pheromone update;
  end while
End of ACO
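To make the loop above concrete, the following is a minimal Python sketch of the generic ACO procedure, shown on a tiny symmetric TSP instance: ants build tours with the probabilistic transition rule (pheromone raised to α times the heuristic 1/distance raised to β), and pheromone is then evaporated and deposited. All parameter names and values (ALPHA, RHO, the distance matrix, etc.) are illustrative choices, not taken from the paper or from [34].

```python
import random

# Minimal sketch of the generic ACO loop: initialization, solution construction
# via the probabilistic transition rule, and global pheromone update, applied to
# a tiny symmetric TSP instance. All parameters are illustrative.
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
N = len(DIST)
ALPHA, BETA = 1.0, 2.0      # pheromone vs. heuristic influence
RHO = 0.5                   # evaporation rate
Q = 100.0                   # pheromone deposit constant

def tour_length(tour):
    return sum(DIST[tour[i]][tour[(i + 1) % N]] for i in range(N))

def construct_tour(tau):
    """Build one ant's tour using the probabilistic transition rule."""
    start = random.randrange(N)
    tour, unvisited = [start], set(range(N)) - {start}
    while unvisited:
        i = tour[-1]
        # p(i -> j) proportional to tau_ij^alpha * eta_ij^beta, with eta = 1/dist
        weights = [(j, (tau[i][j] ** ALPHA) * ((1.0 / DIST[i][j]) ** BETA))
                   for j in unvisited]
        total = sum(w for _, w in weights)
        r, acc = random.random() * total, 0.0
        for j, w in weights:
            acc += w
            if acc >= r:
                tour.append(j)
                unvisited.remove(j)
                break
    return tour

def aco(iterations=100, num_ants=8):
    tau = [[1.0] * N for _ in range(N)]           # initialization
    best_tour, best_len = None, float("inf")
    for _ in range(iterations):                   # iterate till end criterion
        tours = [construct_tour(tau) for _ in range(num_ants)]
        # global pheromone update: evaporation followed by deposit by each ant
        for i in range(N):
            for j in range(N):
                tau[i][j] *= (1.0 - RHO)
        for tour in tours:
            length = tour_length(tour)
            if length < best_len:
                best_tour, best_len = tour, length
            for k in range(N):
                a, b = tour[k], tour[(k + 1) % N]
                tau[a][b] += Q / length
                tau[b][a] += Q / length
        # (a problem-specific local search step could be inserted here)
    return best_tour, best_len

if __name__ == "__main__":
    print(aco())
```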
2 Feature Selection
“pheromone density model” and “visibility density model.” The approach is vali-
dated on ten datasets obtained from the UCI repository. The results demonstrate that
the proposed method is capable of keeping a fair balance between classification accuracy
and efficiency, thereby making it apt for feature selection applications.
For enhancing the stability of feature selection, [17] proposed FACO, an improved
ACO-based feature selection algorithm. It uses a two-stage pheromone-updating rule that
prevents the algorithm from falling into a premature local optimum and is validated on
the KDD CUP99 dataset. The outcomes demonstrate that the algorithm has great practical
significance, as it enhances both the classification accuracy and the efficiency of a
classifier.
From the literature review, some gaps can be identified: it can be concluded that
"construction algorithms" are not able to provide superior-quality solutions and may not
remain optimal under "minor" changes. Some challenges in ACO include exploring superior
pheromone models and upholding an apt balance between "exploration" and "exploitation"
of the search space. Moreover, there is a need for adaptive intelligent models capable
of automatically identifying dynamic alterations in a dataset's characteristics and
thereby upgrading the algorithms in a self-directed manner.
After carrying out an extensive literature review of ACO in feature selection, the
accuracies achieved by various ACO variants are summarized in Table 2.
4 Current Trends
Table 2 (continued)
Dataset | Algorithm | Accuracy achieved (%)
Hepatitis | ACO-PSO [11] | 75.34
cancer | ABC-DE [11] | 71.26
 | AC-ABC [11] | 79.29
 | ACO-PSO [11] | 87.06
 | ABC-DE [11] | 96.01
 | AC-ABC [11] | 99.43
• Dynamic problems are those where the topology and costs change even while the
solutions are being built, for example, the routing problem in telecommunication
networks, where traffic patterns keep changing everywhere.
The ACO-based algorithms for addressing these kinds of problems are similar in general,
but they vary considerably in implementation details. Of late, ACO algorithms have
attracted a lot of interest among researchers. Nowadays, there are various successful
implementations of ACO algorithms applied to a broad range of combinatorial optimization
problems. Such applications fall under two broad areas:
• NP-hard problems:
• For these problems, the best-known algorithms have been found to have exponential
worst-case time complexity. Most ant-based algorithms are therefore equipped with
additional abilities, like problem-specific local optimizers that take the ant solutions
to local optima.
• Shortest-path problems:
• In these problems, the properties of the problem's graph representation may vary over
time, synchronously with the optimization process, which therefore needs to adapt to the
problem dynamics. In such a scenario, the graph may be available, but its properties
(costs of components, connections) may vary over time.
• In such cases, it can be concluded that the use of ACO algorithms is recommended as
the variation rate of the costs increases but knowledge of the variation process
decreases.
From the literature studied, it can be inferred that the identification of pertinent
and valuable features for training the classifier impacts the performance of the
classifier model. ACO has been, and continues to be, a productive paradigm for
constructing powerful solutions to combinatorial optimization problems. In this paper,
the origin and biological background of the ACO algorithm are presented, along with
several of its application areas. Finally, a survey of ACO used in the domain of feature
selection is presented. The ACO algorithm has become one of the most popular
meta-heuristic approaches for resolving various combinatorial problems. The previous ACO
versions were not good enough to compete with other well-known algorithms, but the
outcomes were promising enough to open new avenues for exploring this area. Since then,
many researchers have explored the basic ACO algorithm and updated it to obtain
promising results. This paper focuses on outlining the latest ACO developments in terms
of algorithms as well as applications. Applications like multi-objective optimization
and feature selection are the main targets of recent ACO developments. To enhance the
performance of ACO algorithms, they are further combined with existing meta-heuristic
methods and integer-programming techniques. Hybridization of ACO algorithms has shown a
clear improvement in results for different problems. Parallel implementations of ACO
algorithms have also appeared in the latest trends: thanks to multi-core CPU
architectures and GPUs, enhanced parallel versions of ACO algorithms are possible.
References
1. Maier HR, Simpson AR, Zecchin AC, Foong WK, Phang KY, Seah HY, and Tan CL (2003)
Ant colony optimization for design of water distribution systems. J Water Resour Plan Manage
129(3):200–209
2. López-IbáñezM, Prasad TD, Paechter B (2008) Ant colony optimization for optimal control of
pumps in water distribution networks. J Water Resour Plann Manage 134(4):337–346
3. Zheng F, Zecchin AC, Newman JP, Maier HR, Dandy GC (2017) An adaptive convergence-
trajectory controlled ant colony optimization algorithm with application to water distribution
system design problems. IEEE Trans Evol Comput 21(5):773–791
4. Sidiropoulos E, Fotakis D (2016) Spatial water resource allocation using a multi-objective ant
colony optimization. Eur Water 55:41–51
5. Shahraki J, Sardar SA, Nouri S (2019) Application of the metaheuristic ant colony
optimization algorithm in optimal allocation of water resources of Chah-Nime of Sistan
under managerial scenarios. IJE 5(4):1
6. Do Duc D, Dinh PT, Anh VTN, Linh-Trung N (2018) An efficient ant colony optimization
algorithm for protein structure prediction. In: 2018 12th international symposium on medical
information and communication technology (ISMICT), pp 1–6. IEEE
7. Liang Z, Guo R, Sun J, Ming Z, Zhu Z (2017) Orderly roulette selection based ant colony
algorithm for hierarchical multilabel protein function prediction. Math Prob Eng
8. Özmen M, Aydoğan EK, Delice Y, Duran Toksarı M (2020) Churn prediction in Turkey’s
telecommunications sector: a proposed multiobjective–cost-sensitive ant colony optimization.
Wiley Interdisc Rev Data Min Knowl Disc 10(1):e1338
9. Di Caro G, Dorigo M (2004) Ant colony optimization and its application to adaptive routing
in telecommunication networks. PhD diss., PhD thesis, Faculté des Sciences Appliquées,
Université Libre de Bruxelles, Brussels, Belgium
10. Khan I, Huang JZ, Tung NT (2013) Learning time-based rules for prediction of alarms from
telecom alarm data using ant colony optimization. Int J Comput Inf Technol 13(1):139–147
11. Shunmugapriya P, Kanmani S (2017) A hybrid algorithm using ant and bee colony optimization
for feature selection and classification (AC-ABC Hybrid). Swarm Evol Comput 36:27–36
12. Sweetlin JD, Nehemiah HK, Kannan A (2018) Computer aided diagnosis of pulmonary hamar-
toma from CT scan images using ant colony optimization based feature selection. Alexandria
Eng J 57(3):1557–1567
13. Mehmod T, Md Rais HB (2016) Ant colony optimization and feature selection for intrusion
detection. In: Advances in machine learning and signal processing, pp 305–312. Springer,
Cham
14. Wan Y, Wang M, Ye Z, Lai X (2016) A feature selection method based on modified binary
coded ant colony optimization algorithm. Appl Soft Comput 49:248–258
15. Ghosh M, Guha R, Sarkar R, Abraham A (2019) A wrapper-filter feature selection technique
based on ant colony optimization. Neural Comput Appl:1–19
16. Dadaneh BZ, Markid HY, Zakerolhosseini A (2016) Unsupervised probabilistic feature
selection using ant colony optimization. Expert Syst Appl 53:27–42
17. Peng H, Ying C, Tan S, Bing Hu, Sun Z (2018) An improved feature selection algorithm based
on ant colony optimization. IEEE Access 6:69203–69209
18. Nandini N, Ahuja S, Jain S (2020) Meta-heuristic Swarm Intelligence based algorithm for
feature selection and prediction of Arrhythmia. Int J Adv Sci Technol 29(2):61–71
19. Rashno A, Nazari B, Sadri S, Saraee M (2017) Effective pixel classification of mars
images based on ant colony optimization feature selection and extreme learning machine.
Neurocomputing 226:66–79
20. Saraswathi K, Tamilarasi A (2016) Ant colony optimization based feature selection for opinion
mining classification. J Med Imaging Health Inf 6(7):1594–1599
21. Ding Q, Xiangpei Hu, Sun L, Wang Y (2012) An improved ant colony optimization and its
application to vehicle routing problem with time windows. Neurocomputing 98:101–107
22. Yu B, Yang Z-Z, Yao B (2009) An improved ant colony optimization for vehicle routing
problem. Eur J Oper Res 196(1):171–176
23. Wu L, He Z, Chen Y, Dan Wu, Cui J (2019) Brainstorming-based ant colony optimization for
vehicle routing with soft time windows. IEEE Access 7:19643–19652
24. Huang G, Cai Y, Cai H (2018) Multi-agent ant colony optimization for vehicle routing problem
with soft time windows and road condition. In: MATEC web of conferences, vol 173, p 02020.
EDP Sciences
25. Xu H, Pu P, Duan F (2018) Dynamic vehicle routing problems with enhanced ant colony
optimization. Discrete Dyn Nat Soci 2018
26. Huang Y-H, Blazquez CA, Huang S-H, Paredes-Belmar G, Latorre-Nuñez G (2019) Solving the
feeder vehicle routing problem using ant colony optimization. Comput Ind Eng 127:520–535
27. Zhang H, Zhang Q, Ma L, Zhang Z, Liu Y (2019) A hybrid ant colony optimization algorithm
for a multi-objective vehicle routing problem with flexible time windows. Inf Sci 490:166–190
28. Brand M, Masuda M, Wehner N, Yu X-H (2010) Ant colony optimization algorithm for robot
path planning. In: 2010 international conference on computer design and applications, vol 3,
pp V3–436. IEEE
29. Chia S-H, Su K-L, Guo J-R, Chung C-Y (2010) Ant colony system based mobile robot path
planning. In: 2010 fourth international conference on genetic and evolutionary computing, pp
210–213. IEEE
30. Cong YZ, Ponnambalam SG (2009) Mobile robot path planning using ant colony optimization.
In: 2009 IEEE/ASME international conference on advanced intelligent mechatronics, pp 851–
856. IEEE
31. Liu J, Yang J, Liu H, Tian X, Gao M (2017) An improved ant colony algorithm for robot path
planning. Soft Comput 21(19):5829–5839
32. Deng G-F, Zhang X-P, Liu Y-P (2009) Ant colony optimization and particle swarm optimization
for robot-path planning in obstacle environment. Control Theory Appl 26(8):879–883
33. Deepa O, Senthilkumar A (2016) Swarm intelligence from natural to artificial systems: ant
colony optimization. Networks (Graph-Hoc) 8(1):9–17
34. Akhtar A (2019) Evolution of ant colony optimization algorithm—a brief literature review. In:
arXiv: 1908.08007
35. Nayar N, Ahuja S, Jain S (2019) Swarm intelligence for feature selection: a review of literature
and reflection on future challenges. In: Advances in data and information sciences, pp 211–221.
Springer, Singapore
36. Manoharan S (2019) Study on Hermitian graph wavelets in feature detection. J Soft Comput
Paradigm (JSCP) 1(01):24–32
37. Aghdam MH, Kabiri P (2016) Feature selection for intrusion detection system using ant colony
optimization. IJ Netw Secur 18.3:420–432
38. Aghdam MH, Ghasem-Aghaee N, Basiri ME (2009) Text feature selection using ant colony
optimization. Expert Syst Appl 36(3):6843–6853
39. Shakya S, Pulchowk LN, A novel bi-velocity particle swarm optimization scheme for multicast
routing problem
40. Ahmad SR, Yusop NMM, Bakar AA, Yaakub MR (2017) Statistical analysis for vali-
dating ACO-KNN algorithm as feature selection in sentiment analysis. In: AIP conference
proceedings, vol 1891(1), p 020018. AIP Publishing LLC
41. Sweetlin JD, Nehemiah HK, Kannan A (2017) Feature selection using ant colony optimization
with tandem-run recruitment to diagnose bronchitis from CT scan images. Comput Methods
Programs in Biomed 145:115–125
42. Sinoquet C, Niel C (2018) Ant colony optimization for markov blanket-based feature selec-
tion. Application for precision medicine. In: International conference on machine learning,
optimization, and data science, pp 217–230. Springer, Cham
43. Liang H, Wang Z, Liu Yi (2019) A new hybrid ant colony optimization based on brain storm
optimization for feature selection. IEICE Trans Inf Syst 102(7):1396–1399
44. Sowmiya C, Sumitra P (2020) A hybrid approach for mortality prediction for heart patients
using ACO-HKNN. J Ambient Intell Humanized Comput
45. Mangat V (2010) Swarm intelligence based technique for rule mining in the medical domain.
Int J Comput Appl 4(1):19–24
46. Naseer A, Shahzad W, Ellahi A (2018) A hybrid approach for feature subset selection using
ant colony optimization and multi-classifier ensemble. Int J Adv Comput Sci Appl IJACSA
9(1):306–313
47. Kashef S, Nezamabadi-pour H (2013) A new feature selection algorithm based on binary ant
colony optimization. In: The 5th conference on information and knowledge technology, pp
50–54. IEEE
48. Jameel S, Ur Rehman S (2018) An optimal feature selection method using a modified wrapper-
based ant colony optimisation. J Natl Sci Found Sri Lanka 46(2)
49. Selvarajan D, Jabar ASA, Ahmed I (2019) Comparative analysis of PSO and ACO based feature
selection techniques for medical data preservation. Int Arab J Inf Technol 16(4):731–736
50. Khorram T, Baykan NA (2018) Feature selection in network intrusion detection using
metaheuristic algorithms. Int J Adv Res Ideas Innovations Technol 4(4)
51. Manoj RJ, Praveena MDA, Vijayakumar K (2019) An ACO–ANN based feature selection
algorithm for big data. Cluster Comput 22(2):3953–3960
52. Jayaprakash A, KeziSelvaVijila C (2019) Feature selection using ant colony optimization
(ACO) and road sign detection and recognition (RSDR) system. Cogn Syst Res 58:123–133
53. Nayyar A, Le DN, Nguyen NG (eds) (2018) Advances in swarm intelligence for optimizing
problems in computer science. CRC Press (Oct 3)
54. Dorigo M, Stützle T (2019) Ant colony optimization: overview and recent advances. In:
Handbook of metaheuristics, pp 311–351. Springer, Cham
Hand Gesture Recognition Under
Multi-view Cameras Using Local Image
Descriptors
Abstract Hand gesture recognition has had various applications in recent years, such as
robotics, e-commerce, human–machine interaction, e-sports, and assisting
hearing-impaired people. The latter is the most useful and interesting application in
daily life. Nowadays, cameras can be installed easily and everywhere, so gesture
recognition faces a challenging issue when images are acquired by multiple cameras.
This paper introduces an approach for hand gesture recognition under multi-view
cameras. The proposed approach is evaluated on the HGM-4 benchmark dataset by using
local binary patterns.
1 Introduction
The hand gesture is a typical and basic tool of humans for conversation. It is very
difficult to train someone to learn and understand all gesture-based sign language in a
short time, so many intelligent systems have been proposed to automatically recognize
and understand those gestures. Hand gesture recognition has received a lot of attention
from vision scientists in the last decade. It is a core process of smart homes,
contactless devices, and multimedia systems [1, 2]. Various methods based on image
analysis have been proposed in the literature. Dinh et al. [3] proposed a method for
analyzing hand gesture sequence images using the hidden Markov model and evaluated it
on one- and two-hand gesture databases. Tavakoli et al. [14] recognize hand gestures
based on EMG wearable devices and SVM classifiers. Chaudhary et al. [2] introduced a
method based on light invariance for hand gesture recognition. They applied a technique
for extracting features by orientation histogram on the region of
interest. Chansri et al. [1] presented a method based on the HOG descriptor and a neural
network for Thai sign language recognition.
Since cameras are installed at any outdoor or indoor position, modern hand gesture
recognition faces a challenging issue due to the multiple views at different angles of
acquisition. Figure 1 illustrates an example of one hand gesture under four cameras at
different positions: distinct and even misleading views can be obtained from a single
hand gesture. The problem of hand gestures under multiple views has been investigated in
[11, 12]. The authors introduced a method to fuse features extracted from different
cameras with two hand gestures. There exist a few public hand gesture datasets in the
literature [10, 13], but all images are usually captured by one camera. Recently, Hoang
[4] surveyed different hand gesture databases with multiple views and released a new
dataset (HGM-4) captured by four different cameras for Vietnamese sign language
recognition. This paper presents a preliminary result on the HGM-4 dataset based on a
local image descriptor. The local binary pattern (LBP) [8, 9] is considered to represent
a hand gesture image, since it is an efficient and fast approach for characterizing
texture images [7]. The remainder of this paper is organized as follows. Section 2
introduces the LBP descriptor, the proposed approach, and the experimental results on
the HGM-4 dataset. Finally, the conclusion is given in Sect. 3.
2 Proposed Approach
Local binary patterns (LBP) are obtained by computing the local neighborhood structure
that represents the texture around each pixel of the image, using a square neighborhood
of 3 × 3 pixels. The $\mathrm{LBP}_{P,R}(x_c, y_c)$ code of each pixel $(x_c, y_c)$ is
calculated by comparing the gray value $g_c$ of the central pixel with the gray values
$\{g_i\}_{i=0}^{P-1}$ of its $P$ neighbors, by this formula (where $s(x) = 1$ if
$x \ge 0$ and $0$ otherwise):

$$\mathrm{LBP}_{P,R}(x_c, y_c) = \sum_{i=0}^{P-1} s(g_i - g_c) \times 2^{i} \tag{1}$$
Several LBP patterns occur more frequently in texture images than others. The authors in
[15] proposed the "uniform" LBP pattern $\mathrm{LBP}_{P,R}^{u2}$, which is a subset of
the original LBP [8]. For this, they consider a uniformity measure of a pattern which
counts the number of bitwise transitions from 0 to 1 or vice versa when the bit pattern
is traversed circularly. An LBP is called uniform if it has at most two such
transitions. For example, the patterns 11111111 (0 transitions), 00011110, and 11100111
are uniform, while the pattern 00110010 is not. Features obtained from image patches are
better and more representative than features extracted from the global image [7, 15].
To extract features from multiple blocks, each original image is divided into blocks.
The features extracted from these blocks of each color component are then fused to
create a final feature vector, e.g., a vector with 59 × 3 = 177 features for an original
image without division. An illustration of the proposed approach is presented in Fig. 2.
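As an illustration of the feature extraction just described, the following is a minimal Python sketch (using NumPy) of the LBP codes with P = 8 and R = 1, the 59-bin uniform-pattern histogram, and concatenation over image blocks. It operates on a single grayscale channel, whereas the paper fuses the histograms of three color components; helper names such as `block_features` are ours, not from the HGM-4 code.

```python
import numpy as np

# Sketch of block-wise uniform-LBP feature extraction: LBP_{8,1} codes, the
# 59-bin uniform histogram, and concatenation over a blocks x blocks division.
def lbp_codes(gray):
    """LBP_{8,1} code for each interior pixel of a 2-D grayscale array."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]
    # the 8 neighbours of the 3x3 window, in a fixed circular order
    neigh = [g[0:-2, 0:-2], g[0:-2, 1:-1], g[0:-2, 2:],
             g[1:-1, 2:],   g[2:, 2:],     g[2:, 1:-1],
             g[2:, 0:-2],   g[1:-1, 0:-2]]
    codes = np.zeros_like(c)
    for i, n in enumerate(neigh):
        codes += ((n - c) >= 0).astype(np.int32) << i   # s(g_i - g_c) * 2^i
    return codes

def _transitions(code):
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))

# 58 uniform patterns (at most 2 circular transitions) + 1 bin for the rest
_UNIFORM = [c for c in range(256) if _transitions(c) <= 2]
_BIN = {c: i for i, c in enumerate(_UNIFORM)}

def lbp_u2_histogram(gray):
    hist = np.zeros(59)
    for code in lbp_codes(gray).ravel():
        hist[_BIN.get(int(code), 58)] += 1
    return hist / max(hist.sum(), 1)

def block_features(gray, blocks=3):
    """Concatenate 59-bin histograms over a blocks x blocks division."""
    h, w = gray.shape
    feats = []
    for bi in range(blocks):
        for bj in range(blocks):
            patch = gray[bi * h // blocks:(bi + 1) * h // blocks,
                         bj * w // blocks:(bj + 1) * w // blocks]
            feats.append(lbp_u2_histogram(patch))
    return np.concatenate(feats)

if __name__ == "__main__":
    img = np.random.randint(0, 256, (96, 96))
    print(block_features(img, blocks=3).shape)   # (3 * 3 * 59,) = (531,)
```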
The HGM-4 [4] is a benchmark dataset for hand gestures under multiple cameras. Table 1
presents the characteristics of this database. Four cameras are installed at different
positions to capture hand gesture images, and the dataset contains 26 distinct gestures
performed by five different persons. Figure 3 illustrates different images of the same
gesture under one camera (the left camera). Since all images are segmented to have a
uniform background, the problem would be even more challenging with a complex background.
2.3 Results
Table 2 Classification performance by 1-NN classifier and LBP uniform on the HGM-4 dataset
Decomposition (Train:Test)
Number of blocks 50:50 70:30 80:20 90:10
1×1 57.35 61.90 63.38 62.29
2×2 72.87 77.84 77.91 79.58
3×3 75.81 79.51 81.30 81.97
4×4 77.39 80.90 83.28 83.54
5×5 77.83 82.46 83.33 85.16
6×6 78.35 83.22 84.01 85.79
7×7 78.43 83.02 85.73 86.58
by using this number of blocks. This can confirm the extraction approach based on
block division for extracting LBP uniform features as in [15].
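The evaluation protocol behind Table 2 can be sketched as below: block-wise feature vectors are split into train and test sets at the four ratios and classified with a 1-NN classifier (scikit-learn). The random features and labels merely stand in for the real HGM-4 descriptors, so the printed accuracies are placeholders, not comparable to the table.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative sketch of the Table 2 protocol: 1-NN classification of
# block-wise LBP-uniform feature vectors under four train:test splits.
rng = np.random.default_rng(0)
X = rng.random((520, 531))           # e.g. 3 x 3 blocks x 59 bins per channel
y = np.arange(520) % 26              # 26 gesture classes, as in HGM-4

for test_size in (0.5, 0.3, 0.2, 0.1):          # 50:50, 70:30, 80:20, 90:10
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=0, stratify=y)
    clf = KNeighborsClassifier(n_neighbors=1).fit(X_tr, y_tr)
    acc = clf.score(X_te, y_te) * 100
    print(f"{round((1 - test_size) * 100)}:{round(test_size * 100)} -> {acc:.2f}%")
```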
3 Conclusion
This paper presented an approach for hand gesture recognition under multi-view cameras.
The uniform LBP descriptor is used to perform feature extraction from color images on
the HGM-4 benchmark dataset. As future work, this study will first be extended to
enhance the recognition rate by fusing several local image descriptors and deep
features. Second, a fusion scheme should be proposed to capture all information from the
different cameras.
References
1. Chansri C, Srinonchat J (2016) Hand gesture recognition for Thai sign language in complex
background using fusion of depth and color video. Procedia Comput Sci 86:257–260
2. Chaudhary A (2018) Light invariant hand gesture recognition. In: Robust hand gesture
recognition for robotic hand control, pp 39–61. Springer
3. Dinh DL, Kim JT, Kim TS (2014) Hand gesture recognition and interface via a depth imaging
sensor for smart home appliances. Energy Procedia 62:576–582
4. Hoang VT (2020) HGM-4: a new multi-cameras dataset for hand gesture recognition. Data
Brief 30:105676
5. Just A, Marcel S (2009) A comparative study of two state-of-the-art sequence processing
techniques for hand gesture recognition. Comput Vis Image Underst 113(4):532–543
6. Lee AR, Cho Y, Jin S, Kim N (2020) Enhancement of surgical hand gesture recognition using a
capsule network for a contactless interface in the operating room. Comput Methods Programs
Biomed 190;105385 (Jul 2020)
7. Nhat HTM, Hoang VT (2019) Feature fusion by using LBP, HOG, GIST descriptors and
Canonical Correlation Analysis for face recognition. In: 2019 26th international conference on
telecommunications (ICT), pp 371–375 (Apr 2019)
Abstract Digital filters are most commonly used in signal processing and communication
systems. Fault-tolerant filters are required when the system is unreliable, and many
methodologies have been proposed to protect digital filters from errors. In this paper,
a fault-tolerant finite impulse response (FIR) filter has been designed using
error-correcting codes and Hamming codes, efficiently coded in the hardware description
language Verilog. A custom IP for the fault-tolerant digital filter has been designed
with reduced power dissipation and high speed. This work concentrates on creating and
packaging custom IP. The proposed custom IP fault-tolerant digital filter is synthesized
in Xilinx Vivado 2018.3 for the Xilinx Zynq-7000 SoC ZC702 evaluation board.
1 Introduction
Digital filters are essential devices in digital signal processing systems and have
recently been used in several applications such as video processing, wireless
communications, image processing, and many imaging devices. The use of digital circuits
is increasing exponentially in space, automotive, and medical applications, where
reliability is critical. In these types of designs, a designer has to adopt some degree
of fault tolerance. This requirement further increases with modern CMOS technologies,
which are prone to soft errors and manufacturing variations [1].
The generally used hardware redundancy techniques are double modular redundancy (DMR)
and triple modular redundancy (TMR) [2]. These methods are suitable to
identify or detect errors but consume more area to implement: as their names indicate,
they need two or three similar structures in parallel to detect faults.
In [3], the author proposed an FIR filter using reduced-precision replicas, designed to
minimize the cost of implementing modular redundancy. Some researchers used different
implementation methodologies that use only one redundant module to rectify errors [4].
A newer method to protect parallel filters, commonly used in modern signal processing,
is based on applying ECC to the outputs of the parallel filters to identify and correct
errors. A discrete-time filter [5] is implemented by Eq. (1), where Y[n] represents the
output, x[n] is the input signal, and h[i] represents the impulse response.
$$Y[n] = \sum_{i=0}^{\infty} x[n-i] \cdot h[i] \tag{1}$$
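A minimal software sketch of Eq. (1) is given below, assuming a finite-length (FIR) impulse response and zero input for negative time indices; the moving-average coefficients used in the example are purely illustrative.

```python
# Sketch of the discrete-time FIR filter of Eq. (1): each output sample is the
# sum of past inputs weighted by the impulse response h[i].
def fir_filter(x, h):
    """y[n] = sum_i h[i] * x[n - i], with x[k] = 0 for k < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, hi in enumerate(h):
            if n - i >= 0:
                acc += hi * x[n - i]
        y.append(acc)
    return y

if __name__ == "__main__":
    h = [0.25, 0.25, 0.25, 0.25]          # 4-tap moving-average example
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    print(fir_filter(x, h))               # smoothed version of x
```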
In a system on a chip, custom IP blocks are used to improve productivity. Custom IP
consists of pre-designed blocks that are easy to reuse in bigger designs. Custom IP is
divided into two types: hard IP and soft IP. Usually, hard IP comes with a pre-designed
layout, whereas soft IP comes as a synthesizable module [6].
2 Proposed Work
Table 1 Properties of IP
IP properties
fault_tolerant_v1_0
Version: 1.0 (Rev. 2)
Description: fault_tolerant_v1_0
Status: Production
License: Included
Change log: View change log
Vendor: Xilinx, Inc
VLNV: xilinx.com:user.fault_tolerant:1.0
Fig. 5 Syndrome
In this paper, a fault-tolerant digital FIR filter has been designed with reduced power
consumption and improved area efficiency by using ECC and avoiding the TMR and DMR
methodologies; a custom IP for the fault-tolerant FIR filter has also been designed
using Xilinx Vivado 2018.3 and implemented on the Xilinx Zynq-7000 SoC ZC702 evaluation
board. The proposed custom IP produces the same outcomes as the existing module. The
total on-chip power consumption is 1.803 W, including dynamic and static power
consumption. The area has been analyzed based on resource utilization.
In this design, error-correcting codes and Hamming codes have been used to design the
fault-tolerant FIR filter, and a custom IP has also been designed. This designed IP is
reusable and cost-efficient. The check bits are produced by an XOR tree corresponding to
the G matrix, and the syndrome is generated by an XOR network corresponding to the H
matrix. No error is detected if the syndrome is the zero vector.
$$G = \begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 1
\end{bmatrix} \tag{2}$$

$$H = \begin{bmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 1
\end{bmatrix} \tag{3}$$
The encoding is obtained by Eq. (4), and an error is detected by computing the syndrome
in Eq. (5):

$$\text{out} = x \cdot G \tag{4}$$

$$s = \text{out} \cdot H^{T} \tag{5}$$
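The following Python sketch models Eqs. (2)–(5) in software: data bits are encoded with the G matrix, the syndrome is computed with the H matrix, and a single-bit error is corrected by matching the syndrome against the columns of H. It only mirrors the behavior of the XOR networks described above as an assumption of how the decoder works; it is not the Verilog custom IP itself.

```python
import numpy as np

# Software model of Eqs. (2)-(5): Hamming(7,4) encoding with G, syndrome
# computation with H, and single-error correction via the columns of H.
G = np.array([[1, 0, 0, 0, 1, 1, 1],
              [0, 1, 0, 0, 1, 1, 0],
              [0, 0, 1, 0, 1, 0, 1],
              [0, 0, 0, 1, 0, 1, 1]])

H = np.array([[1, 1, 1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0, 1, 0],
              [1, 0, 1, 1, 0, 0, 1]])

def encode(x):
    """out = x . G (mod 2), Eq. (4)."""
    return np.mod(np.array(x) @ G, 2)

def syndrome(out):
    """s = out . H^T (mod 2), Eq. (5); the zero vector means no error."""
    return np.mod(np.array(out) @ H.T, 2)

def correct(out):
    s = syndrome(out)
    if not s.any():
        return out                     # syndrome is zero: no error detected
    for j in range(H.shape[1]):        # flip the bit whose H column matches s
        if np.array_equal(H[:, j], s):
            out = out.copy()
            out[j] ^= 1
            return out
    return out                         # not correctable with a single flip

if __name__ == "__main__":
    data = [1, 0, 1, 1]
    word = encode(data)
    word[2] ^= 1                       # inject a single-bit error
    print(syndrome(word), correct(word))
```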
The syndrome structure internally consists of XOR gates, multiplexers, and latches. It
scans for errors, and if no error is found, the signals memen1 and write are set to
logic '1' (Figs. 6 and 7).
Fig. 6 Package IP
The power analysis is shown in Fig. 8. A total on-chip power of 1.803 W is achieved,
which is the combination of dynamic and static power consumption. Dynamic power accounts
for 92% of the total and is consumed only while the device is operating, while static
power consumption has been reduced to the remaining 8%. The power summary reports the
following values:
Total on-chip power: 1.803 W
Junction temperature: 45.8 °C
Thermal margin: 39.2 °C (3.3 W)
Effective JA: 11.5 °C/W
Power supplied to off-chip devices: 0 W
4 Implementation Results
The implementation has been done on the Xilinx Zynq-7000 ZC702 evaluation board, with
the Zynq-7000 device, package clg484, and speed grade −1. Table 3 shows that all the
routed nets are working properly and there are no unrouted nets (Figs. 9, 10 and 11).
Table 3 Implementation summary
Route status
Conflict nets: 0
Unrouted nets: 0
Partially routed nets: 0
Fully routed nets: 59
5 Conclusion
In this paper, a fault-tolerant digital FIR filter has been designed with reduced power
consumption and improved area efficiency by using ECC and avoiding the TMR and DMR
methodologies; a custom IP for the fault-tolerant FIR filter has also been designed
using Xilinx Vivado 2018.3 and implemented on the Xilinx Zynq-7000 SoC ZC702 evaluation
board. The proposed custom IP produces the same outcomes as the existing module. The
total on-chip power consumption is 1.803 W, including dynamic and static power
consumption. The area has been analyzed based on resource utilization, and our
intellectual property (IP) blocks have been designed accordingly. This type of
fault-tolerant filter is used in space, automotive, and medical applications where
reliability is critical.
Fig. 10 IO planning
References
1. Gao Z et al (2014) Fault tolerant parallel filters based on error correction codes. IEEE Trans
Very Large Scale Integr (VLSI) Syst
2. Somashekhar, Vikas Maheshwari, Singh RP (2019) A study of fault tolerance in high speed
VLSI circuits. Int J Sci Technol Res 8(08) (Aug)
3. Shim D, Shanbhag N (2006) Energy-efficient soft error-tolerant digital signal processing. IEEE
Trans Very Large Scale Integr (VLSI) Syst 14(4):336–348 (Apr)
4. Reviriego P, Bleakley CJ, Maestro JA (2011) Structural DMR: a technique for implementation
of soft-error-tolerant FIR filters. IEEE Trans Circuits Syst Exp Briefs 58(8):512–516 (Aug)
5. Oppenheim AV, Schafer RW (1999) Discrete time signal processing. Prentice-Hall, Upper
Saddle River, NJ, USA
6. Software manual Vivado Design Suite Creating and Packaging Custom UG973 (v2018.3)
December 14, 2018, [online] Available: www.xilinx.com
7. Vaisakhi VS et al (2017) Fault tolerance in a hardware efficient parallel FIR filter. In: Proceeding
of 2018 IEEE ınternational conference on current trends toward converging technologies. 978–
1–5386–3702–9/18/$31.00 © 2017 IEEE
8. Nicolaidis M (2005) Design for soft error mitigation. IEEE Trans Device Mater Rel 5(3):405–
418 (Sept)
9. Kanekawa N, Ibe EH, Suga T, Uematsu Y (2010) Dependability in electronic systems: mitigation of hardware failures, soft errors, and electro-magnetic disturbances. Springer, New York,
NY, USA
10. Lin S, Costello DJ (2004) Error control coding, 2nd edn. Prentice-Hall, Englewood Cliffs, NJ,
USA
11. Cheng C, Parhi KK (2004) Hardware efficient fast parallel FIR filter structures based on iterated
short convolution. IEEE Trans Circuits Syst I: Regul Pap 51(8) (Aug)
12. Somashekhar, Vikas Maheshwari, Singh RP (2019) Analysis of micro inversion to improve fault
tolerance in high speed VLSI circuits. Int Res J Eng Technol (IRJET) 6.03 (2019):5041–5044
13. Gao Z, Yang W, Chen X, Zhao M, Wang J (2012) Fault missing rate analysis of the arithmetic
residue codes based fault-tolerant FIR filter design. In: Proc. IEEE IOLTS, June 2012, pp
130–133
14. Somashekhar, Vikas Maheshwari, Singh RP (2020) FPGA implementation of fault tolerant adder using Verilog for high speed VLSI architectures. Int J Eng Adv Technol (IJEAT) 9(4).
ISSN: 2249–8958 (Apr)
15. Hitana T, Deb AK (2004) Bridging concurrent and non-concurrent error detection in FIR
filters. In: Proc. Norchip Conf., Nov 2004, pp 75–78. https://doi.org/10.1109/NORCHP.2004.1423826
16. Pontarelli S, Cardarilli GC, Re M, Salsano (2008) Totally fault tolerant RNS based FIR filters. In: Proc. 14th IEEE Int. On-Line Test Symp. (IOLTS), July 2008, pp 192–194
17. Kumar NM (2019) Energy and power efficient system on chip with nanosheet FET. J Electron
1(01):52–59
A Novel Focused Crawler
with Anti-spamming Approach & Fast
Query Retrieval
Abstract Web pages are growing to terabytes or even petabytes of data day by day. For a
small Web, it is an easy task to answer a query, whereas robust techniques for storage,
searching, and anti-spamming are needed for large volumes of data. This study presents a
novel approach for the detection of malicious URLs and fast query retrieval. The
proposed focused crawler checks each URL before it enters the search engine database: it
discards malicious URLs but allows benign URLs to enter the database. The detection of
malicious URLs is done via a proposed URL feature set, created by selecting those URL
attributes that are susceptible to spammers; thus, a non-malicious database is created.
The searching process is performed through this search engine database by triggering a
query. The search time taken by the proposed crawler is lower than that of the base
crawler, because the proposed focused crawler uses a trie data structure for storing
fetched results in the Web repository instead of the HashSet data structure used by the
base crawler. Based on the computed average search time (for ten queries), the proposed
focused crawler is observed to be 12% faster than the base crawler. To check the
performance of the proposed focused crawler, the quality parameters precision and recall
are computed and found to be 92.3% and 94.73%, respectively. Detection accuracy is found
to be 90%, with an error rate of 10%.
1 Introduction
Technically, information retrieval (IR) [1] is the discipline of searching for specific
data in a document, across all documents, as well as in metadata (text, image, or sound
data, or
databases). A search engine is an information retrieval system that lists the relevant
documents for specified keywords by using a spider robot [2].
The process of crawling is initiated with a list of URLs, called seed URLs. A queue of
pages to be downloaded (called the frontier) is maintained by the Web crawler. The seed
set initializes the frontier (done manually). A URL from this seed collection is
selected and submitted to the downloader to download the corresponding Web page. The
indexer module utilizes the fetched pages. This is a continuous process in which URLs
extracted from downloaded pages are fed to the URL frontier for further downloading
until the frontier becomes empty. Figure 1 illustrates how a Web crawler functions.
The main components of the crawler are the URL frontier, DNS resolution, the fetch
module, the parsing module, and the URL duplicate eliminator. The URL frontier is the
collection of URLs that are to be fetched next in the crawl. The DNS resolution module
is used to determine the IP address of the Web server specified by the URL in the URL
frontier. The fetch module uses the hypertext transfer protocol (HTTP) to retrieve the
Web page. The parsing module takes the Web page as input, extracting from it the text
and the collection of hyperlinks. The URL duplicate eliminator checks the availability
of the links in the frontier and discards a link if it has already been fetched. The
robots template is used to determine whether or not retrieval of the Web page is
allowed.
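A minimal Python sketch of this crawl loop is shown below, using only the standard library: a frontier queue seeded manually, a fetch step, a link-extracting parser, and a duplicate eliminator. Politeness delays, robots handling, DNS caching, and indexing are omitted, and the seed URL and limits are illustrative.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# Minimal crawl loop: URL frontier, fetch module, parsing module (hyperlink
# extraction), and URL duplicate eliminator.
class LinkExtractor(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url, self.links = base_url, []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

def crawl(seed_urls, max_pages=10):
    frontier = deque(seed_urls)          # URL frontier, initialized by seeds
    seen = set(seed_urls)                # URL duplicate eliminator
    fetched = {}
    while frontier and len(fetched) < max_pages:
        url = frontier.popleft()
        try:                             # fetch module (HTTP download)
            html = urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except Exception:
            continue
        fetched[url] = html              # would be handed to the indexer
        parser = LinkExtractor(url)      # parsing module
        parser.feed(html)
        for link in parser.links:
            if link not in seen:         # discard already-seen links
                seen.add(link)
                frontier.append(link)
    return fetched

if __name__ == "__main__":
    pages = crawl(["http://example.com/"], max_pages=3)
    print(list(pages))
```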
Chakrabarti et al. proposed the focused (topic-oriented) crawler [3]. It is composed of
a hypertext classifier, a distiller, and a crawler. The classifier makes appropriate
decisions about the expansion of links on crawled pages, while the distiller calculates
a measure of the centrality of crawled pages to determine visit priorities. The search
function of
Today's spammers target the URL to induce spamming in Web pages. Such URLs are called
malicious URLs. These URLs are difficult for the end user to detect, and user data is
illegitimately accessed. Malicious URLs have resulted in cyberattacks and unethical
behavior, such as breaches of confidential and secure content and the installation of
ransomware on user devices, causing massive losses worldwide each year. Benign URLs can
be converted into malign URLs by obfuscation, a technique used to mask malicious URLs.
It is reported that about 50 million Web site users visit malicious Web sites.
Blacklisting, heuristic classification, and other traditional techniques based on
keyword and URL syntax matching are some of the filtering mechanisms used to reveal
malicious URLs, but these techniques are inefficient at coping with modern technologies
and Web access techniques as well as at detecting modern URLs.
In response to queries, the crawler locates the related Web pages and downloads their
content, which is stored on disk for further processing. These results are usually
stored in a database repository in the form of an inverted index, hash tables, etc., to
help process future user queries. But the Web index generated must be compact, i.e., the
memory requirements for index storage should be small. The main challenges are improving
query performance by handling queries faster and providing faster results over trillions
of Web documents. Kanagavalli [4] discussed several data structures, categorized by
storage, process, and description, that are used for storing the data, and noted that
hash tables are mostly used as the storage structure. An efficient storage data
structure leads to the optimization of the search engine and ultimately accelerates the
whole process of generating the final results.
symmetric difference, and intersection. The methods add, delete, and contains have
constant time complexity O(1). A HashSet has an internal hash structure in which objects
can be easily searched and identified. It does not preserve the order of elements, and
there is no access by index; either an enumerator or built-in functions can be used to
access elements, and the built-in functions convert the HashSet to a list and iterate
through it. Iterating through a HashSet (or getting an object by index) is therefore
very slow, particularly in the case of large text queries. Moreover, HashSets are not
cache friendly.
A trie is a dynamic ordered tree data structure used to store a set or an associative
array in which the keys are normally strings. These strings are arranged in
lexicographic order. The search complexity for a key of length m is O(m) in the worst
case. Updating a trie is quite simple: insertion starts with a search, and when a node
with no matching edge to follow is reached, a node is added with the remaining string on
the edge to this node. A trie can be represented more compactly in a compressed form,
which merges every chain of edges that has no branches (the nodes between these edges
are of degree one) into one edge labeled with the concatenated characters of the merged
edges. In the particular case of a compact binary trie representing n strings, the total
number of nodes is 2n − 1, as in a full binary tree.
Tries are an incredibly useful data structure that exploits the prefixes of strings.
They help in searching for a value having the longest possible prefix matching a given
key, and they can also be used to determine the association of values with a group of
keys that share a common prefix. The name signifies the "retrieval" of data. Strings are
placed in a top-to-bottom manner based on their prefixes: all prefixes of length 1 are
stored at level 1, all prefixes of length 2 at level 2, and so forth. A trie is
therefore considered a better data structure than a HashSet for faster searching of
strings.
A trie is typically useful when dealing with a group of strings rather than individual
strings. The search, insert, and delete operations have complexity O(L), where L is the
length of a key. It is faster because of the way it is implemented: there is no need to
compute any hash function and no collision handling. It lists all words in alphabetical
order, and it is space efficient when storing lots of words that start with a similar
pattern, since shared prefixes are stored only once, which may reduce the overall
storage cost. Thus, a trie can quickly answer queries about words with shared prefixes,
resulting in efficient prefix queries.
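A minimal Python sketch of such a trie, with O(L) insert, search, and prefix queries, is given below; the class and method names are ours and are only meant to illustrate the kind of structure used for the Web repository.

```python
# Minimal trie: insertion and search walk one node per character (O(L) for a
# key of length L), and shared prefixes are stored only once.
class TrieNode:
    def __init__(self):
        self.children = {}      # character -> child node
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def search(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        """Prefix queries, e.g. all keywords sharing a common prefix."""
        return self._walk(prefix) is not None

    def _walk(self, key):
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

if __name__ == "__main__":
    trie = Trie()
    for word in ("book", "books", "bookspot", "sport"):
        trie.insert(word)
    print(trie.search("books"), trie.starts_with("book"), trie.search("boo"))
```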
2 Literature Survey
A URL has many features, such as lexical features, host-based features, content-based
features, popularity features, and context features, on the basis of which a spammed URL
can be detected. Table 1 shows a summary of related work.
Shishibori [5] discussed the implementation of binary tries as a fast access method
by converting binary tries into a compact bitstream for reducing memory space.
Moreover, he explained that it takes ample time to search (due to the pre-order
bitstream) and to update large key sets, thereby increasing the time and cost of each
operation for large key sets. Kanagavalli [4] discussed the various data structures
required in information retrieval over trillions of data items; the author explains that
data structures in this setting can be process-oriented, descriptive, or
storage-oriented, and that response time as well as the quality of the system defines
its performance. Steffen Heinz [6] proposed a new data structure called the burst trie
for string keys, which is faster than a trie but slower than a hash table. Shafiei [7]
discussed the Patricia tree for shared-memory systems by implementing insertion,
deletion, and replacement operations; this proposed work also supports the storage of
unbounded-length strings with flags but is avoided because it consumes a lot of memory.
Thenmozhi [8] analyzed the efficiency of various data usage models for tree- and
trie-based implementations under different hardware and software configurations, such
as the size of RAM and cache as well as the speed of the physical storage media.
Andersson [9] discussed the string searching problem using a suffix tree compressed at
the level, path, and data; it is very effective for large texts because it decreases the
number of accesses to slow secondary memory while simultaneously limiting main memory
usage. Roberto Grossi [10] proposed fast compressed tries through path decompositions
with less memory space and latency. Nilsson [11] implemented a dynamic compressed trie,
the LPC trie, with level and path compression; a comparison with a balanced BST showed
that the search time is better due to the small average depth, while the memory usage of
the balanced BST and the LPC trie is similar, so the LPC trie is a good choice for an
order-preserving data structure where very quick search operations are necessary.
Shishibori [12] proposed a strategy for compressing Patricia tries into a compact data
structure, a bitstream; however, the compact Patricia trie stores information about
eliminated nodes, so large storage is required to implement it. That study also
evaluates space and time efficiency. Isara Nakavisute [1] suggested an approach for
optimizing information retrieval (IR) time, or database search time, using a BST and a
doubly linked list. Mangla [13] proposed a method named context-based indexing in IR
systems using a BST, which solves the large search space problem by indexing, stemming,
and removal of stop words in the case of large documents.
all the attributes is 1. Then, for the detection of malicious URLs, multiple malicious
URLs are analyzed, and the threshold value is determined based on the sum of the weights
of the attributes present in the provided URLs; it is found to be 0.7. Thus, the system
derives a mathematical range from 0 to 1 and differentiates benign and malign URLs
statistically: zero depicts that the URL is benign, while a value greater than the
threshold of 0.7 shows that the URL is malign.
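The threshold rule can be sketched as follows. The attributes and weights below are hypothetical stand-ins (the paper's actual multifeatured attribute set and weights are not reproduced here); only the weighted-sum comparison against the 0.7 threshold reflects the description above.

```python
import re

# Illustrative weighted-attribute classifier: weights of attributes present in
# a URL are summed and compared against the 0.7 threshold. The attributes and
# weights are hypothetical, not the paper's feature set.
FEATURE_WEIGHTS = {                      # hypothetical weights, summing to 1
    "has_ip_address":     0.30,
    "excessive_length":   0.20,
    "many_special_chars": 0.20,
    "suspicious_token":   0.30,
}
THRESHOLD = 0.7

def extract_features(url):
    return {
        "has_ip_address": bool(re.search(r"//\d{1,3}(\.\d{1,3}){3}", url)),
        "excessive_length": len(url) > 75,
        "many_special_chars": sum(url.count(c) for c in "@-_%") > 5,
        "suspicious_token": any(t in url.lower()
                                for t in ("login", "verify", "free", "update")),
    }

def classify(url):
    score = sum(FEATURE_WEIGHTS[name]
                for name, present in extract_features(url).items() if present)
    return ("malign" if score > THRESHOLD else "benign"), round(score, 2)

if __name__ == "__main__":
    print(classify("http://www.modernlibrary.com/"))
    print(classify("http://192.168.10.5/free-update/verify-login@@@@account--now"))
```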
The proposed focused crawler works in two steps. The first step filters benign and
malign URLs on the basis of the selected feature set: malicious URLs are rejected, while
benign URLs are added to the search engine database. Then, a query is triggered, and the
results are displayed.
• Filtering of Benign and Malign URLs
Malign and benign URLs are filtered on the interface of the multi-featured malicious URL
detection system. After detection, malicious URLs are blocked, while benign URLs pass
the filter and are entered into the database.
• Fast Query Retrieval
Later, searching is performed by triggering a query on the search interface. As it is a
focused crawler, it limits the search to a domain. This interface leads to a window that
not only gives the search results but also compares the search time of the base crawler
and the proposed focused crawler. The base crawler uses a HashSet, and the
proposed focused crawler uses a trie as the storage unit during searching. Moreover, the
categorization theme of a focused crawler improves the search results. It also reduces
crawling time and saves database space (Fig. 2).
A database containing malign and benign URLs has been downloaded from
https://www.kaggle.com/antonyj453/urldataset/data. In the implementation of this
classifier, 50 URLs from different domains are tested on the malicious URL detection
interface of the crawler (Fig. 3).
This interface filters benign and malign URLs: malign URLs are blocked and not allowed
to enter the database, while benign URLs pass the anti-malicious barrier and are saved
in the database. These URLs then take part in the searching process. A number of queries
or keywords are passed into the search space on the searching interface, which leads to
a search engine result page after searching. This window shows a comparison of the base
crawler and the proposed focused crawler in terms of search time, based on the storage
data structures used during searching: HashSet for the base crawler and trie for the
proposed focused crawler.
See Table 3.
Table 3 (continued)
Sr. no | Domain | Statement of domain | Web address (URL of the Web site) | Detected | Actual
 | | | http://speyfoods.com/ | M | M
 | | | http://mcdonaldsrestaurantsducanadalte-273.foodpages.ca/ | NM | NM
 | | | http://o.foodpages.ca/ | NM | NM
3 | D3: Books | Best fiction novels | http://www.modernlibrary.com/ | NM | NM
 | | | http://www.bbc.com/ | NM | NM
 | | | http://muhich.pw/ | NM | NM
 | | | http://www.modernlibrary.com/ | NM | NM
 | | | http://www.bookspot.com/ | NM | NM
 | | Role of books in our life | http://mcxl.se/ | NM | NM
 | | | http://www.klientsolutech.com/ | NM | NM
 | | | http://www.rusevec.com/ | NM | NM
 | | | http://www.mynewsdesk.com/ | NM | NM
 | | | http://lifestyle.iloveindia.com/ | M | NM
4 | D4: Care centers | Animal care | http://www.pfafaridabad.com/ | NM | NM
 | | | http://www.sanjaygandhianimalcarecentre.org/ | NM | NM
 | | | http://smallanimalcarecenter.com/ | NM | NM
 | | | http://abhyastrust.org/ | NM | NM
 | | | http://www.animalandbirdvet.com/ | NM | NM
(continued)
Table 3 (continued)
Sr. no | Domain | Statement of domain | Web address (URL of the Web site) | Detected | Actual
 | | Health insurance | http://www.appleton-child-care.com/ | M | M
 | | | http://sunkeyinsurance.com/ | M | M
 | | | http://nycfootdr.com/ | NM | M
 | | | http://insurancecompaniesinnewyork.com/ | M | NM
 | | | http://www.kaiserinsuranceonline.com/ | NM | NM
5 | D5: Sports | Sports arena | http://richsportsmgmt.com/ | M | M
 | | | http://2amsports.com/ | M | M
 | | | http://opensportsbookusa.com/ | NM | NM
 | | | http://raresportsfilms.com/ | NM | NM
 | | | http://www.schultesports.com/ | NM | NM
 | | Sports famous in India | http://www.walkthroughindia.com/ | NM | NM
 | | | http://www.iloveindia.com/ | NM | NM
 | | | http://www.iaslic1955.org/ | NM | NM
 | | | http://www.indiaonlinepages.com/ | NM | NM
 | | | http://www.iccrindia.net/ | NM | NM
*Acronyms used in the table: M (malicious) and NM (non-malicious)
From the tested data, the values of true positives (TP), true negatives (TN), false
positives (FP), and false negatives (FN) are obtained, where
TP = the case was positive and predicted positive, i.e., benign URLs correctly identified
TN = the case was negative and predicted negative, i.e., malign URLs correctly identified
FP = the case was negative but predicted positive, i.e., malign URLs passed as benign
FN = the case was positive but predicted negative, i.e., benign URLs flagged as malign
The obtained values of TP, TN, FP & FN from Table 4 are as follows.
See Table 5.
7 Analytical Study
The proposed focused crawler is a binary classifier, as it differentiates only two
classes, benign and malign. Different binary evaluation metrics are precision, recall,
false positive rate (FPR), false negative rate (FNR), detection accuracy, F-measure, and
AUC. This work uses three of them: accuracy, precision, and recall.
7.1 Accuracy
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall is the ratio of actually positive cases that are also identified as such; it is
calculated by dividing the number of correct positive predictions by the total number of
actual positives.

$$\text{Recall} = \frac{TP}{TP + FN}$$
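A small sketch computing the three metrics is given below. The confusion-matrix counts are illustrative values chosen to be consistent with the reported 90%, 92.3%, and 94.73% figures; they are not copied from Table 4, which is not reproduced here.

```python
# Evaluation metrics from confusion-matrix counts; the counts below are
# hypothetical values for 50 URLs, chosen only to match the reported figures.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

if __name__ == "__main__":
    tp, tn, fp, fn = 36, 9, 3, 2          # hypothetical counts for 50 URLs
    print(f"accuracy  = {accuracy(tp, tn, fp, fn):.2%}")    # 90.00%
    print(f"precision = {precision(tp, fp):.2%}")           # 92.31%
    print(f"recall    = {recall(tp, fn):.2%}")              # 94.74%
```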
Several queries are made for comparing the search time using HashSet and trie
storage data structures. Figure 4 graphically analyzes search time taken to search the
same keyword by using HashSet and trie data structures.
Studies by McGrath and Gupta (2008) and Kolari et al. (2006) suggested that a
combination of URL features should be used to develop an efficient classifier, and the
proposed focused crawler is based on this theory. The classifier is developed from a
combination of lexical features, JavaScript features, DHTML features, and
popularity-based features. It successfully detects malign URLs with an accuracy of 90%
and an error rate of 10%; the other metrics, precision and recall, are computed as 92.3%
and 94.73%, respectively. Moreover, storing the fetched data in a trie data structure
during searching leads to less search time compared to the base crawler, which uses a
HashSet data structure; this speeds up the query retrieval process.
This focused crawler can be made more resistant to spamming by adding a more robust URL
feature set. A study on short URLs can be done for effective detection and attack-type
identification, because they are the fastest-growing trend today on microblogging sites
and online social networks like Facebook, Twitter, Pinterest, etc. Implementing the
classifier via a machine-learning approach will make it more dynamic.
References
1. Nakavisute I, Sriwisathiyakun K (2015) Optimizing information retrieval (IR) time with doubly
linked list and binary/search tree (BST). Int J Adv Comput Eng Netw 3(12):128–133. ISSN
2320-2106
2. Lewandowski D (2005) Web searching, search engines and information retrieval. Inf Serv Use
25(3):137–147. IOS - 0167-5265
3. Soumen C, Van Den BM, Byron D (1999) Focused crawling: a new approach to topic-specific
web resource discovery. Comput Net J 1623–1640
4. Kanagavalli VR, Maheeha G (2016) A study on the usage of data structures in information
retrieval. http://www.researchgate.net/publication/301844333A
5. Shishibori M et al (1996) An efficient method of compressing binary tries. Int J Sci Res (IJSR)
4(2):2133–2138 (IEEE). 0-7803-3280-6
6. Heinz S, Zobel J, Williams HE (2002) Burst tries: a fast, efficient data structure for string keys.
ACM Trans Inf Syst 20(2):192–223
7. Shafiei N (2013) Non-blocking Patricia tries with replace operations. In: Proc. Int. Conf. Distrib.
Comput. Syst. IEEE, 1063-6927, pp 216–225
8. Thenmozhi M, Srimathi H (2015) An analysis on the performance of Tree and Trie based
dictionary implementations with different data usage models. Indian J Sci Technol 8(4):364–
375. ISSN 0974-5645
9. Andersson S, Nilsson A (1995) Efficient implementation of suffix trees. Softw Pract Exp CCC
25(2):129–141. 0038-0644
10. Grossi R, Ottaviano G (2014) Fast compressed tries through path decompositions. ACM J Exp
Algorithms 19(1):1.8.2–1.8.19
11. Nilsson S, Tikkanen M (1998) Implementing a dynamic compressed trie. In: Proc. WAE’98,
pp 1–12
12. Shishibori M et al (1997) A key search algorithm using the compact patricia trie. In: Int. Conf.
Intell. Process. Syst. ICIPS ’97. IEEE, pp 1581–1584
13. Mangla N, Jain V (2014) Context based indexing in information retrieval system using BST.
Int. J. Sci. Res. Publ 4(6):4–6. ISSN-2250-3153
14. Justin MA, Saul LK, Savage S, Voelker GM (2011) Learning to detect malicious URLs. ACM
Trans Intell Syst Technol 2(3):2157–6904
15. Xu S, Xu L (2014) Detecting and characterizing malicious websites. Dissertation, Univ. Texas
San Antonio, ProQuest LLC
16. Choi H, Zhu BB, Lee H (2011) Detecting malicious web links and identifying their attack
types. In: Proc. 2nd USENIX Conf. Web Appl. Dev. ACM, pp 1–12
17. Popescu AS, Prelipcean DB, Gavrilut DT (2016) A study on techniques for proactively iden-
tifying malicious URLs. In: Proc. 17th Int. Symp. Symb. Numer. Algorithms Sci. Comput.
SYNASC 2015, 978-1-5090-4/16, IEEE, pp 204–211
18. Mamun MSI et al (2016) Detecting malicious URLs using lexical analysis, vol 1. Springer Int. Publ. AG, pp 467–482. 978-3-319-46298-1_30. http://www.researchgate.net/publication/308365207
19. Vanhoenshoven F, Gonzalo N, Falcon R, Vanhoof K, Mario K (2016) Detecting malicious
URLs using machine learning techniques. http://www.researchgate.net/publication/31158202
20. Sridevi M, Sunitha KVN (2017) Malicious URL detection and prevention at browser level
framework. Int J Mech Eng Technol 8(12):536–541. 0976-6359
21. Chong C, Liu D, Lee W (2009) Malicious URL detection, pp 1–4
22. Patil DR, Patil JB (2018) Feature-based Malicious URL and attack type detection using multi-
class classification. ISC Int J Inf Secur 10(2):141–162. ISSN 2008-2045
23. Naveen INVD, Manamohana K, Verma R (2019) Detection of malicious URLs using machine
learning techniques. Int J Innov Technol Explor Eng (IJITEE) 8(4S2):389–393. ISSN 2278-
3075
24. Sahoo D et al (2019) Malicious URL detection using machine learning: a survey. Association
Comput Mach 1(1):1–37. arXiv:1701.07179v3
A Systematic Review of Log-Based Cloud
Forensics
Abstract Inexpensive devices that leverage cloud computing technology have proliferated
in the current market. With this increasing popularity and huge user base, the number of
cybercrimes has also increased immensely, and forensics of the cloud has become an
important task. However, due to the geographically distributed nature and multi-device
capability of the cloud computing environment, cloud forensics is challenging. The logs
generated by the cloud infrastructure provide the forensics investigator with major
hints that can be followed to reconstruct the chronology of a crime scene, which is
highly critical for investigating a case. But the logs are not easily accessible, or
they often fail to provide any critical clues due to poor logging practices. In this
paper, the importance of log-based cloud forensics is first discussed. Then, a taxonomy
based on a survey of the literature is furnished. Finally, the issues in existing
log-based cloud forensics schemes are outlined and open research problems are
identified.
1 Introduction
The untoward exploitation of the capability and the flexibility of the cloud computing
environment has brought in the need for cloud forensics [1]. The cloud computing
environment is not only capable of meeting minor general-purpose computing
requirements, but the tremendous power of the cloud computing environment can
be exploited by the malicious users to procure gigantic computing resources and
network bandwidth to launch various attacks on or off devices and applications. Thus,
there is a need for forensics investigation in the cloud computing environment. The
commonly used digital forensics practices do not apply to cloud forensics due to the
Logs are records generated by software under execution in some format as specified
by the developer. Each application, platform, or infrastructure usage is logged by the
CSP for various purposes such as but not limited to troubleshooting and malicious
activity tracking. In each of the cloud service models, logs are generated for every
possible access and execution of applications or any other services provided by the
CSP. So generated logs are maintained by the CSP. These logs have the potential
to reveal an enormous amount of information that might be required in various
scenarios as mentioned earlier [2]. Thus, these cloud infrastructure-generated logs
are used by cloud forensics investigators to reconstruct the sequence of activities
that have taken place in a cloud crime scene. A use case of logs generated by various
systems in a network is depicted in Fig. 1. The logs may be needed to track down
unauthorized access to a network by an unauthorized IP. In this scenario, the network
logs from a router or a firewall can be of tremendous help to find the details of
such intrusion and possibly prosecute the intruder in the court of law. In the cloud
computing environment, unauthorized access and other malicious activities, such
as sensitive data theft and causing damage to other businesses over the cloud, have
become quite common. In an investigation in the cloud computing environment,
the logs generated can give a promising direction to the investigation and may help
the prosecution to succeed as the logs generated in the cloud provide details of the
activities that have taken place in a cloud crime scene. The cloud forensics activity
that takes the help of the logs generated in the cloud is known as log-based cloud
forensics. The users rarely have full access to logs. The CSP holds exclusive access
to the logs generated in her cloud service infrastructure. But she may or may not be
obliged to grant access to the logs to her users [3]. As mentioned in earlier sections,
the cloud computing environment is spread all over the globe. It is a mammoth task
to track down the exact location where the generated logs sit. In a cloud forensics
scenario, the CSP may provide access to the logs on an order by the court of law.
Since the cloud computing environment encompasses multi-jurisdiction property,
it again, in turn, becomes very tough to acquire the desired logs for the forensics
investigation. This grant of access to the logs by the CSP to the investigators may
lead to sensitive data leaks of other cloud service users. This is one of the main
reasons why the CSPs do not tend to disclose the logs, as doing so might lead to a
breach of the SLA between the CSP and its users. Such a breach, in turn, may defame
the CSP and lead to it running out of business, let alone the jurisdictional chaos that the
CSP might have to face in case a cloud service user reports a breach of SLA to the
court of law. As per a report from the DPCI, 74% of the forensics investigators have
raised dependency on the CSP as a concern. Also, non-uniform logging structures lead to
difficulty in the identification and segregation of logs even if access is granted.
In this section, this work of review has been compared with similar works by other
researchers. Additionally, the proposed taxonomy constructed based on the literature
review for log-based cloud forensics has been provided (Table 1).
Continual forensics puts the forensicability of a system under test. It is the practice
of continuous analysis of the logs generated by a forensics sound system. Unlike the
post-incident forensics analysis of logs, in continuous forensics, the system logs are
continuously monitored for ill behavior in the system. This is a result of the develop-
ment in cloud services and the research in forensics readiness of the cloud systems.
Due to the ever-broadening of the cybercrime landscape, several contributions have
been made in the “forensic-by-design” attribute of cloud systems. Simou et al. [9]
in their work proposed a methodology for forensic readiness of the cloud systems.
They coined the term “forensicability” to describe a system or a service that can
be designed in a forensically sound manner. They have further identified the forensic
constraints which are the concepts to be realized to enable a cloud as forensic ready.
These constraints when implemented increase the forensicability of a cloud service.
Kebande et al. [10, 11] have proposed a botnet-based solution for gathering logs
from live systems in the cloud computing infrastructure. They proposed infecting
the virtual machines with non-malicious bots that would collect live logs from the
user virtual machines and provide logs for live analysis. Park et al. [12] described the
work environments incorporating cloud services as smart work environments. They
suggested that cloud services for such work environments must implement forensic
readiness as a pro-active measure of threat preemption. They further proposed their
pro-active forensics model that was based on their investigation of the literature. They
identified and analyzed 50 components of digital forensic readiness and designed 7
detailed areas. For validating the model designed by them, they undertook surveys
of their models by digital forensics professionals. Finally, they deduced the areas
that can be worked on for the existing systems to gain forensic readiness. Datta et al.
[13] proposed a machine learning-based system that ranks malicious users in a crime
scene. This ranking of the suspect IPs helps eliminate the need to investigate all the
IPs in a crime scene (Fig. 3).
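To make the idea of ranking suspect IPs concrete, the following is a minimal, hypothetical sketch (not the actual model of Datta et al. [13]): it scores each IP observed in access logs with a simple heuristic combining the failed-access ratio and request volume, then sorts the IPs so that the most suspicious ones are examined first. The field names are assumptions for illustration.

```python
from collections import Counter

def rank_suspect_ips(log_entries):
    """Rank IPs by a simple heuristic suspicion score.

    log_entries: iterable of dicts with hypothetical keys 'ip' and 'status'
    (real cloud logs vary by provider).
    """
    requests, failures = Counter(), Counter()
    for entry in log_entries:
        ip = entry["ip"]
        requests[ip] += 1
        if entry.get("status", 200) in (401, 403):   # failed or forbidden access
            failures[ip] += 1

    def score(ip):
        # Weight the failure ratio more heavily than raw request volume.
        return 0.7 * (failures[ip] / requests[ip]) + \
               0.3 * (requests[ip] / max(requests.values()))

    return sorted(requests, key=score, reverse=True)

# Toy usage: the IP with repeated failed accesses is ranked first.
logs = [
    {"ip": "10.0.0.5", "status": 401}, {"ip": "10.0.0.5", "status": 401},
    {"ip": "10.0.0.7", "status": 200}, {"ip": "10.0.0.5", "status": 200},
]
print(rank_suspect_ips(logs))
```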
De Marco et al. [14] stated the fact that breaches in the cloud by the cloud client
happen through the violation of the service level agreements. Thus, the pro-active detec-
tion of violation of the service level agreements (SLAs) has the potential to lead a
forensic investigation to confidence. Based on this, they emphasized the need for
automation of detection of SLA breaches. They further proposed a framework for
the development of forensic ready cloud systems. Their framework considered the
technical aspects of the SLA and monitored that the system fulfilled the service obli-
gations. Baror et al. [15] emphasized the need for forensic readiness in the cloud. The
authors stated that there must be some form of communication in natural human-
understandable language be it friendly or vindictive. Thus, they proposed a natural
language processing-based framework that analyzes the language of the malicious
entity and detects an ongoing cyber-attack.
Sporadic forensics refers to the forensics process that includes activities being
carried out as a response to an incident rather than a continuous process of forensics
preparedness activities being carried out in the cloud infrastructure. It is where the
forensics data is acquired at a later stage as opposed to continuous data acquisition
for future incidents of forensics as in continual forensics. Dykstra and Sherman [16]
proposed “Forensic Open-Stack Tools” (FROST) for upright log extraction from
Infrastructure-as-a-Service (IaaS) cloud platforms. FROST is capable of extracting
logs from virtual disks, APIs, and guest firewalls. The distinct feature of FROST
is that it operates in the cloud management plane and does not interact with the
guest operating system. It also ensures log data integrity by maintaining hash trees.
Marty [17] proposed a log-based cloud forensic framework that focuses solely on the
logs generated at different levels of the cloud computing environment. The proposed
model is carefully designed keeping in mind—when there is a need for logging, what
is being logged and how an event is being logged. The author also emphasizes
the non-existence of any standard log format and proposed the must-haves in a log
record such as the timestamp of the log record, key-value pair format of the log entry,
normalized values for making the logs ready for the analysis, etc. The author has
also focussed on log transportation, storing logs in centralized storage, archiving the
logs, and retrieving the logs when needed. Anwar and Anwar [18] showed that the
system-generated logs of a cloud system can be used in a forensic investigation.
They generated their log dataset by launching known attacks on Eucalyptus. Then,
they analyzed the data generated by the attacks and built a system that could detect
further such attacks on Eucalyptus. Roussev et al. [19] showed that the traditional
forensic practices on the client-side are inefficient to be employed in cloud forensics
and it requires a new toolset for efficient forensics in the cloud computing envi-
ronment. Further, the authors developed and demonstrated tools for forensics anal-
ysis of the cloud. They proposed “kumodd” for remote acquisition of cloud drives,
“kumodocs” for acquisition and analysis of Google docs, and “kumofs” for remote
visualization of cloud drives. Ahsan et al. [20] stated that the existing systems focus
on the forensics of the traditional systems rather than the cloud computing environ-
ment. The authors then proposed their logging service system called “CLASS: cloud
log assuring soundness and secrecy scheme for cloud forensics.” In their proposed
logging system, individual users encrypt their logs with the help of their public
key so that only the users can decrypt their logs using their private
key. To avert unsanctioned alteration of logs, the authors generated a “Proof of Past
Log (PPL)” by implementing Rabin’s Fingerprint and Bloom Filter. Park et al. [21]
affirmed that despite the extensive research and development in the field of cloud
forensics, there still exist problems that have not yet been addressed. Maintaining
the integrity of the log is one such area. To solve this problem of log integrity, they
proposed their blockchain-based logging and integrity management system. Khan
and Varma [22] in their work proposed a framework for cloud forensics taking into
consideration the service models in the cloud. Their proposed system implemented
pattern search and used machine learning for the extraction of features in the cloud
logs. This implementation of machine learning in their work enabled the prioritiza-
tion of shreds of evidence collected from the virtual machines in the cloud. Rane
and Dixit [23] emphasized that there exists a great deal of trust in the third party in
log-based cloud forensics such as acquiring the logs from the cloud service provider
(a third party in the cloud forensics investigation). Stakeholders may collude to alter
the logs for their benefit. Thus, to solve this problem, the authors proposed their
forensic aware blockchain-based system called “BlockSLaaS: blockchain-assisted
secure Logging-as-a-Service” for the secure storage of logs and solving the collusion
problem. Further, they claimed that their proposed system provides the preservation
of log integrity (Table 2).
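As an illustration of the logging guidelines attributed to Marty [17] above (timestamped, key-value, normalized entries), the following minimal sketch shows how an application might emit such a record; the field names and format are assumptions for illustration, not a prescribed standard.

```python
import json
from datetime import datetime, timezone

def make_log_record(event, user, source_ip, severity="info"):
    """Build a normalized, timestamped key-value log record (illustrative only)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # unambiguous UTC timestamp
        "event": event.lower(),          # normalized event name
        "user": user,
        "source_ip": source_ip,
        "severity": severity.lower(),    # normalized severity level
    }
    return json.dumps(record, sort_keys=True)

print(make_log_record("LOGIN_FAILED", "alice", "203.0.113.7", "WARN"))
```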
The logs in the cloud computing environment are generated in all the levels of the
service model. In the SaaS level, the user gets little or no logs at all. In the PaaS
level, the user has access to some of the logs but the degree of detail in such logs is
limited. The IaaS level gives the highest access to logs that can be instrumental to the
process of forensics investigation in the cloud. It is the cloud service provider (CSP)
who determines what logging information is provided to the cloud user. The CSP has
exclusive control of the logs in the cloud computing environment. It is because of
this reason, in a forensics investigation, the CSP is requested to provide the relevant
logs for the investigation to proceed. The CSP is not an entity that can be completely
trusted as there is a chance of collusion and alteration of logs. In the case of collusion
among the CSP and the adversary, the investigation might not lead to confidence and
even the wrong person might get framed [24].
VM Logs
These logs provide detailed clues to what has been done during an attack by an
adversary. The logs might not be available, and a legal notice needs to be sent out.
But multijurisdictional issues might lead to the non-availability of such logs. Due to
the pivotal role of logs from VMs, several researchers have contributed to various
methods and suggestions for VM log gathering and analysis. Thorpe et al. [25]
proposed a tool called “Virtual Machine Log Auditor (VMLA).” VMLA can be used
by a forensics investigator to generate a timeline of events in the virtual machine
using the virtual machine logs. Zhang et al. [26] proposed a method for the detection
of active virtual machines and the extraction of system logs, process information,
user accounts, registry, loaded modules, network information, etc. They have exper-
imented and have been successful with current CPUs and operating systems such as
Fedora. Lim et al. [27] have emphasized the role of VMs in forensics investigation
and have presented suggestions on how to forensically investigate a virtual machine
based on their findings. Wahyudi et al. [28] in their research demonstrated that even
when a virtual machine is destroyed, the forensically relevant data can be recovered
from the host machine using the Autopsy tool and FTK.
Resource Logs
Performing forensics in the cloud computing environment not only requires the logs
from virtual machines and host machines, but the logs from other resources such as
load balancers, routers, and network firewalls are also essential. These resource logs
help the forensics investigator in the reconstruction of the crime scene. The acquisi-
tion of such logs is a challenge and demands trust in the cloud service provider as
the cloud infrastructure after all is owned and maintained by her. While surveying
the literature, it was found that there has been a significant amount of research and
development in this area too. Mishra et al. [29] emphasized the usefulness of the
resource logs along with the logs from the virtual machines and the host system. In
their work, they proposed the collection of virtual machine logs along with resource
logs and stored the logs in a database. They further demonstrated the implementation
of a dashboard for monitoring the logs for the identification of unusual activities in
the virtual machines and the resources that are being logged. They performed their
experiment in a private cloud implemented using Eucalyptus. Gebhardt and Reiser
[30] in their research, outlined the need for network forensics in the cloud computing
environment. Additionally, they have emphasized the challenges in network foren-
sics. To solve these problems, they have proposed a generic model for forensics
of the network in the cloud computing environment. They validated their proposed
model by implementing a prototype with “OpenNebula” and the analysis tool called
“Xplico.”
Logs are extracted from various levels of the cloud infrastructure as well as from
the devices that reside with the Internet service provider. During forensics investiga-
tion, logs are requested from the cloud service provider as well as from the Internet
service provider. But the issue of putting trust in a third party persists in such a log
acquisition process. To mitigate the issue of putting trust in a third party, Logging-
as-a-Service has emerged. This scheme of service gathers logs and provides access
to the logs to the forensics investigator through trusted, secure, and privacy-aware
mechanisms, thus keeping the dependency on untrusted parties to a minimum.
Khan et al. [31] have emphasized the importance of cloud logs and have proposed
a “Logging-as-a-Service” scheme for the storage of outwardly gathered logs. As the
deployment of a logging system is expensive due to the persistence of the gathered logs, they
have opted for a cloud-based solution. They deployed the logging service in the cloud
where they implement “reversible watermarking” for securing the logs. This kind of
watermarking is very efficient, and any tampering of the logs can be easily detected
by the virtue of it. The logs are collected using Syslog and the logs thus collected
are stored for a longer stretch of time. Muthurajkumar et al. [32] have accentuated
the pivotal role that logs play in forensics and the usefulness of extended storage of
logs. In their work, they have implemented a system using Java and Google Drive
for storing logs in a secure and integrity-preserving manner. The authors have implemented the
“Temporal Secured Cloud Log Management Algorithm” for maintaining log trans-
action history. The logs that they store are encrypted before storage. Batch storage of
logs is implemented by the authors for seamless retrieval of the stored logs. Liu et al.
[33] have outlined the importance and vulnerability of logs in the cloud computing
environment. Considering the risks that persist for the log databases, the authors have
proposed a blockchain-based solution for such log storage. The authors have imple-
mented the logging system where the integrity of the logs to be stored is first verified
and then the logs are stored in the log database and the hash of the logs are stored
in the blockchain. Users retrieve the hashes from the blockchain and store them in
a separate database called the “assistant database.” Then, the users send acceptance
of the logs to the cloud service provider. Finally, the cloud service provider stores
the acceptance in the log database. Patrascu and Patriciu [34] discuss the problem
of logs not being consolidated. They further propose a system for the consolidation
of cloud logs to help the forensics process in the cloud computing environment.
The framework proposed by the authors consists of five layers. The “management
layer” consists of a cloud forensics module and other cloud-related services. The
“virtualization layer” consists of all virtual machines, workstations, etc. The third
layer consists of the log data storage that is sent from the “virtualization layer.”
The raw log data is then analyzed in the fourth layer. Finally, in the fifth layer, the
analyzed and processed data are stored. Rane et al. [35] proposed an interplanetary
file system (IPFS)-based logging solution. The IPFS system is used to store network
and virtual machine log meta-data. The authors claim that their system provides
“confidentiality,” “integrity,” and “availability.” The authors maintain an index of
hashes from the IPFS system. Any tampering of data will result in a new hash which
will not be present in the index, thus providing integrity of the logs (Table 3).
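The hash-index idea behind the blockchain- and IPFS-based schemes described above can be illustrated with a minimal sketch (an assumption-level illustration, not any of the cited systems): the SHA-256 digest of each accepted log record is kept in a separate index, and tampering is detected when a recomputed digest is absent from that index.

```python
import hashlib

def digest(log_record: str) -> str:
    """Return the SHA-256 digest of a log record."""
    return hashlib.sha256(log_record.encode("utf-8")).hexdigest()

# Storage side: keep the digests of all accepted records in an index.
accepted_logs = ["2021-01-05T10:00:00Z user=alice action=login",
                 "2021-01-05T10:02:13Z user=bob action=delete-vm"]
hash_index = {digest(rec) for rec in accepted_logs}

# Verification side: a tampered record produces a digest missing from the index.
tampered = accepted_logs[1].replace("bob", "mallory")
print(digest(accepted_logs[0]) in hash_index)  # True  -> record intact
print(digest(tampered) in hash_index)          # False -> tampering detected
```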
Cloud services exhibit multi-device capability, i.e., the cloud services can be accessed
from different kinds of devices. A quite common example of such a cloud service
is cloud storage services such as Google Drive, Dropbox, etc. From the lens of
forensics investigation, all such devices accessing the cloud services need to be
examined for shreds of evidence. Several developments have been made in this area
of research. The logs found in such client devices have been termed as “somatic logs”
in the proposed taxonomy. Satrya and Shin [36] in their work proposed a method
for forensics investigation of the client devices and the Google Drive application on
Android devices. They demonstrated the forensics investigation by following the six
steps of a digital forensics investigation. They compared various logs generated in
the system such as login and logout data, install and uninstall data. Amirullah et al.
[37] performed forensics analysis of client-side applications on Windows 10 devices
to find remnant data on the device related to the crime scene. They discovered various
kinds of data in the device such as deleted data, install and uninstall data, browser
data, and memory data. They claim their success to be 82.63%. But they were unable
to analyze the remaining data on the device (Table 4).
Table 3 Recapitulation of contributions in CSP provided logs and LaaS provided logs
CSP provided logs
[25] Proposed system that generates timeline of events using VM logs
[26] Proposed system for extraction of logs
[27] Provided suggestions on how to perform forensics investigation of VMs
[28] Recovered evidences from deleted VMs using Autopsy tools and FTK
[29] Proposed system for acquisition and consolidation of logs
[30] Proposed a generic model for forensics of network using Xplico tool
LaaS provided logs
[31] Proposed “Logging-as-a-Service” system using “reversible watermarking” in cloud
[32] Proposed system for secure and integrity preserving persistence of logs using Google Drive
[33] Proposed blockchain-based solution for log storage and anonymous authentication
[34] Proposed an extensible system for consolidation of logs for existing clouds
[35] Proposed IPFS-based log storage system
Log-based cloud forensics faces several challenges. In this section, the challenges
faced by an investigator in log-based cloud forensics have been discussed.
• Synchronization of Timestamps
Timestamps in the cloud logs enable the forensics investigator to reconstruct the
chain of activities that have taken place in a crime scene in the cloud. By design,
cloud infrastructure is spread across the globe. As a result, logs from different
systems carry timestamps of their respective time zones. Thus, when logs from
different time zones are analyzed, correlating the timestamps becomes a mammoth
task (a minimal timestamp-normalization sketch is given after this list).
• Logs Spread Across Layers
Moving along in the order IaaS, PaaS, SaaS, the access to logs decreases, i.e., in
SaaS, the CSP provides the user with little or no log data. In PaaS, the user
gets access to some extent. The highest level of (comparatively) access to the logs
is given to the user in IaaS. There is no centralized access to the logs in the cloud
computing environment. Moreover, the IaaS user is only granted access to logs
that the cloud service provider deems suitable. For detailed logs of the network
and hardware, the cloud service provider must be requested and trusted.
• Volatile Logs
In the cloud environment, if the user can create and store data then she also can
delete the data. Because of the multi-locational nature of cloud computing, the
data present at different data locations are mapped to provide abstraction and to
provide the illusion of unity to the user. When data is deleted, its mapping is also
erased. This removal of the mapping happens in a matter of seconds, thus making
it impossible to get remote access to the deleted data in an investigation scenario
that partly relies on deleted data recovery.
• Questionable Integrity of Logs
The cloud service provider is the owner of most of the crucial logs and must be
requested for access to the logs in a forensics investigation. But the integrity of logs
that are provided by the cloud service provider is questionable. There is always
a chance of collusion among the parties involved in the forensics investigation.
Moreover, the cloud service providers are bound by the service level agreements
for the privacy and integrity of their clients. Thus, a cloud service provider will
not be obliged to breach the service level agreements fearing their running out of
business due to such a breach.
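As referenced in the timestamp-synchronization challenge above, the following minimal sketch (assuming Python 3.9+ and illustrative time-zone labels) normalizes log timestamps from different regions to UTC so that a single timeline can be reconstructed.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

def to_utc(local_timestamp: str, tz_name: str) -> datetime:
    """Parse a log timestamp recorded in a given time zone and normalize it to UTC."""
    naive = datetime.strptime(local_timestamp, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=ZoneInfo(tz_name)).astimezone(ZoneInfo("UTC"))

# Logs from data centres in different regions, with hypothetical zone labels.
events = [
    ("2021-03-01 09:15:00", "Asia/Kolkata",    "VM created"),
    ("2021-03-01 04:50:00", "Europe/London",   "firewall rule changed"),
    ("2021-02-28 23:40:00", "America/Chicago", "suspicious login"),
]

# Sorting on the normalized UTC times yields a single, coherent timeline.
for utc_time, _, what in sorted((to_utc(t, z), z, w) for t, z, w in events):
    print(utc_time.isoformat(), what)
```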
Despite the extensive research and development surveyed in this paper, there still
exist several unaddressed issues. Some of the open research areas that can be worked
on for the maturing of log-based cloud forensics are as follows.
• Forensics-as-a-Service
The volume of data that must be analyzed is huge. Analysis of such a huge volume
of data requires high computing power. The typical workstation-based practice
of forensics analysis needs to be changed for an improved turnaround time of
the cases. The elasticity of the cloud computing environment can be exploited to
process such huge volumes of data. Thus, cloud-based Forensics-as-a-Service can
play a pivotal role in the reduction of pending cases.
• Innovation of Tools
Cloud forensics is still practiced using traditional digital forensics tools. This
makes the process inefficient and at times leads to a complete halt of the cases.
Thus, there is an urgent need for specialized tools that suit the cloud computing
environment, that can handle the huge volumes to be analyzed, and automate the
tasks in the process of forensics analysis.
• Integrity of Log Files
As discussed in Sect. 3, the integrity of the logs provided by the cloud service
provider is questionable. There is an urgent need to come up with solutions that help
preserve the integrity of the log files. This is because, if the logs are modified
by any party, the case will not lead to confidence and there is a chance of an
innocent person being punished.
• Prioritization of Evidence to Be Analyzed
One of the major reasons for the high turnaround time of the cases is the exces-
sively high volumes of data that need to be examined. Thus, discarding irrelevant
data will speed up the process of examination. Hence, there is a need for smarter
and automated tools and techniques that prioritize the data relevant to a case.
5 Conclusion
The field of computing is evolving rapidly due to the introduction of cloud computing.
Cloud computing has given computing a new dimension which humankind can
leverage. Misuse of this tremendous power and flexibility also exists, and the cloud
itself can be used to deal with the adversaries. Cloud computing has the potential
to give a significant boost to forensic investigations, be it on the cloud or off the cloud.
Overall, the potential is phenomenal. In this work of review, it has been explained why
traditional digital forensics fails in a cloud computing environment. By undertaking
the survey, a taxonomy has been proposed with continual and sporadic forensics being
the two main types of cloud forensics. The sub-types of the proposed taxonomy have
been discussed in detail with the coverage of tools. The challenges in log-based
cloud forensics have been identified and discussed in detail. Finally, the areas of
open research and development in the field of log-based cloud forensics have been
identified.
References
8. Khan S et al (2016) Cloud log forensics: foundations, state of the art, and future directions. ACM
Comput Surv (CSUR) 49(1):1–42
9. Simou S et al (2019) A framework for designing cloud forensic-enabled services (CFeS).
Requirements Eng 24(3):403–430
10. Kebande VR, Venter HS (2015) Obfuscating a cloud-based botnet towards digital forensic
readiness. In: ICCWS 2015—the proceedings of the 10th international conference on cyber
warfare and security
11. Kebande VR, Venter HS (2018) Novel digital forensic readiness technique in the cloud
environment. Austral J Forens Sci 50(5):552–591
12. Park S et al (2018) Research on digital forensic readiness design in a cloud computing-based
smart work environment. Sustainability 10(4):1203
13. Datta S et al (2018) An automated malicious host recognition model in cloud forensics. In:
Networking communication and data knowledge engineering. Springer, Singapore, pp 61–71
14. De Marco L et al (2014) Formalization of slas for cloud forensic readiness. In: Proceedings of
ICCSM conference
15. Baror SO, Hein SV, Adeyemi R (2020) A natural human language framework for digital forensic
readiness in the public cloud. Austral J Forensic Sci 1–26
16. Dykstra J, Sherman AT (2013) Design and implementation of FROST: digital forensic tools
for the OpenStack cloud computing platform. Digital Invest 10:S87–S95
17. Marty R (2011) Cloud application logging for forensics. In: Proceedings of the 2011 ACM
symposium on applied computing
18. Anwar F, Anwar Z (2011) Digital forensics for Eucalyptus. In: 2011 Frontiers of information
technology. IEEE
19. Roussev V et al (2016) Cloud forensics–tool development studies & future outlook. Digital
Investigation 18:79–95
20. Ahsan MAM et al (2018) CLASS: cloud log assuring soundness and secrecy scheme for cloud
forensics. IEEE Trans Sustain Comput
21. Park JH, Park JY, Huh EN (2017) Block chain based data logging and integrity management
system for cloud forensics. Comput Sci Inf Technol 149
22. Khan Y, Varma S (2020) Development and design strategies of evidence collection framework in
cloud environment. In: Social networking and computational intelligence. Springer, Singapore
23. Rane S, Dixit A (2019) BlockSLaaS: blockchain assisted secure logging-as-a-service for cloud
forensics. In: International conference on security & privacy. Springer, Singapore
24. Alex ME, Kishore R (2017) Forensics framework for cloud computing. Comput Electr Eng
60:193–205
25. Thorpe S et al (2011) The virtual machine log auditor. In: Proceeding of the IEEE 1st
international workshop on security and forensics in communication systems
26. Zhang S, Wang L, Han X (2014) A KVM virtual machine memory forensics method based
on VMCS. In: 2014 tenth international conference on computational intelligence and security.
IEEE
27. Lim S et al (2012) A research on the investigation method of digital forensics for a VMware
Workstation’s virtual machine. Math Comput Model 55(1–2):151–160
28. Wahyudi E, Riadi I, Prayudi Y (2018) Virtual machine forensic analysis and recovery method
for recovery and analysis digital evidence. Int J Comput Sci Inf Secur 16
29. Mishra AK, Pilli ES, Govil MC (2014) A Prototype Implementation of log acquisition in
private cloud environment. In: 2014 3rd international conference on eco-friendly computing
and communication systems. IEEE
30. Gebhardt T, Reiser HP (2013) Network forensics for cloud computing. In: IFIP international
conference on distributed applications and interoperable systems. Springer, Berlin
31. Khan A et al (2017) Secure logging as a service using reversible watermarking. Procedia Comput
Sci 110:336–343
32. Muthurajkumar S et al (2015) Secured temporal log management techniques for cloud. Procedia
Comput Sci 46:589–595
33. Liu J-Y et al (2019) An anonymous blockchain-based logging system for cloud computing. In:
International conference on blockchain and trustworthy systems. Springer, Singapore
34. Patrascu A, Patriciu V-V (2015) Logging for cloud computing forensic systems. Int J Comput
Commun Control 10(2):222–229
35. Rane S et al (2019) Decentralized logging service using IPFS for cloud infrastructure. Available
at SSRN 3419772
36. Satrya GB, Shin SY (2018) Proposed method for mobile forensics investigation analysis of
remnant data on Google Drive client. J Internet Technol 19(6):1741–1751
37. Amirullah A, Riadi I, Luthfi A (2016) Forensics analysis from cloud storage client application
on proprietary operating system. Int J Comput Appl 143(1):1–7
Performance Analysis of K-ELM
Classifiers with the State-of-Art
Classifiers for Human Action Recognition
Abstract Recent advances in computer vision have drawn much attention toward
human activity recognition (HAR) for numerous applications such as video games,
robotics, content retrieval, video surveillance, etc. Identifying and tracking
human actions with the wearable sensor devices (WSD) generally used
today is difficult in terms of precision and fast automatic recognition due to the regular
change of body movements by the human. Primarily, the HAR system preprocesses
the WSD signal, and then six sets of features are extracted from wearable sensor
accelerometer data that are viable from the computational viewpoint. In the end, after
the crucial dimensionality reduction process, the selected features are utilized by the
classifier to ensure high human action classification results. In this paper, the
performance of the K-ELM classifier-based deep model for the selected features
is analyzed against state-of-the-art classifiers such as artificial neural
network (ANN), k-nearest neighbor (KNN), support vector machines (SVM) and
convolutional neural network (CNN). The experimental results obtained by analyzing
performance using metrics such as precision, recall, F-measure, specificity and
accuracy show that K-ELM outperforms most of the above-
mentioned state-of-the-art classifiers while taking less time.
1 Introduction
exploited for security as well as for various smart environmental applications [1].
The recognition of human action can be analyzed only through constant moni-
toring by the methods shown in Fig. 1; among these, it has recently been attained
mostly by researchers using a wearable sensor device (WSD) [2].
Furthermore, among the various HAR systems developed with internal and external
sensors for posture and motion estimation, accelerometers and gyroscopes are most
precisely used by researchers [3]. Among these, accelerometers are the sensor most
commonly used in wearable devices, owing to their noted merits such as miniature
size, low cost and power requirements, and their ability to deliver data directly
related to the motion of people [4]. The signal logged by the accelerometer depends
on the human activity and the device location, and the increasing use
of accelerometers for HAR must address certain inadequacies such as positioning
issues and usability concerns [5].
The accelerometer sensor-based reliable HAR system requires an efficient clas-
sifier to speed up the recognition process and its accuracy, and time taken by each
classifier is a major constraint issue [6]. Therefore, quick classification of human
action is necessary to overcome the drawback of conventional classifiers used in
processing signals, as the signal is processed as a time series and needs to remain as continuous
as possible [7]. Furthermore, the recent research studies related to HAR make use of
classifiers such as k-nearest neighbor (kNN), support vector machines (SVM), super-
vised learning Gaussian mixture models (SLGMM), random forest (RF), k-means,
Gaussian mixture models (GMM) and hidden Markov model (HMM) [8]. Although
the advancements in recognizing daily living activities like standing, sitting, sitting
on the ground, lying down, lying, walking, stair mounting and standing up are done
through various approaches, automated HAR is still inadequate due to minor classi-
fication inaccuracies [9]. These issues have drawn us toward the analysis of standardized
classifier evaluation based on WSD for multiple applications, owing to the difficulty in
characterizing a promising classifier for a human action recognition system [4].
The main contribution of this paper is to evaluate the performance of the K-ELM
classifier-based deep model by comparing it with that of the conventional state-of-art
classifiers by using the real-world dataset, which was collected by W. Ugulino’s team
using wearable accelerometers.
Fig. 1 Human action recognition (HAR) approaches: vision based, wearable sensor based, object tagged, and dense sensing
The human action recognition includes the following
process: (i) Accelerometer sensor placement (ii) Preprocessing (iii) Feature extrac-
tion (iv) Feature selection (v) Classification. The results obtained by the classifiers are
evaluated based on metrics such as F-measure, recall, precision and accuracy. This
paper is organized as follows: In Sect. 2, background on human action recognition
systems by various research scholars is addressed. In Sect. 3, the adopted K-ELM
classifier-based HAR with the above five steps is described. In Sect. 4, the exper-
imental results for the suggested and state-of-the-art classifiers are discussed. Finally,
with a short description, the perspectives of the paper are concluded in Sect. 5.
2 Related Work
Table 1 Classification accuracy of state-of-the-art classifiers in related works (Sect. 2)
Author | Year | Classifier | Classification accuracy
Sheng et al. [10] | 2020 | Extended region-aware multiple kernels learning (ER-MKL) | About 70.5%
Weiyao et al. [11] | 2016 | Kernel-based extreme learning machine classifier | About 94.5%
Jaouedi et al. [12] | 2019 | Recurrent neural networks to predict human action | About 86%
Xiao et al. [13] | 2019 | Convolutional neural network | About 0.6212, 0.6637, 0.9216 and 0.894 for different datasets
Zhang et al. [14] | 2019 | Deep belief network as classifier | About 74.59%, 89.53%, 87.03%, 90.66% based on the different features
Zerrouki et al. [15] | 2018 | AdaBoost algorithm | About 96.56%, 93.91%, 96.56%, 93.91% for different datasets
Feng-Ping et al. [16] | 2018 | SVM classifier | About 92.1%, 91.3%, 91.2%, 79.8%, 88.3%, 55.2% for different datasets
knowledge utilized. They make use of a JHMDB and UCF Sports datasets to evaluate
the performance of the proposed ER-MKL strategy in comparison with the other
conventional classifiers.
Weiyao et al. in [11] has suggested an effective framework by modeling the
multilevel frame select sampling (MFSS) model to sample the input images for
recognizing human action. Then the motion and static maps (MSM) method, block-
based LBP feature extraction approach and fisher kernel representation are used to
get the motion and static history, texture extraction and combining the block features,
respectively. By analyzing the key parameters such as τ and MSM thresholds, it was
proved that the 3-level temporal level was effective in recognizing human action
than the others. The evaluation of the proposed approach was carried out on three
publicly available datasets and, as a future suggestion, a convolutional neural network
and the NTU dataset were recommended.
Jaouedi et al. in [12] has introduced a HAR strategy by using the GMM-KF based
motion tracking and Recurrent Neural Networks model with Gated Recurrent Unit
for video sequencing. An important tactic used in this approach is to extract the
features from each and every frame of the video under analysis to achieve a better
human action recognition. The experiment outcome proves its high classification
rate and suggests an idea to minimize the video classification time for challenging
datasets like UCF Sport and UCF101 as a future scope.
Xiao et al. in [13] have suggested a new HAR approach that includes spatial decom-
position by three-level spatial pyramid feature extraction scheme and deep repre-
sentation extraction by the dual-aggregation scheme. Then by fusing both the local
and deep features, CXQDA based on Cosine measure and Cross-view Quadratic
Discriminant Analysis (XQDA) are utilized to categorize the human action. The
experimental outcome shows its effective performance than that of the conventional
strategies.
Zhang et al. in [14] has suggested a DBN based electromyography (EMG) signal
classifier for time-domain features for 20 human muscular actions. By means of the
best set of features, a 4-class EMG signal classifier was designed for a user interface
system mainly used in potential applications. Due to the high variance of the EMG signal for
multiple features, it was difficult to choose the optimal classifier; hence, they suggest
optimizing the structural parameters of the DBN with dominant features for real-time
multi-class EMG signal recognition of human muscular actions.
Zerrouki et al. in [15] have introduced video camera monitoring along with
an adaptive AdaBoost classifier-based human action recognition strategy in their paper. By
partitioning the human body into 5 partitions, six classes of activities such as walking,
standing, bending, lying, squatting, and sitting are analyzed during the recognition
process. To evaluate the performance, the Universidad de Malaga fall detection dataset
(URFDD) was utilized, and to demonstrate its effectiveness they compared it with
conventional classifiers such as a neural network, k-nearest neighbor, support vector
machine and naive Bayes. Finally, as a future direction, they suggest using an automatic
updating method and infrared- or thermal-equipped cameras to ease the recognition
process in dusky environments.
Feng-Ping et al. in [16] has developed a deep learning model based on MMN
and Maxout activation function for human action recognition. The suggested
approach guarantees stable gradient propagation, avoids a slow convergence process
and improves the image recognition performance. Here, high-level space–time
features from the sequences are extracted and finally classified with a two-layer
neural network structure-trained support vector machine. The type of human action
and multi-class action recognition can be achieved through the RBM-NN approach.
The multi-class human action recognition was evaluated by means of 3 sets of
datasets and proved to be quicker and more accurate than the conventional
multi-class action recognition approaches.
3 Proposed Methodology
The main objective here is to carry out the performance analysis of a K-ELM deep
model aimed at human action recognition (HAR), originating from wearable sensor
device motion analysis, for the selected set of features. The features are extracted
with the help of a multilayer extreme learning machine (ML-ELM) and
finally classified with a kernel extreme learning machine (K-ELM) classifier,
which has the advantage of a convolutional neural network (CNN), to overcome the
instability in ELM. The proposed strategy is detailed in the
sections below.
ELM is one of the successful feed-forward regression classifiers that suits large-
scale video or motion analysis tasks well. Conventional neural networks involve
hidden layers and include mapping through the back-propagation algorithm and least-
squares approaches while learning. In ELM, however, the learning problem is converted
into a direct scheme whose weight matrices are evaluated through a generalized
inverse operation (Moore–Penrose pseudoinverse), i.e., it will assign only
the hidden neurons and randomize the weights as well as the bias between the input and
hidden layers to evaluate the output matrix during the execution process. Finally, the
Moore–Penrose pseudoinverse method, under the principle of the least-squares method,
helps to attain the weights between the final hidden and output layers. This direct
learning scheme, with a small norm of weights and low error, processes quickly with
superior classification competence compared to conventional learning strategies.
Figure 2 indicates the ELM network with ‘n’, ‘l’ and ‘m’ numbers of neurons in the
input, hidden and output layers, respectively. Let us assume that the network is given
an input sample as described below.
Fig. 2 Structure of the ELM network: an input layer, a hidden layer with output H = g(ωX + b), and an output layer connected through weights β
The input feature of the above sample x and its desired matrix y are represented
as follows,
In the above equations, ‘n’ and ‘m’ denote the input and output matrix dimensions. The
randomized weight $w_{ij}$ between the input and hidden layer is expressed as follows:

$$
w_{ij} = \begin{bmatrix}
w_{11} & w_{12} & \cdots & w_{1n} \\
w_{21} & w_{22} & \cdots & w_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
w_{l1} & w_{l2} & \cdots & w_{ln}
\end{bmatrix} \tag{6}
$$
Likewise, the weight $\beta_{jk}$ made by the ELM between the hidden and output
layer is represented as follows:

$$
\beta_{jk} = \begin{bmatrix}
\beta_{11} & \beta_{12} & \cdots & \beta_{1m} \\
\beta_{21} & \beta_{22} & \cdots & \beta_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
\beta_{l1} & \beta_{l2} & \cdots & \beta_{lm}
\end{bmatrix} \tag{7}
$$
The bias made by the ELM for the hidden layer neurons is expressed as
$B = [b_1\; b_2\; \ldots\; b_l]^T$, the network activation function is represented as $g(x)$, and the
output matrix is represented as $T = [t_1\; t_2\; \ldots\; t_Q]_{m \times Q}$, i.e.,

$$
t_j = \begin{bmatrix} t_{1j} \\ t_{2j} \\ \vdots \\ t_{mj} \end{bmatrix}
= \begin{bmatrix}
\sum_{i=1}^{l} \beta_{i1}\, g(\omega_i x_j + b_i) \\
\sum_{i=1}^{l} \beta_{i2}\, g(\omega_i x_j + b_i) \\
\vdots \\
\sum_{i=1}^{l} \beta_{im}\, g(\omega_i x_j + b_i)
\end{bmatrix}, \quad j = 1, 2, 3, 4, \ldots, Q \tag{8}
$$
These $Q$ equations can be written compactly as $H\beta = T'$ (9), where $H$ represents the
hidden layer output and $T'$ is the transpose of $T$. To evaluate the weight matrix $\beta$ with
minimum error, the least-squares method is utilized:

$$
\beta = H^{+} T \tag{10}
$$

To regularize the term $\beta$ and, for the case where the hidden layer neurons are fewer than
the training samples, to obtain stabilized output results, $\beta$ is represented as follows:

$$
\beta = \left(\frac{1}{\lambda} + H^{T} H\right)^{-1} H^{T} T \tag{11}
$$

Similarly, for the case where the hidden layer neurons are more than the training samples,
$\beta$ is represented as follows:

$$
\beta = H^{T} \left(\frac{1}{\lambda} + H H^{T}\right)^{-1} T \tag{12}
$$
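A minimal NumPy sketch of the ELM training procedure summarized by Eqs. (6)–(12) is given below; the layer size, regularization constant and toy data are assumptions for illustration, not parameters from this paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, l=50, lam=1e3, seed=0):
    """Train a basic regularized ELM (cf. Eqs. (6)-(12)).

    X: (Q, n) input samples, T: (Q, m) targets, l: number of hidden neurons,
    lam: regularization constant lambda. Returns (w, b, beta).
    """
    rng = np.random.default_rng(seed)
    Q, n = X.shape
    w = rng.standard_normal((n, l))      # random input-to-hidden weights (Eq. 6)
    b = rng.standard_normal(l)           # random hidden-layer biases
    H = sigmoid(X @ w + b)               # hidden layer output matrix
    if Q >= l:                           # more samples than hidden neurons (Eq. 11)
        beta = np.linalg.solve(np.eye(l) / lam + H.T @ H, H.T @ T)
    else:                                # fewer samples than hidden neurons (Eq. 12)
        beta = H.T @ np.linalg.solve(np.eye(Q) / lam + H @ H.T, T)
    return w, b, beta

def elm_predict(X, w, b, beta):
    return sigmoid(X @ w + b) @ beta

# Toy usage: a two-class problem with six-dimensional inputs and one-hot targets.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
T = np.eye(2)[y]
w, b, beta = elm_train(X, T)
accuracy = np.mean(elm_predict(X, w, b, beta).argmax(axis=1) == y)
print(f"training accuracy: {accuracy:.2f}")
```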
The multilayer extreme learning machine (ML-ELM) consists of two or more hidden
layers with ‘l’ neurons and a single output layer. The activation function g(x) is selected
for the network layers; then the bias evaluation and weight updating for all the layers
between the input and output layers are done by using the following equations.
Let us assume that the two-hidden-layer ML-ELM shown in Fig. 3 has $(X, T) =
\{x_i, t_i\};\ i = 1, 2, 3, \ldots, Q$ training samples, in which $x$ denotes the input and $t$
denotes the considered sample. The hidden layer output can be evaluated by using the
following equation:

$$
H = g(wx + b) \tag{13}
$$
where w and b signify the randomly initialized weight and bias of the hidden layers,
the final layer output matrix was evaluated by using the following equation
Fig. 3 Structure of the ML-ELM network: an input layer, hidden layers with outputs H = g(w_H X) and H_2 = g(w_H H), and an output layer connected through weights β_new
$$
\beta = H^{+} T \tag{14}
$$

where $H^{+}$ signifies the Moore–Penrose inverse matrix of $H$. Let us assume that our
ML-ELM is designed with three hidden layers; its expected output and weight
matrix are evaluated by using the following Eq. (15):

$$
H_2 = T \beta_{\mathrm{new}}^{+} \tag{15}
$$

where $\beta_{\mathrm{new}}^{+}$ signifies the inverse of the weight matrix $\beta_{\mathrm{new}}$.

$$
W_{H1} = [B_1 \; W_1] \tag{16}
$$

where $W_{H1} = g^{-1}(H_2) H_1^{+}$; $W_2$ and $H_2$ are the weight and output between the second and
third hidden layers; $H_1^{+}$ is the inverse of $H_1 = [1 \; H_2]^{T}$, in which $1$ signifies a column
vector of size $Q$; and $g^{-1}(H_2)$ signifies the inverse activation function. Here, to evaluate the
performance, the logistic sigmoid function is adopted:
$$
g(x) = \frac{1}{1 + e^{-x}} \tag{18}
$$
Finally, for the last layer, i.e., the second hidden layer, the output weight matrix with fewer and
with more neurons than the training samples is evaluated by using the following
equations:

$$
H_4 = g(W_{H1} H_1) \tag{19}
$$

$$
\beta_{\mathrm{new}} = \left(\frac{1}{\lambda} + H_3^{T} H_3\right)^{-1} H_3^{T} T \tag{20}
$$

$$
\beta_{\mathrm{new}} = H_3^{T} \left(\frac{1}{\lambda} + H_3 H_3^{T}\right)^{-1} T \tag{21}
$$
where $f(x)$ is the actual final hidden layer output after parameter optimization
through all the inner layers present:

$$
f(x) =
\begin{cases}
h(x)\, H^{T} \left(\dfrac{I}{C} + H H^{T}\right)^{-1} T, & N < l \\[2ex]
h(x) \left(\dfrac{I}{C} + H^{T} H\right)^{-1} H^{T} T, & N \ge l
\end{cases} \tag{23}
$$
The six deep-learned time-domain features, namely mean value, standard deviation,
min-max, skewness, kurtosis and correlation, extracted by means of the above
equations, help us to gain a better action recognition rate in the subsequent classifica-
tion process. Though specific features are captured during the different aspects
of actions in the video, the synthesis of features before classification gives us distinct
characteristics.
The ELM is a highly efficient method with a faster classification process than
the conventional back-propagation strategies due to its capability of generating the
weights and bias randomly. The kernel-based extreme learning machine proposed by
Huang et al. in [17], following Mercer’s conditions, has been utilized here for the human action
classification process. The kernel matrix with unknown mapping function $h(x)$ is
defined as follows:

$$
\Omega_{\mathrm{ELM}} = H H^{T} = \begin{bmatrix}
k(x_1, x_1) & \cdots & k(x_1, x_N) \\
\vdots & \ddots & \vdots \\
k(x_N, x_1) & \cdots & k(x_N, x_N)
\end{bmatrix} \tag{24}
$$
By considering the above equations from (24) to (26), the output weight of the
K-ELM is evaluated by using the following Eq. (27), in which $\Omega_{\mathrm{ELM}}$ is the kernel
matrix of the input given to the K-ELM classifier:

$$
\beta = \left(\Omega_{\mathrm{ELM}} + \frac{I}{C}\right)^{-1} Y \tag{27}
$$
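A minimal sketch of kernel ELM training and prediction following Eqs. (24) and (27) is shown below; the RBF kernel choice, the constant C and the toy data are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix with entries k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def kelm_train(X, Y, C=100.0, gamma=0.5):
    """Solve beta = (Omega + I/C)^-1 Y as in Eq. (27), with Omega as in Eq. (24)."""
    omega = rbf_kernel(X, X, gamma)                      # kernel matrix, Eq. (24)
    return np.linalg.solve(omega + np.eye(len(X)) / C, Y)

def kelm_predict(X_train, X_new, beta, gamma=0.5):
    return rbf_kernel(X_new, X_train, gamma) @ beta      # f(x) = k(x, X) beta

# Toy usage: classify two synthetic "actions" from 6-dimensional feature vectors.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 6)), rng.normal(2, 1, (50, 6))])
y = np.array([0] * 50 + [1] * 50)
Y = np.eye(2)[y]                                         # one-hot action labels
beta = kelm_train(X, Y)
pred = kelm_predict(X, X, beta).argmax(axis=1)
print("training accuracy:", np.mean(pred == y))
```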
The accelerometer data collected by the WSD is given as an input to the HAR
system; it is assumed that data with a four-dimensional sequence with respect to time is
taken here for action recognition from the real-life dataset collected by W. Ugulino’s team
by means of four wearable accelerometers. After the acquisition of data, prepro-
cessing is carried out, which includes the dimensionality reduction and segmen-
tation of moving parts, i.e., sequencing the signal data into subsequences, commonly
termed the sliding window process, applied for sequential data partitioning.
Subsequently, after partitioning, the input sensor data proceeds to the
feature extraction process. Here, time-domain features are extracted for human
action recognition in this work by an ML-ELM. The accelerometer signal time
integrals are evaluated by means of a metric called the integral of the
modulus of accelerations (IMA), expressed as follows:

$$
\mathrm{IMA} = \sum_{t=1}^{N} |a_x|\,dt + \sum_{t=1}^{N} |a_y|\,dt + \sum_{t=1}^{N} |a_z|\,dt \tag{28}
$$

where $a_x, a_y, a_z$ are the orthogonal acceleration components, $t$ is the time and $N$ is the
window length.
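The IMA of Eq. (28) can be computed over one window of accelerometer samples roughly as in the sketch below; the sampling interval and the synthetic data are assumptions.

```python
import numpy as np

def integral_of_modulus(ax, ay, az, dt=0.02):
    """Integral of modulus of accelerations (Eq. 28) over one window.

    ax, ay, az: arrays of orthogonal acceleration components for one window
    dt: sampling interval in seconds (50 Hz assumed here)
    """
    return dt * (np.sum(np.abs(ax)) + np.sum(np.abs(ay)) + np.sum(np.abs(az)))

# Example: one 2-second window of synthetic 3-axis accelerometer data at 50 Hz.
rng = np.random.default_rng(0)
window = rng.normal(0, 1, (100, 3))
print(integral_of_modulus(window[:, 0], window[:, 1], window[:, 2]))
```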
Then, from the extracted features, a set of six time-domain features is selected,
which includes mean value, standard deviation, min-max, skewness, kurtosis and
correlation, to differentiate from the original set of samples and to ease the further
classification process in less time. Finally, the K-ELM classifier is used to classify human
actions based on the selected set of features with less error. The performance of the
K-ELM hierarchical classifier is compared with those of standard classifiers such
as an artificial neural network (ANN), k-nearest neighbor (KNN), support vector
machines (SVM) and convolutional neural network (CNN) based on classification
accuracy, as discussed in the section below.
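The six time-domain features named above (mean, standard deviation, min-max range, skewness, kurtosis and inter-axis correlation) can be computed per sliding window roughly as follows; the window length, overlap and synthetic recording are assumptions for illustration.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def window_features(window):
    """Six time-domain feature types for one window of shape (samples, 3 axes)."""
    feats = []
    for axis in range(window.shape[1]):
        sig = window[:, axis]
        feats += [
            sig.mean(),                 # mean value
            sig.std(),                  # standard deviation
            sig.max() - sig.min(),      # min-max range
            skew(sig),                  # skewness
            kurtosis(sig),              # kurtosis
        ]
    corr = np.corrcoef(window.T)        # inter-axis correlations (x-y, x-z, y-z)
    feats += [corr[0, 1], corr[0, 2], corr[1, 2]]
    return np.array(feats)

def sliding_windows(signal, size=100, step=50):
    """Yield 50%-overlapping windows over a (samples, axes) signal."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]

# Example: feature matrix for a synthetic 3-axis accelerometer recording.
rng = np.random.default_rng(0)
recording = rng.normal(0, 1, (1000, 3))
X = np.array([window_features(w) for w in sliding_windows(recording)])
print(X.shape)   # (number of windows, number of features)
```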
data. The performance evaluation criteria used here for analysis include precision,
recall, F-measure, specificity and accuracy.
$$
\text{Accuracy} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n} \tag{29}
$$

where
$T_n$ (true negative)—truly classified negative samples,
$T_p$ (true positive)—truly classified positive samples,
$F_n$ (false negative)—faultily classified positives,
$F_p$ (false positive)—faultily classified negatives.

F-measure is the integration of both recall and precision, and it is expressed
as follows:

$$
\text{Precision} = \frac{T_p}{T_p + F_p} \tag{30}
$$

$$
\text{Recall} = \frac{T_p}{T_p + F_n} \tag{31}
$$

$$
\text{F-Score} = \frac{(1 + \beta^2) \cdot \text{recall} \cdot \text{precision}}{\beta^2 \cdot \text{recall} + \text{precision}} \tag{32}
$$

$$
\text{Specificity} = \frac{T_n}{T_n + F_p} \tag{33}
$$
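A small sketch computing the metrics of Eqs. (29)–(33) from confusion-matrix counts (with β = 1, i.e., the usual F1 score) is given below; the counts are illustrative.

```python
def classification_metrics(tp, tn, fp, fn, beta=1.0):
    """Accuracy, precision, recall, specificity and F-score per Eqs. (29)-(33)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f_score = (1 + beta**2) * recall * precision / (beta**2 * recall + precision)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "specificity": specificity, "f_score": f_score}

# Illustrative confusion counts for one action class.
print(classification_metrics(tp=90, tn=85, fp=10, fn=5))
```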
an F-score of about 94.60–93.21%, respectively, with a time taken of about 15–30 min
to classify the human action. The proposed K-ELM classifier has an
F-score of about 98.24% with a time taken of about 20 min; from this analysis,
K-ELM attains an efficient, high F-score value in more or less the same time as the
other classifiers under analysis.
Similarly, the recall, specificity and accuracy of the proposed K-ELM approach in
comparison with the state-of-art classifiers are shown in Figs. 4, 5 and 6, respectively.
The selected set of features helps to characterize the human actions (sit, walk, upstairs,
stand and downstairs) better than the conventional classifiers, for reasons such as
the sensor used and procedural variances meant for validation. In the case of ANN,
KNN, SVM, CNN and K-ELM, the recall of about 91.21%, 94.57%, 90.98%, 93.45%
and 97.458% is obtained while recognizing the action. Similarly, in the case of
specificity and accuracy, the values obtained are about 97.07%, 99.67%, 96.56%,
96.57%, 98.56% and 91.83%, 94.62%, 90.33%, 95.72%, 98.97% correspondingly
by the classifiers under analysis. It should be noted that the analysis using the selected
six sets of features as input to the K-ELM classifier shows better performance than
the others in this study in recognizing the human actions.
5 Conclusion
In this paper, the performance analysis of the proposed K-ELM classifier has been
presented with a selected set of features against the conventional state-of-the-art classi-
fiers by using W. Ugulino’s accelerometer dataset. The human action recognition
process is described with the corresponding equations utilized for feature extraction
and classification. Finally, a comparative analysis of K-ELM, presented by using
the selected set of time-domain features (mean value, standard deviation, min-max,
skewness, kurtosis and correlation) as an input, shows more effective results than
the other ANN, KNN, SVM and CNN approaches. From the analysis, the integra-
tion of the above classifiers as a future direction would perform better with accurate
References
1. Bayat A, Pomplun M, Tran D (2014) A study on human activity recognition using accelerometer
data from smartphones. Procedia Comput Sci 34:450–457
2. Casale P, Oriol P, Petia R (2011) Human activity recognition from accelerometer data using a
wearable device. In: Iberian conference on pattern recognition and image analysis. Springer,
Berlin, pp 289–296
3. Pantelopoulos A, Bourbakis N (2010) A survey on wearable sensor-based systems for health
monitoring and prognosis. IEEE Trans Syst Man Cybern Part C (Appl Rev) 40:1–12
4. Jordao A, Antonio C, Nazare Jr., Sena J, Schwartz WR (2018) Human activity recognition
based on wearable sensor data. A standardization of the state-of-the-art. arXiv preprint arXiv:
1806.05226
5. Cleland I, Kikhia B, Nugent C, Boytsov A, Hallberg J, Synnes K, McClean S, Finlay D
(2013) Optimal placement of accelerometers for the detection of everyday activities. Sensors
13:9183–9200
6. Poppe R (2010) A survey on vision-based human action recognition. Image Vis Comput
28:976–990
7. Vishwakarma S, Agrawal A (2012) A survey on activity recognition and behavior understanding
in video surveillance. Vis Comput 29:983–1009
8. Tharwat A, Mahdi H, Elhoseny M, Hassanien A (2018) Recognizing human activity in mobile
crowdsensing environment using optimized k-NN algorithm. Expert Syst Appl 107:32–44
9. Ignatov A (2018) Real-time human activity recognition from accelerometer data using
convolutional neural networks. Appl Soft Comput 62:915–922
10. Weiyao X, Muqing W, Min Z, Yifeng L, Bo L, Ting X (2019) Human action recognition using
multilevel depth motion maps. IEEE Access 7:41811–41822
11. Jaouedi N, Boujnah N, Bouhlel M (2020) A new hybrid deep learning model for human action
recognition. J King Saud Univ Comput Inf Sci 32:447–453
12. Xiao J, Cui X, Li F (2020) Human action recognition based on convolutional neural network
and spatial pyramid representation. J Vis Commun Image Rep 71:102722
13. Zhang J, Ling C, Li S (2019) EMG signals based human action recognition via deep belief
networks. IFAC-Pap OnLine 52:271–276
14. Zerrouki N, Harrou F, Sun Y, Houacine A (2018) Vision-based human action classification
using adaptive boosting algorithm. IEEE Sens J 18:5115–5121
15. An F (2018) Human action recognition algorithm based on adaptive initialization of deep
learning model parameters and support vector machine. IEEE Access 6:59405–59421
16. Huang G-B, Zhou H, Ding X, Zhang R (2012) Extreme learning machine for regression and
multiclass classification. IEEE Trans Syst Man Cybern Part B (Cybern) 42:513–529
17. Niu X, Wang Z, Pan Z (2019) Extreme learning machine-based deep model for human activity
recognition with wearable sensors. Comput Sci Eng 21:16–25
Singular Value Decomposition-Based
High-Resolution Channel Estimation
Scheme for mmWave Massive MIMO
with Hybrid Precoding for 5G
Applications
1 Introduction
2 Related Works
This section explores the many existing techniques that address the computational complexity and feedback overhead challenges. Pazdanowski has proposed a new channel estimation scheme through parameter learning [4] for the channel estimation of mmWave massive MIMO systems. This work is fully based on the off-grid channel model, which is widely used to characterize spatial sample mismatch using the discrete Fourier transform (DFT) method for mmWave massive MIMO channel estimation. The main limitation of this channel estimation scheme is that it only estimates the off-grid parameters of mmWave massive MIMO channels.
Qi et al. have proposed an off-grid method to estimate the channel for massive MIMO systems over the mmWave band [5]. The major advantage of this method is that the pilot overhead is decreased. The system employs an off-grid, scenario-based sparse signal reconstruction scheme, and the accuracy of channel estimation is considerably improved. The channel estimation is separated into stages, with AoA and AoD estimation followed by path gain estimation. However, the accuracy of this algorithm is comparatively low in grid point construction, and the minimization of the objective function is not refined by suppressing the off-grid effects. Wang et al. have proposed a multi-panel mmWave scheme with hybrid precoding for massive MIMO [6]. In this method, the channel vector is converted into the angular domain and the CSI is then restored by way of the formulated angular CSI. Exploiting the structural features of mmWave MIMO channels in the angular domain is always very difficult, and the major disadvantage of this method is that the computational complexity is not decreased. Qibo Qin et al. have proposed time-varying channel estimation for millimeter-wave massive MIMO systems [7]. In this method, the time-varying scattering nature of mmWave channels is used to estimate the AoAs/AoDs, and an adaptive angle estimation method is used to formulate the AoA/AoD estimation. Even so, the computational complexity of the separated channel estimation stages is very high. Jianwei Zhao et al. have proposed angle-domain hybrid precoding and a channel tracking method for mmWave massive MIMO systems [8]. The angle-domain hybrid precoding and mmWave channel tracking method is used to exploit the structural features of the millimeter-wave MIMO channel, and all users can be scheduled by one of the schemes based on their directions of arrival [9, 10]. The major limitation of this method is its high SNR error value; moreover, while retraining the system, the DoA tracking performance is not improved.
Consider a mmWave massive MIMO system with efficient hybrid precoding and an arbitrary array geometry. Let $N_T$, $N_T^{RF}$, $N_R$ and $N_R^{RF}$ denote the number of transmit antennas, transmitter RF chains, receive antennas and receiver RF chains, respectively [11–15]. In practical 5G systems with hybrid precoding, the number of RF chains is smaller than the number of antennas, i.e., $N_T^{RF} < N_T$ and $N_R^{RF} < N_R$. The system model is given by

$\mathbf{r} = \mathbf{Q}^H \mathbf{H}\mathbf{P}\mathbf{s} + \mathbf{n}$  (1)
$\mathbf{a}(\varphi^{\mathrm{azi}}, \varphi^{\mathrm{ele}}) = \left[1,\ e^{j2\pi d \sin\varphi^{\mathrm{azi}}\sin\varphi^{\mathrm{ele}}/\lambda},\ \ldots,\ e^{j2\pi (N_1-1) d \sin\varphi^{\mathrm{azi}}\sin\varphi^{\mathrm{ele}}/\lambda}\right]^T \otimes \left[1,\ e^{j2\pi d \cos\varphi^{\mathrm{ele}}/\lambda},\ \ldots,\ e^{j2\pi (N_2-1) d \cos\varphi^{\mathrm{ele}}/\lambda}\right]^T$  (3)
Here, the ith element of the transmitted signal x corresponds to the signal sent from the ith transmit antenna. In the mth time frame, the combining matrix $W_m$ for the users is obtained from the $N_R^{RF}$-dimensional received pilot arrangements:

$\mathbf{Y} = \mathbf{W}^H \mathbf{H}\mathbf{X} + \mathbf{N}$  (7)

Estimating the channel matrix H in (7) is equivalent to estimating the number of paths and the normalized AoA and AoD angles. Owing to the angle-domain sparsity, the channel matrix H is obtained by solving
$\min_{\mathbf{z},\theta_R,\theta_T} \|\hat{\mathbf{z}}\|_0 \quad \text{s.t.} \quad \|\mathbf{Y} - \mathbf{W}^H \hat{\mathbf{H}}\|_F \le \varepsilon$  (8)

where $\|\hat{\mathbf{z}}\|_0$ is the total number of nonzero elements of $\hat{\mathbf{z}}$, and $\varepsilon$ is the error tolerance parameter. The major disadvantage of this formulation is that solving the $l_0$-norm problem is not computationally efficient.
$\min_{\mathbf{z},\theta_R,\theta_T} F(\mathbf{z}) = \sum_{l=1}^{L} \log\left(|z_l|^2 + \delta\right) \quad \text{s.t.} \quad \|\mathbf{Y} - \mathbf{W}^H \hat{\mathbf{H}}\|_F \le \varepsilon$  (9)
where $\delta > 0$ ensures that Eq. (10) below is well defined, and $\mathbf{z}$, $\theta_R$ and $\theta_T$ are the parameters used to determine $\hat{\mathbf{H}}$. The problem is further relaxed into an unconstrained optimization by adding a regularization parameter $\lambda > 0$:
$\min_{\mathbf{z},\theta_R,\theta_T} G(\mathbf{z},\theta_R,\theta_T) = \sum_{l=1}^{L} \log\left(|z_l|^2 + \delta\right) + \lambda\left\|\mathbf{Y} - \mathbf{W}^H \hat{\mathbf{H}}\right\|_F^2$  (10)
$D^{(i)} = \mathrm{diag}\!\left(\frac{1}{|\hat{z}_1^{(i)}|^2 + \delta},\ \frac{1}{|\hat{z}_2^{(i)}|^2 + \delta},\ \ldots,\ \frac{1}{|\hat{z}_L^{(i)}|^2 + \delta}\right)$  (12)

where $\hat{\mathbf{z}}^{(i)}$ is the estimate of $\mathbf{z}$ at the ith iteration.
The parameter λ controls the trade-off between the data-fitting error and sparsity. In the cyclic reweighted technique (10), λ is not fixed; it is updated in every cycle. If the previous cycle is poorly fitted, a smaller λ is chosen to make the estimate sparser, whereas a larger value is chosen to speed up the search for the best-fitting estimate. In the proposed calculation, λ is given as
$\lambda = \min\left(d / r^{(i)},\ \lambda_{\max}\right)$  (13)
More details on the updating of λ are given with (13). The iteration of the proposed algorithm begins at the angle-domain grids. The main aim is to estimate $\hat{\theta}_R^{(i+1)}$ and $\hat{\theta}_T^{(i+1)}$ in the neighborhood of the previous estimates $\hat{\theta}_R^{(i)}$ and $\hat{\theta}_T^{(i)}$ so as to further reduce the objective function $S^{(i)}$. This is done with the gradient descent method (GDM) of optimization:
$\hat{\theta}_R^{(i+1)} = \hat{\theta}_R^{(i)} - \eta \cdot \nabla_{\theta_R} S_{\mathrm{opt}}^{(i)}\left(\hat{\theta}_R^{(i)}, \hat{\theta}_T^{(i)}\right)$  (15)

$\hat{\theta}_T^{(i+1)} = \hat{\theta}_T^{(i)} - \eta \cdot \nabla_{\theta_T} S_{\mathrm{opt}}^{(i)}\left(\hat{\theta}_R^{(i)}, \hat{\theta}_T^{(i)}\right)$  (16)
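A minimal sketch of the update rule in (15)–(16) is given below. It assumes a generic callable that returns the gradients of the objective and a hand-chosen step length η; it is only an illustration of the GDM refinement step, not the authors' MATLAB implementation.

```python
import numpy as np

def refine_angles(theta_R, theta_T, grad_S, eta=1e-3, max_iter=50, tol=1e-6):
    """Gradient-descent refinement of AoA/AoD estimates, as in Eqs. (15)-(16).

    theta_R, theta_T : 1-D arrays of current receive/transmit angle estimates
    grad_S           : hypothetical callable returning (dS/dtheta_R, dS/dtheta_T)
                       for the current objective S_opt
    eta              : step length, chosen so the objective does not increase
    """
    for _ in range(max_iter):
        g_R, g_T = grad_S(theta_R, theta_T)
        new_R = theta_R - eta * g_R          # Eq. (15)
        new_T = theta_T - eta * g_T          # Eq. (16)
        # stop when the estimates no longer change appreciably
        converged = (np.max(np.abs(new_R - theta_R)) < tol and
                     np.max(np.abs(new_T - theta_T)) < tol)
        theta_R, theta_T = new_R, new_T
        if converged:
            break
    return theta_R, theta_T
```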
Depending on the gradient values, the step length η is chosen to ensure that the new estimate of the optimized objective function is less than or equal to the previous one. During the iterative search, the estimate becomes more accurate; the iteration stops when the previous estimate is the same as the new estimate. In the proposed scheme, the initial coarse on-grid estimates (θ_R, θ_T) are thus moved to their actual off-grid positions. The flowchart of the proposed algorithm for finding the AoAs and AoDs is shown in Fig. 1.

It is very important to determine the unknown sparsity level. In this scheme, the sparsity is initialized to a value greater than the real channel sparsity. With the sparsity level set higher than the real channel sparsity in the iteration-based high-resolution scheme, paths whose gains are too small are treated as channel noise rather than real paths, and the proposed algorithm prunes these paths to make the result sparser than in existing systems. During the iterations, the predicted sparsity level decreases toward the actual number of paths.

The computational complexity of each iteration of the proposed algorithm lies in the gradient calculation; the cost of computing the gradients is on the order of $\mathcal{O}(N_X N_Y (N_R + N_T) L^2)$. The total number of starting candidates $L^{(0)}$ is therefore critical, and $L^{(0)}$ should be small to keep the computation affordable. The method used to select effective initial values of $\theta_R^{(0)}$ and $\theta_T^{(0)}$ before the iteration is discussed in detail in the next section.
The angle-domain grids that are nearest to the AoDs/AoAs are identified by using this scheme. The preconditioning significantly reduces the computational burden compared with using all $N_R$ and $N_T$ grids as initial candidates. Applying the singular value decomposition to the matrix Y gives $\mathbf{Y} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$, where $\boldsymbol{\Sigma} = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_{\min(N_X, N_T)})$, whose diagonal entries $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\min(N_X, N_T)} \ge 0$ are the singular values of Y, and $\mathbf{U}^H\mathbf{U} = \mathbf{I}$. Since the noise is comparatively small, the L paths are identified from the L dominant singular values and their corresponding singular vectors, i.e., for i = 1, 2, …, L. For a uniform planar array with an $N_1 \times N_2$ receive antenna array, the set of grids is determined by $R = \{(i/N_1, j/N_2) \mid i = 0, 1, \ldots, N_1 - 1;\ j = 0, 1, \ldots, N_2 - 1\}$; $T$ is determined similarly for the transmitter.
The algorithm for SVD-based preconditioning is described in this section. Without preconditioning, the initial candidates of Fig. 1 are set to all grid values, i.e., $L^{(0)} = N_R N_T$; when $N_T$ and $N_R$ are large, the computational complexity is $\mathcal{O}(N_X N_Y (N_R + N_T) N_R^2 N_T^2)$, which is unaffordable. With the singular value decomposition-based preconditioning described in Fig. 2, the initial candidates of Fig. 1 are coarse estimates, i.e., $L^{(0)} = N_{init} \approx L$, so the computational complexity becomes $\mathcal{O}(N_X N_Y (N_R + N_T) L^2)$. The computational burden after SVD-based preconditioning is therefore much lower than that of applying the scheme directly.
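The preconditioning idea can be sketched as follows: the dominant singular vectors of Y are correlated against coarse angle grids and the best-matching grid points are kept as the $L^{(0)}$ initial candidates. The grid matrices and the matching rule below are illustrative assumptions, not the exact procedure of the paper.

```python
import numpy as np

def svd_precondition(Y, A_R_grid, A_T_grid, L0):
    """Select initial on-grid AoA/AoD candidates from the SVD of Y.

    Y         : received pilot matrix
    A_R_grid  : columns are candidate receive-side signatures (rows match rows of Y)
    A_T_grid  : columns are candidate transmit-side signatures (rows match columns of Y)
    L0        : number of initial candidates to keep (L0 ~ L)
    """
    U, s, Vh = np.linalg.svd(Y, full_matrices=False)   # Y = U diag(s) V^H
    # keep the L0 dominant singular vectors (largest singular values come first)
    U_L = U[:, :L0]
    V_L = Vh[:L0, :].conj().T
    # correlate each dominant vector with the grid, keep the closest grid index
    rx_idx = np.argmax(np.abs(A_R_grid.conj().T @ U_L), axis=0)
    tx_idx = np.argmax(np.abs(A_T_grid.conj().T @ V_L), axis=0)
    return rx_idx, tx_idx
```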
This section explains the performance metrics investigated using simulation results obtained in MATLAB. The proposed SVD-based high-resolution channel estimation scheme with hybrid precoding is compared with the existing systems. Initially, the simulation parameters for mmWave massive MIMO with precoding techniques are defined.
The path gain of the lth channel path is given as $\alpha_l$ and is assumed to be Gaussian, i.e., $\alpha \sim \mathcal{CN}(0, \sigma^2)$; the corresponding phase is uniformly distributed from 0 to 2π. The SNR is defined as the average power of the transmitted signal divided by the average noise power.
For this mmWave channel estimation scheme, the SNR becomes $\mathrm{SNR} = \sigma_\alpha^2 / \sigma_n^2$, where $\sigma_n^2$ is the noise variance. Figure 2 compares the SNR against the normalized mean square error (NMSE), under both line-of-sight (LOS) and non-line-of-sight (NLOS) channels, for the proposed high-resolution channel estimation scheme with the hybrid precoding technique and the existing spatial mismatching-based DFT method.
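For reference, the NMSE plotted in Fig. 2 is conventionally computed as $\|H - \hat H\|_F^2 / \|H\|_F^2$, averaged over Monte Carlo runs; the helper below is a sketch under that assumption, since the exact averaging used in the simulations is not stated.

```python
import numpy as np

def nmse_db(H_true, H_est):
    """Normalized mean square error between true and estimated channels, in dB."""
    err = np.linalg.norm(H_true - H_est, 'fro') ** 2
    ref = np.linalg.norm(H_true, 'fro') ** 2
    return 10.0 * np.log10(err / ref)

# Usage: average NMSE over several Monte Carlo channel realizations
# (the channel pairs would come from the simulation, not shown here)
# nmse_avg = np.mean([nmse_db(H, H_hat) for H, H_hat in runs])
```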
The SNR value is varied from −5 to 10 dB. In the figure, the red and blue lines indicate the hybrid precoding method and the DFT spatial mismatching technique, respectively. The Rician K-factor is set to 20.
In both cases, the proposed channel estimation scheme outperforms the existing method, and its normalized mean square error is comparatively lower. This result is achieved by considering a uniform planar array in the proposed scheme: a 64-antenna uniform planar array with 8 rows and 8 columns is used at both the transmitter and the receiver, and the azimuth and elevation angles are estimated. The estimated values of the proposed and existing channel estimation schemes for the massive MIMO system over mmWave communication can thus be compared. Table 1 gives the SNR and NMSE values of the proposed and existing schemes, and shows that the NMSE statistics of the proposed scheme are comparatively lower than those of the existing system (Table 2).
The squared residue value is defined as the sum of the squares of the residuals, i.e., the deviations of the predicted values from the actual empirical data. The squared residue error values are measured for 1, 10, 20, 30, 40 and 50 samples. The SNR versus average squared residue error for different numbers of samples, for the existing method (spatial mismatching using DFT) and the proposed method (high resolution with hybrid precoding), is shown in Figs. 3 and 4.
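A small sketch of the squared-residue metric as defined above; the exact residual definition behind Figs. 3 and 4 is not spelled out in the text, so this is an assumption.

```python
import numpy as np

def squared_residue(actual, predicted):
    """Sum of squared residuals between empirical and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sum((actual - predicted) ** 2))
```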
The ideal channel state information (CSI) of both the existing and the proposed systems is used to find the estimation errors of the azimuth and elevation angles of the massive MIMO mmWave channels. The average residue errors are estimated to obtain the AoAs and AoDs of the channels. Table 3 shows the values for samples 1, 10, 20, 30, 40 and 50 for both the existing and the proposed schemes.
From Table 3, as the number of samples increases, the average residual value decreases for the proposed scheme. At 50 samples, the existing system gives a value of 32.47, whereas the proposed high-resolution channel estimation scheme gives only 30.04; the average residue error is thus decreased. This leads to lower computational complexity and a CSI estimate suitable for reliable communication over mmWave MIMO channels. The proposed estimation scheme achieves better channel accuracy than the existing system.
Fig. 3 Comparison of SNR versus squared residue of samples of DFT-based spatial mismatching
channel estimation schemes
Table 3  Comparison of average residue values (error values) of the proposed work with existing schemes

Iteration samples    Spatial mismatching DFT    High resolution with hybrid precoding
Sample 1             32.75                      34.32
Samples 10           32.65                      33.22
Samples 20           31.93                      34.36
Samples 30           32.15                      32.43
Samples 40           32.59                      31.39
Samples 50           32.47                      30.04
5 Conclusion
References
12. Zhu D, Choi J, Heath RW (2017) Auxiliary beam pair enabled AoD and AoA estimation
in closed-loop large-scale millimeter-wave MIMO systems. IEEE Trans Wireless Commun
16(7):4770–4785
13. Lee J, Gil GT, Lee YH (2016) Channel estimation via orthogonal matching pursuit for hybrid
MIMO systems in millimeter wave communications. IEEE Trans Commun 64(6):2370–2386
14. Marzi Z, Ramasamy D, Madhow U (2016) Compressive channel estimation and tracking for
large arrays in mm-wave picocells. IEEE J Sel Top Signal Process 10(3):514–527
15. Fang J, Wang F, Shen Y, Li H, Blum RS (2016) Super-resolution compressed sensing for line
spectral estimation: an iterative reweighted approach. IEEE Trans Sig Process 64(18):4649–
4662
Responsible Data Sharing in the Digital
Economy: Big Data Governance
Adoption in Bancassurance
1 Introduction
shared are imminent. Furthermore, consumers can perform many financial transac-
tions using the digital platforms offered on mobile devices; and can use social media
platforms for escalations or compliments.
One of the focus areas in the banking industry in South Africa is a drive to
identify and understand the characteristics of their customers and their needs with the
main objective of pre-empting customer needs. Due to the fast pace of information
flow by consumers, the banks have expanded this notion with predictive analysis
focusing on customer behavior, demographics, financial status, social standing and
health. This analysis has been based on large volumes of data collated from customer
profiles sourced from various data sources over an extensive period. For example,
unstructured customer data is collected from voice recordings, images and identity
documents (to name a few). This is combined with structured data, which refers
to data generated by enterprise systems used in operational and executive decision
making in this instance. The output of this is usually in the form of reports. Strict rules,
regulations, standards and policies are required to govern these big datasets to protect
the customer, ensure the streamlining of reports to provide accurate information
fostering decision making as well as conceptualization of innovations [3].
To ensure accountability in decisions over the organization’s data assets, [8] refers
to data governance as a business program that should be enforced in the entire orga-
nization. The research further suggests that effective data governance programs are
key to ensuring that processes are correctly followed when managing data assets,
similar to financial controls when monitoring and controlling financial transactions.
In a bancassurance model, a company uses the existing channels and customer
base to offer insurance services. Figure 1 depicts how bancassurance relates to big
data, what typical data is shared between the bank and insurance organization and
the need for governance.
Figure 1 starts by highlighting the different types of big data as well as the various
decision domains in data governance. Big data is then classified according to its
features, namely the 5 Vs (volume, variety, velocity, veracity and value), while the
scope of data governance is defined by the business model, the stakeholders, content
and federation. The scope of data governance and the data governance programs will
be discussed later.
The model highlights the different types of structured and unstructured data found
in each of the bancassurance entities. A decision is further required to determine
which of the different types of big data will need to be shared across entities. This
includes, but is not limited to, mostly structured data (e.g., customer details like name,
age, gender, marital status, race, source of funds, residential address, banking patterns
(deposits and withdrawals), credit bureau risk indicator, insurance products sold,
banking core product held, customer personal identification number, bank account
number, bank balance, customer segment, bank account status, policy activation
status) and may sometimes include unstructured data such as FICA documents, which
include a copy of the national identity document, vehicle license document, proof
of address, company registration documents.
Fig. 1 A bancassurance model depicting big data sharing and data governance

In the context of big data, in terms of volume, as of December 2016 Bank A had 11.8 million active customers. A competitor bank showed that they were getting
on average 120,000 new customers per month in 2016. In terms of velocity, about 300,000 large transactions on average per minute are processed in the evening (7 pm) and about 100,000 transactions per minute are processed at midday (12 pm). Multiple data sources cause variation in the data, especially in the insurance entity (the big data characteristic of variety). Data sources are multichannel within customer interactions, including call (voice) recordings from contact centers, digital interactions via online channels (including social media), voice recordings at branch sites, image scans of customer personal data via mobile devices, video recordings of car/building inspections uploaded via mobile devices and textual conversations using USSD technology. Veracity has been addressed by the bank using various tools to determine the data quality of its customer base. The FAIS and Banks Act highlight the importance for insurance and banking organizations of ensuring that they present the right products to the right customer according to the identified need. This process is carried out by business compliance officers, risk operations officers as well as data stewards.
Data monetization is evident through data analytics. According to [4], the value characteristic of big data is evident through the utilization of analytics to foster decision making. As a result, using data analytics, decisions can pre-empt
the next move organizations should take to obtain or retain a competitive advantage.
The more analytics the organization applies to data, the more it fits the definition of
big data.
In his book, [8] remarks that three components affect the scope of data governance
(DG) programs, namely: (1) understanding the business model of your organization,
the corporate structure and its operating environment; (2) understanding the content
to be controlled such as data, information, documents, emails, contacts and media
and where it adds the most business value; and lastly, (3) the intent, level and extent
to which the content is monitored and controlled.
The business model of the bancassurance organization is the partnership of an
insurance organization selling their products using the bank’s distribution channel.
The nature of this business model is that the organization has different business units which do not share common information. It is in this instance that [8] suggests data governance programs be implemented per business unit. The insurer and the bank hold a significant number of customers shared between the two entities. In isolation, each organization only has a partial customer view and understanding, placing it at a disadvantage in understanding and serving its clients' needs. The two organizations hold data on their respective IT platforms, and differences in the product and customer numbers for shared customers are noted.
Content control: Bancassurance as a role player in the financial services industry is
highly regulated by national as well as international banking industry laws, insurance
laws as well as financial services legislation and regulations such as the Financial
Intelligence Centre Act 38 of 2001 (FICA), the Financial Advisory and Intermediary
Services Act (FAIS), Anti-Money Laundering, Code of Ethics, Code of Conduct,
Protection of Personal Information and National Credit Regulator (NCR) (to name a
few). As such, structured data (such as financial transactions and premium payments),
as well as unstructured data (such as emails and insurance policies), may require
governance that would focus on identifying the data used and how it should be
used. Furthermore, archival and protection of personal information should also be
considered.
Federation: Federation is a process that involves “defining an entity (the data governance program) that is a distinct blend of governance functions where the various aspects of data governance interact with the organization” [8]. As a result, a clear understanding of the extent to which various governance standards will be applied and implemented across the business units, divisions and departments within an organization is important. The governance processes applied and implemented must be driven by the same organizational motives and involve the same processes.
A case study approach was adopted to investigate how data governance was instilled
during a big data implementation project. A case study is a heuristic investigation
of a phenomenon within a real-life context to understand its intricacies [9]. In this
instance, a qualitative research approach was adopted using data collected from
interviews, heuristic observations and secondary documentation to create an in-depth
narrative describing how data governance was considered during the big data project.
3.1 Interviews
Due to the nature of bancassurance organizations (the bank and the insurer), different teams manage big data and, as such, different policies were implemented. As a result, the governance of big data is a joint responsibility, largely dependent on the stakeholders working together toward a common decision-making process and common decision-making standards on big data-related matters. Clearly defined roles and responsibilities are also needed for such decisions [10].
To determine which policies and governance standards were implemented in
different teams and the logic used, interviews were conducted with various role
players with varying data governance responsibilities, as well as interdepartmental
teams working with big data such as enterprise information management, business
intelligence, customer insights and analytics.
A convenience sampling method was adopted as the participants were selected
based on stakeholders’ availability, geographic location and reachability, as well as
the geographical location of the researcher. All the selected participants, as well as
the researcher, were based in Johannesburg, South Africa. Johannesburg is part of
Gauteng, which is one of nine provinces in South Africa and the largest by population
with the biggest economic contribution.
The participants have been working in their respective roles for a minimum of
two years (junior role) and five years in a senior role. They were, therefore, all well
informed on the topic of data governance. Based on the evidence of current data
governance implementation in the organization under study (governance structures
implemented), the data governance maturity was classified as in its infancy. A total
of eight in-depth semi-structured interviews were conducted: seven interviews in
person which was recorded and one using email. Table 1 contains a summary of
the positions of the various research participants as well as job level, the level on
which the participants focus with regard to governance-related matters as well as an
indication of their involvement in the bank or insurance-related business units.
In-depth, face-to-face interviews were conducted using a predefined interview
template. Twenty-five interview questions were grouped into three main focus areas
identified during an extensive systematic literature review on the topic of data
governance. The focus areas are described as part of the case study section.
In instances where the response from participants was not clear, second and third
rounds of interviews were conducted. Each interview was scheduled for one hour and
the research participants were invited to the interview via a detailed email which
provided context and background to the request for their time and why they were
identified as the ideal candidates for this research.
a sheet with comments from observations. Grove and Fisk [12] refer to this observation method as a structured observation, more positivist than interpretivist in philosophy. Due to the limited time frame in which to conclude the study, only a few in-person (direct) live observations were made, based on the various data governance framework themes identified by the researcher.
Organizational interdepartmental teams frequently attended scheduled forums where all issues related to big data, including governance, were discussed. The outputs and decisions taken at these forums were implemented over time and as such override any other decisions taken within the business units. The researcher attended three of these meetings to observe the group dynamics as well as to get feedback on data governance-related topics.
3.3 Documents
The data obtained from participant interviews and observations were supplemented
by various secondary documentation, including but not limited to: documented
outputs from the respective discussion forums and divisional meetings focusing on
data governance (minutes of the meetings as well as supporting documentation); data
architecture documents; data flow diagrams; visualization and information architec-
ture documentation as well as conceptual design documents for data-related concepts
(including data storage); consumer and data engineering training plans (for big data);
business area enrollment form requesting to become a data consumer; principles and
best practices (for data asset management and data scrubbing, data governance struc-
tures) as well as metadata management (for data tokenization); progress meetings
of data artifact development teams. Thematic analysis was used to evaluate all the
documentation, in particular, the transcriptions of the interviews. Atlas.ti was used
to substantiate the manual coding process.
4 Case Study
storage structures) [6, 10, 13–15]; (2) data governance elements focusing on data
quality (which refer to data dictionaries and data libraries, metadata and processes
followed to ensure data quality) [5, 10, 14–17]; and (3) the adoption of big data
governance guidelines and frameworks (pertaining to the data lifecycle, and safe-
guarding of data, including during interoperability procedures that will maximize
the big data value proposition) [5, 8, 13, 15–19]. These three focus areas were used
to describe the findings of the research.
At the executive level, the majority of participants in both the banking and insurance
functional levels indicated that they were aware of big data governance structures that
were in place. Linked to the organizational structure, one executive mentioned that
he was unsure if the existing governance policies were specific to the handling of big
data or data in general. On a lower functional level, engineers were not aware of data
governance policies, while one out of three senior managers shared a similar view:
“yes, having a group-wide forum where having a set data governance standards. A
framework have been put together on how our data is governed as an organization.
As a business unit, what the rest of the group are doing is being adopted. This
was drafted as a document after functional business unit inputs and shared as the
organizational standard”.
Organizational data governance structures: Executives confirmed that they were
aware of the current group data governance structure. This could be attributed to the
seniority level they have in the organization which is privy to the group’s structures.
Senior managers shared the same view, adding that “there is an executive data office
(EDO) which governs the usage of data”. Another senior manager added that an
enterprise data committee (EDC) was formed: “individual business units have their
own data forums and own data committee. This is mandated from the EDC. What-
ever have been agreed upon can be taken up to EDC for noting”. Importantly data
stewards, acting as business unit request coordinators and business unit represen-
tatives, play an integral part in the data committees. Interestingly, junior interview
participants were not aware of these structures.
Key data stakeholders: Executive level interview participants highlighted that a lot
of interest has been shown lately in data-related projects and as a result, many partic-
ipants volunteered to become part of the big data project. Unfortunately, these participants were not necessarily skilled in the area of data science. However, various data-related roles existed within the organi-
zation namely data analysts, information management analysts, domain managers,
data stewards, provincial data managers and data designers. Executive participants
alluded that “…specific roles are critical in driving this contract [Big data project],
e.g., data engineers and data stewards in the information management space”. Apart
from these resources, business stakeholders were also included in data governance
forums and of vital importance: “…the owner of the data is the one who signs off
on the domain of the data. A business will thus be the owner in that domain of that
data. For example, head of the card division will be the domain holder of credit card
data”. In contrast, one executive claimed that all participants were data practitioners
(and not business stakeholders).
All senior managers, as well as the middle manager and the junior resource, agreed
that stakeholders with specific roles were invited to the data governance forums.
Although these roles were involved in the forums, resources fulfilling these roles
were not sure of what was expected of them. Other roles that were included in the
forums were “support casts” (as postulated by [20]) and include IT representatives,
compliance officers, IT security and business analysts.
Data storage: Participants acknowledged the importance of the ability of current technologies, as part of the IT infrastructure, to cater for organizational data storage and easy retrieval. Importantly, one of the executives mentioned that regulatory and compliance data storage requirements might differ from operational, business unit requirements: “The analytics of financial markets comes in quick and fast and this requires a reliable storage system… From an analytics perspective, you would need as much of historic data as possible, way beyond the five years”. The current ability of organizational data storage was debated among research participants. One advised that, although the current technologies do cater to the storage needs of the organization, they can be improved. Other participants indicated that the storage facilities and infrastructure were adequate as they adhered to regulatory and compliance prescriptions. However, value can still be derived from stored data that is not necessarily required to meet regulatory requirements; this can be meaningful in the analysis of current customer retention. It was also felt that the organization should focus more on the processes prescribed to store data and not necessarily on the data storage technology itself. Junior resources indicated their frustration with delays when requesting data, mainly attributed to processes.
All interview participants believed that a data dictionary for each business unit is of
vital importance. This was attributed to the nature of the bancassurance organization
(the difference between the meaning of data entities). Data dictionaries were “living
documents” authored by data stewards and referred to as an “information glossary”. A
data reference tool was being created at the insurance group level to assist the business in updating the glossary, but it was still under development. In particular, it will assist with the need to include data source-to-target mapping requirements.
An important feature of Bank A's current big data implementation strategy is the focus on metadata. Metadata existed for structured and (particularly) semi-structured data in different toolsets. Despite its availability, additional meaning was not derived from data entities. For example, referring to a personal identity number, “knowingly that it's an id number but if they derive age and date of birth … that level of maturity is not there yet. It's on the cards”. No metadata existed for unstructured data. The senior managers all concurred that there was metadata available for the different types of data content at both group and business unit level. The biggest concern was that the metadata was not frequently updated. The junior resource mentioned that they had not been exposed to unstructured metadata and as such believed it did not exist. The resource suggested that this could be due to the large volume of insurance data.
On the topic of big data quality, executives mentioned that a data quality process
exists that ensures data quality throughout the data lifecycle. One of the executives
added that “there is no one size fits all”, referring to the standard data quality process
across business functions, but that measures or weightings applied to data elements
might differ. Research participants did not concur when asked about data quality. Although raw data was thoroughly checked during the data generation phase (during customer interaction), not enough data checks were performed after the acquisition
phase. However, junior resources who actively engaged with the data felt that the
data quality checks performed during the data generation phase were insufficient as
they had to frequently perform data cleanup tasks.
Research participants agreed on the existence and enforcement of data checks to
ensure data accuracy and completeness. The process has been seen as “fail-proof
and well executed”. For example, a clear set of criteria was identified against which
datasets were checked, including data models. Tools and infrastructure technologies
from the IT partners have been employed to assist with third-party data sources.
Additional data checks are performed whereby personal data (e.g., national identity numbers) are verified against the central government database and vehicle details are verified against the national vehicle asset register (NaTIS system). Trend analysis is also used to perform data validity checks. For example, if funeral sales were 100 with a typical 10% monthly increase, a sudden monthly increase of 40%, with no change in business strategy or business processes, causes the dataset to be flagged.
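The trend-based validity check described above amounts to a simple rule; the sketch below is an illustrative reconstruction with assumed threshold names and values, not the bank's actual implementation.

```python
def flag_unusual_growth(previous, current, expected_growth=0.10, tolerance=0.20):
    """Flag a dataset whose month-on-month growth deviates far from the norm.

    previous, current : sales counts for consecutive months
    expected_growth   : typical monthly growth rate (e.g. 10%)
    tolerance         : allowed deviation before the dataset is flagged (assumed value)
    """
    growth = (current - previous) / previous
    return abs(growth - expected_growth) > tolerance

# Example: funeral sales jump from 100 to 140 (40% growth against a usual 10%)
print(flag_unusual_growth(100, 140))  # True -> dataset flagged for review
```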
Communication and data management: To manage changes to datasets, the
respondents highlighted that various tools were used to communicate changes and the
introduction of new datasets, mainly by data stewards. Only data owners are included
in the communication process at various stages of the data lifecycle management on
an event management basis. Should there be a need, data stewards will escalate data
related to higher levels of governance such as the information governance forum.
Not surprisingly, research participants felt that the data management process is far from perfect, as it is reactive in nature: communication is effected only when there is an impact on the business or on systems.
Big data policies and standards (audit checks): The organization is forced to apply
strict auditing checks as prescribed by industry compliance regulations. As a result,
respondents indicated that they have employed “…. an information risk partner,
compliance and risk and legal. Depends on how frequently the data is used, they’ll
either audit you every quarter or once a year. There’s also internal and external
audit initiatives”. Another executive, in charge of data analytics, mentioned frequent,
“periodic deep sticks” into sample datasets. Furthermore, it was highlighted that Bank A also leverages the data supporting cast, such as IT security, to run risk checks on it.
Apart from infrequent, ad hoc data checks, the compliance criteria were programmed
into reporting artifacts such as dashboards. An interesting finding was that most data
practitioners were aware of the policies and standards, but business stakeholders
lacked knowledge on the topic.
Focusing on the issue of big data privacy and security, the majority of executives
explained there were adequate measures to guard against unauthorized access to
data. The general notion was that the financial services industry was very mature
when dealing with data due to strong regulatory prescriptions of data being handled
in real time, near real time and batches. One of the three executives, however, high-
lighted that although the measures were in place and validated in conjunction with
IT partners, he was not sure that measures were sufficient.
One of the senior managers questioned the adequacy of access to internal data
as well as the access to bank data available to third parties. According to the senior
manager, internal data access is adequate. However, third parties have been lacking
adherence to safeguarding principles set out to safeguard the bank’s data against
unauthorized access. The rest of the senior managers concurred that measures were
in place and maintained in the entire organization; however, the competence and capa-
bility of those measures are sometimes inadequate. Junior staff members supported
this viewpoint and elaborated that measures are prone to human discretion to grant
access to data. As a result, predefined quality checkpoints can be ignored. The middle manager felt that, at the localized (insurance) level, adequate measures were communicated to external parties who request access to data.
Big data interoperability: Executives indicated that terms of references have been
agreed at the Bank’s group architecture level when sharing data. Predefined methods
and procedures exist for the sharing of data to ensure data integrity during the move-
ment process. One of the executives, however, sounded a note of caution, highlighting that even though data is securely transported between landing areas, the integrity of the data is sometimes compromised between the landing areas.
Research participants were confident that data security and integrity measures
were successfully employed when sharing data. Training was provided to data users
as well as the senders of the data. Software security tools were approved by group
information security to ensure that data was not compromised during live streaming.
In addition, “failsafe” methods are currently being developed. Apart from this, additional sign-off procedures were employed at each stage of data movement, which ensures integrity and safe transportation. This can also be credited to the source-to-target mapping exercise that sets a baseline on what to expect from the source data as well as thereafter.
Big data analytics access: Only one of the executives mentioned that their role did not require access to data analytics. The other executives, as well as the senior managers, indicated that data analytics was important to them and that they therefore drive business unit level data analytics interventions. The junior and middle managers confirmed that they have access to data analytics and employed data mining and analytical tools such as SAS. Data analytics was used to predict customer behavior, based on a large historical dataset, and to identify fraudulent activities. Prescriptive analytics was still in its infancy. An example of prescriptive analytics in short-term insurance would be Google's driverless car: knowing that there will be no driver as part of the underwriting questions, what would be the next course of action? A number of algorithms provide input to Google's self-driving car to determine the next course of action it needs to take, i.e., take a short left turn, slow down for a sharp curve ahead, pick up speed when going up the mountain, etc.
The general look and feel of data visualization artifacts is governed by a data visualization committee. This committee provides guidance and standard practice for the design of dashboards; the visualization tools to be used, as well as who needs to use them, are discussed at the guilds.
Scope of big data: Currently, the scope of big data intervention projects is clearly
defined. Senior-level research participants remarked that there is no need for data
governance in instances where the business unit can attend to their big data request
locally and supported by the data steward and IT. Only in instances where there is
“inter-sharing of data including from 3rd parties, then the Governance process will
kick in”. For enterprise-level big data projects, the scope of the projects was defined
at the executive data office level.
Business value: Research participants indicated that the value of (correct) data is
measured as an information asset. Subsequently, it is important to understand the
objective of data analysis. For example, does the organization want to increase
revenue considering existing customer trends, or saving costs by predicting fraud-
ulent activities? Quality data drives meaningful decisions. One of the executives
mentioned that, by looking at the output of the data, a four-dimensional matrix comes into play: “(1) I'm expecting an outcome and there's an outcome (2) I'm not expecting an
outcome but there’s an outcome (3) I’m expecting an outcome but I’m seeing some-
thing different (4) I’m not expecting an outcome and there is no outcome. …Mea-
suring the value of data looks at the quantum of the opportunity that sits on the
data.” Another executive highlighted that measuring the value of data is done through the prioritization of initiatives within the business or the organization: “to ensure that it
works on the most impactful drivers of the organization is needed”.
An academic literature review on the topic of data governance and big data high-
lighted three main data governance focus areas that should be considered in the
implementation of big data projects. These three focus areas were used in an in-
depth case study to identify the data governance elements that should be considered
in bancassurance organizations when implementing big data projects.
Focus area one, in general, highlighted the fact that the current data governance structures in the bancassurance organization under study did not cater to big data interventions per se but to data in general. It therefore seems as if some unique elements of the planned big data intervention might be missed. The research in [16] indicated that big data interventions might need special treatment and definitional clarification.
A lack thereof can have a huge effect on focus area three, the value proposition of
big data implementations. Data governance specifications also need to cater for the
unique characteristics of big data such as volume, velocity and variety.
Focus area two indicated that formal education and training should be included as
a formal structure or at the very least a part of the communication decision domain
within the big data governance structures. This is because business stakeholders
required to attend the data governance structures are either new to the business, new
to the role or simply not aware of the importance of having big data governance
structures in place. Education and training on big data via “big data master classes”
and “insurance data sharing” held by the bancassurance organization under study are a stepping
stone toward bringing awareness to every stakeholder working with data and their
role in the decision making of the data asset. The importance of the clarification
of different data governance roles and responsibilities and subsequent educational
background was highlighted by extensive research by Seiner [20].
The researcher also noted that most big data governance structures have been adopted for structured data but not for unstructured data. Metadata for unstructured data is nonexistent and, as such, the management of unstructured data is pushed over to another guild referred to as “records management”. It is also difficult to apply big data quality processes to unstructured data because of its nature. Thus, a lot more work will be needed to establish the standardized processes required to govern unstructured data. Governance structures ensuring the data quality of current structured and semi-structured data were well enforced and adequate; the need for quality assurance of unstructured datasets remained.
The researcher faced some limitations in this study, as the content dealt with is highly classified and several governance processes had to be followed to obtain it. At some point, only one contact at the executive level was available to verify the accuracy of the data obtained. The availability of research participants was also limited as they are based in different buildings; as such, non-face-to-face meetings were held with most of them in the interest of time.
Finally, research area three highlighted adequate data interoperability governance
structures. Although research participants took cognizance of the value of big data,
no evidence of such calculations on group level (both banking and insurance) could
be found.
An unintentional finding of the research was the reasons for the failure of data governance discussion forums. It would be interesting to investigate this matter in future research work.
References
1. Data Management Association (2009) The DAMA guide to the data management body of
knowledge (DMA—DMBok Guide). Technics Publications
2. Rowley J (2007) The wisdom hierarchy: Representations of the DIKW hierarchy. J Inf Sci
33:163–180
3. Munshi UM (2018) Data science landscape: tracking the ecosystem. In: Data science landscape:
towards research standards and protocols. Springer, Singapore
4. Chen M, Mao S, Liu Y (2014) Big data: a survey. Mob Netw Appl 19:171–209. https://doi.
org/10.1007/s11036-013-0489-0
5. Ghasemaghaei M, Calic G (2019) Can big data improve firm decision quality? The role of data
quality and data diagnosticity. Decis Support Syst 120:38–49
6. Al-Badi A, Tarhini A, Khan AI (2018) Exploring big data governance frameworks. Procedia
Comput Sci 141:271–277. https://doi.org/10.1016/j.procs.2018.10.181
7. Elkington W (1993) Bancassurance. Chart Build Soc Inst J
8. Ladley J (2012) Data governance: how to design, deploy, and sustain an effective data
governance program. Morgan Kaufmann, Elsevier
9. Yin RK (2014) Case study research design and methods. Sage
10. Kuiler EW (2018) Data governance. In: Schintler LA, McNeely CL (eds) Encyclopedia of
big data, pp 1–4. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-319-
32001-4_306-1
11. Ice GH (2004) Technological advances in observational data collection: the advantages and
limitations of computer-assisted data collection. Field Methods 16:352–375
12. Grove SJ, Fisk RP (1992) Observational data collection methods for service marketing: an
overview. J Acad Mark Sci 20:217–224
13. Soares S (2014) Data governance tools: evaluation criteria, big data governance, and alignment
with enterprise data management. MC Press online
14. Mei GX, Ping CJ (2015) Design and implementation of distributed data collection management
platform. Presented at the 2015 international conference on computational intelligence and
communication networks, Jabalpur, India, 12 Dec 2015
15. Ballard C, Compert C, Jesionowski T, Milman I, Plants B, Rosen B, Smith H (2014) Information
governance principles and practices for a big data landscape. RedBooks
16. Al-Badi A, Tarhini A, Khan AI (2018) Exploring big data governance frameworks. In: Procedia
computer science, pp 271–277
17. Khatri V, Brown CV (2010) Designing data governance. Commun. ACM 53:148–152
18. Almutairi A, Alruwaili A (2012) Security in database systems. Glob J Comput Sci Technol
Netw Web Secur 12:9–13
19. Davenport TH, Dyche J (2013) Big data in big companies
20. Seiner R (2014) Non-invasive data governance: the path of least resistance and greatest success.
Technics Publications, USA
A Contextual Model for Information
Extraction in Resume Analytics Using
NLP’s Spacy
Abstract The unstructured document like resume will have different file formats
(pdf, txt, doc, etc.), and also, there is a lot of ambiguity and variability in the language
used in the resume. Such heterogeneity makes the extraction of useful information a
challenging task. It gives rise to the urgent need for understanding the context in which
words occur. This article proposes a machine learning approach to phrase matching in
resumes, focusing on the extraction of special skills using spaCy, an advanced natural
language processing (NLP) library. It can analyze and extract detailed information
from resumes like a human recruiter. It keeps a count of the phrases while parsing
to categorize persons based on their expertise. The decision-making process can be accelerated through data visualization using matplotlib, and a relative comparison of candidates can be made to filter them.
1 Introduction
While the Internet has taken up the most significant part of everyday life, finding
jobs or employees on the Internet has become a crucial task for job seekers and
employers. It is vastly time-consuming to store millions of candidate’s resumes in
the unstructured format in relational databases and requires considerably a large
extent of human effort. In contrast, a computer which parses candidate resumes has
to be constantly trained and adapt itself to deal with the continuous expressivity of
human language.
Resumes may be represented with different file formats (pdf, txt, doc, etc.) and
also with different layouts and contents. Such diversity makes the extraction of useful
information a challenging task. The recruitment team puts a lot of time and effort in
parsing resumes and pulling out the relevant data. Once it is extracted, matching of
resumes to the job descriptions is carried out appropriately.
This work proposes a machine learning approach to phrase matching in resumes,
focusing on the extraction of special skills using spaCy [1], an advanced natural
language processing (NLP) library. It can analyze and extract detailed information
from resumes like a human recruiter. It keeps a count of the phrases while parsing to categorize persons based on their expertise. This improves recruiters' productivity, reduces the time spent in the overall candidate selection process, and improves the quality of the selected candidates.
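A minimal sketch of the phrase-matching idea using spaCy's PhraseMatcher is shown below; the skill dictionary and category names are illustrative placeholders, not the authors' actual keyword lists.

```python
import spacy
from spacy.matcher import PhraseMatcher
from collections import Counter

nlp = spacy.load("en_core_web_sm")
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")  # case-insensitive matching

# Hypothetical skill dictionary: category -> keyword phrases
skills = {
    "NLP": ["natural language processing", "spacy", "nltk"],
    "ML": ["machine learning", "scikit-learn", "regression"],
    "Data Viz": ["matplotlib", "data visualization"],
}
for category, phrases in skills.items():
    matcher.add(category, [nlp.make_doc(p) for p in phrases])

def skill_profile(resume_text):
    """Return a count of matched skill phrases per category for one resume."""
    doc = nlp(resume_text)
    counts = Counter()
    for match_id, start, end in matcher(doc):
        counts[nlp.vocab.strings[match_id]] += 1
    return counts

print(skill_profile("Worked on machine learning pipelines and spaCy NER models."))
```

The per-category counts can then be compared across candidates or fed to matplotlib for the visual comparison mentioned above.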
The rest of the paper is organized as follows: Section 2 highlights the literature
survey, Sect. 3 presents the proposed work of extracting relevant information from the
unstructured documents like resumes, Sect. 4 discusses the implementation and
the obtained output, and Sect. 5 concludes the work with scope for future research.
2 Literature Survey
This section summarizes the contributions made by various researchers toward the
extraction of relevant information for resume parsing.
Sunil Kumar introduced a system for automatic information extraction from
the resumes. Techniques for pattern matching and natural language processing are
described to extract relevant information. Experimental results have shown that the
system can handle different formats of resumes with a precision of 91% and a recall
of 88% [2].
Papiya Das et al. proposed an approach to extract entities to get valuable informa-
tion. R language with natural language processing (NLP) is used for the extraction
of entities. In this paper, the authors briefly discussed the process of text analysis and the extraction of entities using different big data tools [3].
Jing Jiang worked on information extraction and described two fundamental tasks: named entity recognition (NER) and relation extraction. The NER concept is to find the names of entities, for instance, people, locations, and organizations; a named entity is often context dependent. Relation extraction aims at finding semantic relations among entities [4].
Harry Hassan et al. introduced an unsupervised information extraction framework based on mutual reinforcement in graphs. This framework is mainly used to acquire extraction patterns for content extraction, relation detection and the subsequent characterization task, which is one of the difficult tasks in the information extraction process due to inconsistencies in the available data and the absence of large amounts of training data. This approach achieved greater performance when compared with supervised techniques with reasonable training data [5].
3 Methodology
NLP is an artificial intelligence (AI) technique which allows the software to under-
stand human language whether spoken or written. The resume parser works on the
keywords, formats, and pattern matching of the resume. Hence, resume parsing software uses NLP to analyze and extract detailed information from resumes like a human recruiter.
The raw data needs to be preprocessed by the NLP pipeline before the subsequent data mining algorithm processes it. The NLP process involves various sub-tasks such as tokenization of the raw text, part-of-speech tagging, and named entity recognition.
Tokenization: In this process, the text is first tokenized into small individual tokens such as words and punctuation marks. This is done by implementing rules specific to each language. Based on a specified pattern, the strings are broken into tokens using regular expressions. The pattern used in this work (r'\w+') drops punctuation during data processing, and a .lower() call inside a lambda function can be used to convert the text to lowercase.
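A short sketch of this tokenization step, combining the r'\w+' pattern with lowercasing inside a lambda; the pandas usage and sample resume snippets are assumptions based on the hints in the text.

```python
import re
import pandas as pd

# Hypothetical resume snippets held in a pandas Series
resumes = pd.Series(["Python, spaCy & NLP!", "Data Visualization with Matplotlib."])

# Lowercase via a lambda, then keep only word characters to drop punctuation
tokens = resumes.apply(lambda text: re.findall(r"\w+", text.lower()))
print(tokens.tolist())
# [['python', 'spacy', 'nlp'], ['data', 'visualization', 'with', 'matplotlib']]
```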
Stopword Removal: The stopwords are a group of frequently used words in a language. English, for example, has several stopwords such as “the”, “a”, “is” and “are”. The idea behind removing these low-information words from the text is that it allows the focus to shift to the more important words. spaCy has an inbuilt stopword list. Depending on the context, the list can be modified: in sentiment analysis, for example, the words “not” and “no” are important to the meaning of a text such as “not good”.
Stemming and Lemmatizing: Both stemming and lemmatization shorten words in the text to their root form. Stemming is the process of reducing or removing the inflection in words to obtain their root form (for instance, performs/performed to perform). In this case, the "root" might not be a true root word but simply a canonical form of the original word. It strips affixes from words based on common patterns. In some cases this is helpful, but not always, as the resulting word may lose its actual meaning. Lemmatization is the process of converting a word into its base form, for example, "caring" to "care". spaCy's lemmatizer has been used to obtain the lemma (base) form of the words. Unlike stemming, it returns a proper word that can be found in the dictionary.
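A minimal sketch of stopword removal and lemmatization with spaCy, assuming the small English model (en_core_web_sm) is installed; the sample sentence is illustrative.

```python
import spacy

# Assumes: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("He is caring for servers and performed NLP tasks")

# Drop stopwords and punctuation, keep the lemma (base form) of each token.
lemmas = [tok.lemma_ for tok in doc if not tok.is_stop and not tok.is_punct]
print(lemmas)   # e.g. ['care', 'server', 'perform', 'NLP', 'task']
```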
Part-of-speech tags and dependencies: After tokenization, spaCy parses the given document and assigns tags to it. At this point, statistical models are used, which enable spaCy to predict the label or tag that most likely applies in the context. A model consists of binary data produced by showing the system enough examples for it to make predictions that generalize across the language—for example, a word following "the" in English is most often a noun.
Named entity recognition (NER): NER is possibly the first step in the information
extraction; it identifies and classifies the named entities in the document into a set
of pre-defined categories like the person names, expressions of times, locations,
organizations, monetary values, quantities, and percentages [11]. The more accurate the recognition of named entities in this preprocessing step, the more information on relations and events can be extracted. There are two types of NER approaches: a rule-based approach and a statistical approach [4], and a combination of both (hybrid NER) has also been introduced. The hybrid approach provides better results than relying only on the rule-based method in recognizing names [12, 13].
The rule-based approach defines a set of rules that determines the occurrence
of an entity with its classification. To represent the cluster of relatively independent
categories, ontologies are also used. These systems are most useful for the specialized
entities and their categories, which have a fixed number of members. The quality of
the rules determines performance.
Statistical models use supervised learning that is built on very large training sets
of data in the classification process. Algorithms use real-world data to apply rules;
rules can be learnt and modified. The process of learning could be accomplished
through a fully supervised, unsupervised, or semi-supervised manner [14, 15].
There is a lot of ambiguity and variability in the language used in the resume.
Someone’s name can be an organization name (e.g., Robert Bosch) or can be an
IT skill (e.g., Gensim). Such heterogeneity makes the extraction of useful informa-
tion a challenging task. It gives rise to the need to understand the context in which the words occur.
Semantics and context play a vital role while analyzing the relationship between
objects or entities. The most difficult task for unstructured data is extracting the rele-
vant data because of its complexity and quality. Hence, semantically and contextually
rich information extraction (IE) tools can increase the robustness of unstructured data
IE.
The Problem: In many scenarios, running a CV parser over a person's resume and looking for data analytical skills will help you find candidates with knowledge of data science. The parsing fails when the search is more specific, for example, when looking for a Python developer who is good at server-side programming, has good NLP knowledge, and works in a particular development environment to develop software systems in the healthcare domain. This is because the parsing of job descriptions and resumes does not yield quality data from unstructured information.
The Solution:
• Have a table or dictionary which covers various skill sets categorized.
• An NLP algorithm to parse the entire document for searching the words which
are mentioned in the table or dictionary.
• Count the occurrence of the words belonging to different categories.
In the proposed system, spaCy [1], an advanced natural language processing (NLP) library, is used. It can analyze and extract detailed information from resumes like a human recruiter.
To evaluate the information extraction system, 250 resumes with different formats and templates, downloaded from Kaggle, have been considered in this work. These are parsed with spaCy, an advanced NLP library that has a feature called "Phrase Matcher." It can analyze and extract detailed information from resumes after preprocessing.
When raw text is fed as input, spaCy tokenizes it, processes the text, and produces a Doc object. The Doc is then processed in several different steps—this is known as the processing pipeline. The pipeline depicted in Fig. 1 includes a tagger, a parser, and an entity recognizer (detailed in Sect. 3). The Doc processed in each phase of the pipeline is fed as input to the next component.
The proposed model is designed as shown in Fig. 2; the steps to implement the
module are as follows:
1. A dictionary has been created (Table 1) which includes various skill sets categorized from different domains. The list of words under each category is used to perform phrase matching against the resumes.
2. Documents are parsed by the advanced NLP library spaCy, which has a feature called "Phrase Matcher."
3. The matcher in turn scans the entire document for the words listed in the table or dictionary (a sketch of steps 2–5 follows this list).
4. It finds the matching phrases.
5. Then the frequency of the words of the different categories is counted.
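A minimal sketch of steps 2–5, assuming a spaCy pipeline and a hypothetical skills dictionary; the category names, keywords, and sample resume text are illustrative placeholders, not the authors' Table 1.

```python
from collections import Counter

import spacy
from spacy.matcher import PhraseMatcher

nlp = spacy.load("en_core_web_sm")
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")

# Hypothetical skills dictionary; the paper's Table 1 is richer.
skills = {
    "NLP": ["spacy", "nltk", "gensim", "word2vec"],
    "DataEngineering": ["sql", "spark", "hadoop"],
    "WebDev": ["django", "flask", "javascript"],
}
for category, words in skills.items():
    matcher.add(category, [nlp.make_doc(w) for w in words])

resume_text = "Worked with spaCy and Gensim; built Flask APIs backed by SQL."
doc = nlp(resume_text)

# Count phrase matches per category (steps 3-5).
counts = Counter(nlp.vocab.strings[match_id] for match_id, _, _ in matcher(doc))
print(counts)   # e.g. Counter({'NLP': 2, 'WebDev': 1, 'DataEngineering': 1})
```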
4.1 Merits
1. The code automatically opens the documents and parses the content.
2. While parsing, it keeps a count of phrases, allowing easy categorization of candidates based on their expertise.
3. The decision-making process can be accelerated using data visualization.
4. Relative comparison of candidates can be made to filter out the job applicants.
Fig. 3 Visualization of shortlisted candidates with the frequency count of special skills
5 Conclusion
References
1 Introduction
Convolutional neural network (CNN) is a state-of-the-art method that has been rele-
vant for quite a long time in today’s image processing field. It tries to identify an
object by identifying its subcomponents, which are further identified by key points.
It has been widely used as it is known to yield very good accuracy.
However, CNN has its own limitations. One major limitation is that it does not
consider spatial information when it tries to detect the object. When a CNN identifies
a key point somewhere, it is always considered as a match irrespective of where it
found the key point and in what direction it was found. This could hence lead to a
few misclassifications.
Capsule network [2] tries to address this particular issue. It obtains vectors as
an output from each neuron instead of a single intensity value. The neuron then communicates its vector with those obtained from other neurons before deciding how confident it is that it forms part of a bigger subcomponent. This way, it ensures that the spatial orientations are taken into consideration.
Capsule network [2] was originally designed for discrete data. This paper attempts
to modify the architecture so that it could be used for continuous data. It tries to predict
the bone age of a patient based on an X-ray of the person’s wrist bone. It tests its
validity using the dataset provided by the Radiological Society of North America
(RSNA) called RSNA Pediatric Bone Age Challenge [1].
2 Literature Survey
The capsule network was introduced in the paper “Dynamic Routing Between
Capsules” by Hinton et al. [2]. It performed handwritten digit recognition on the
MNIST dataset. The input image is fed into a convolution layer with a ReLU activation function. The output was normalized by using a safe norm that scales the
length of the probability vectors between 0 and 1. Then the vector with the highest
estimated class probability was taken from the secondary capsule and the digit class
is predicted. The loss function consists of margin loss and reconstruction loss added
together with a learning rate alpha. The dataset was split into two parts, where 80% of
the dataset was used for training and 20% of the dataset was used for validation. The
model achieved a final accuracy of 99.4% on the validation dataset after 10 epochs.
This approach is hence well suited to classifying discrete data, although it may not be suitable for continuous data.
The paper “Capsule Networks and Face Recognition” by Chui et al. [3] talks
about performing the task of face recognition by using a capsule network. It used the
dataset Labelled Faces in the Wild (LFW) [4]. A total of 4324 images were sampled from the dataset, of which 3459 were used as training images and 865 as testing images.
In another attempt to train the network, the dataset comprised 42 unique faces from
a collection of 2588 images with at least 25 faces for every unique person. The train-
test split is similar to the way mentioned above. The model achieved an accuracy of
93.7% on the whole test dataset.
The paper “Pediatric Bone Age Assessment Using Deep Convolutional Neural
Networks” by Iglovikov et al. [5] approaches the problem of pediatric bone age
assessment by using a deep convolution neural network where the CNN performs
the task of identifying the skeletal bone age from a given wrist bone X-ray image.
This paper implements a stack of VGG blocks. It used the exponential linear unit
(ELU) as the activation function. The output layer was appended with a softmax
layer comprising 240 classes as there are 240 bone ages which result in a vector of
probabilities. A dot product of the probabilities with the age was taken. The model
used mean squared error (MSE) as the loss function. This way, CNN was used to
predict continuous data.
The paper "The relationship between dental age, bone age and chronological age in underweight children" by Kumar et al. [6] talks about the relationship between
dental age, bone age, and chronological age in underweight children. It was experi-
mentally proven that a normal female has a mean difference of 1.61 years between
the chronological age and the bone age and a mean difference of 1.05 years for males.
In addition, the study concludes that bone age and chronological age, which are maturity indicators of growth, have a positive correlation with each other. Therefore, any delay between the bone age and the chronological age is a significant attribute in the studied sample of 100 underweight children.
The paper "Bone age assessment with various machine learning techniques" by Luiza et al. [7] reviews, among other topics, the traditional approaches to assessing the bone age of an individual. Traditional methods include the Fels method, which is based on radio-opaque density, bony projections, shape changes, and fusion. It also discusses the Fischman technique, which is based on the width of the diaphysis, the gap of the epiphysis, and the fusion of epiphysis and diaphysis between the third and fifth fingers. However, manual methods are usually very time consuming and prone to errors because humans are involved.
3 Proposed Methodology
This paper proposes an approach based on capsule network [2] to detect the bone age.
Capsule network is an object recognition tool that is a modification of a convolutional
neural network (CNN) [8]. It imparts an additional property of making it robust to
spatial orientation. Capsule network follows a hierarchical approach for detection
just like a CNN. For example, in facial recognition of a three-layered network, the
first layer may detect the types of curves. The second layer may use the detected
curves to identify features such as an eye, a nose, or an ear. The third layer may use
these detected subparts and identify a face.
The difference lies in the fact that a CNN outputs a single value from each neuron
which represents the confidence of the particular shape. The confidence value may
or may not be on a scale of one. However, a capsule network outputs a vector which
not only represents the confidence, but also the direction in which the shape was
identified. The neurons in each layer communicate with each other to decide on the
magnitude of the vector. The vector is then passed to the next layer, which tries to
see the bigger picture of the image based on the output vectors from the previous
layer.
An issue with capsule network is that it has been designed to work only on discrete
data. This paper modifies it to detect continuous data, which is the bone age of the
wrist X-ray of the given patient.
Capsule network was proposed by Hinton et al. [2]. It consists of the following layers:
1. Primary capsules—This is a set of convolutional layers that are applied to the
image. Each neuron represents a particular shape. The output from this layer is
a feature map from which n vectors of m dimensions are derived, where m and n
are constants depending on the architectural decision by the user. There is usually
more than one convolutional layer in the neural network.
2. Squash function—This acts as an activation layer and imparts nonlinearity to the
network so that it could effectively learn from state-of-the-art backpropagation
algorithms, which depend on nonlinearity. It is given by the formulae
$$v_j \;=\; \frac{\|s_j\|^2}{1 + \|s_j\|^2}\,\frac{s_j}{\|s_j\|} \qquad (1)$$
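A minimal NumPy sketch of the squash nonlinearity in Eq. (1); the array shapes used in the example are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squash each vector to a length in (0, 1) while preserving its direction, per Eq. (1).
    sq_norm = np.sum(np.square(s), axis=axis, keepdims=True)
    norm = np.sqrt(sq_norm + eps)
    return (sq_norm / (1.0 + sq_norm)) * (s / norm)

v = squash(np.random.randn(32, 1152, 8))   # e.g. a batch of 1152 primary capsules of dimension 8
```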
3. Digit capsule—This is the output layer that gives the probability of occurrence
of each value. For example, in handwritten digit recognition, there are 10 digit
capsules, as there are 10 outputs between 0 and 9. Similarly, this paper proposes to use 228 digit capsules, as the age range of the pediatric bone age dataset given by RSNA [1] is between 1 and 228 months.
4. Decoder—This component tries to reconstruct the original image from the digit
capsule. This reconstructed image is used to calculate the reconstruction loss,
which is the loss of image data after it passes through the network.
One of the most important features in a capsule network is routing by agreement.
After obtaining the output from each convolutional layer in the primary capsules,
this operation is performed before the output goes to the next convolutional layer.
This enables communication across neurons to see if a feature identified by a neuron
has an equivalent feature identified by other neurons in the same layer.
Let the outputs of layer 1 be $u_1, u_2, u_3, \ldots, u_n$, let the outputs of the next layer be the $m$ vectors $v_1, v_2, v_3, \ldots, v_m$, and let the weights from $u$ to $v$ be $W_{1,1}, W_{1,2}, \ldots, W_{n,m}$. The following prediction vectors are obtained:
$$\hat{u}_{1|1} = W_{1,1}\,u_1,\quad \hat{u}_{2|1} = W_{1,2}\,u_1,\quad \ldots,\quad \hat{u}_{m|n} = W_{n,m}\,u_n \qquad (2)$$
The network includes another set of values b1,1 , b1,2 … bn,m whose ultimate goal
is to indicate how the vector outputs of the neurons from the previous layer correlate
to the input of the neurons from the next layer based on other vector outputs from
the next layer. These are initialized to the same value at the beginning. The weights
c1,1 , c1,2 … cn,m are then calculated by applying a softmax function on the values
b1,1 , b1,2 … bn,m .
$$\big(c_{i,1}, c_{i,2}, \ldots, c_{i,m}\big) = \operatorname{softmax}\big(b_{i,1}, b_{i,2}, \ldots, b_{i,m}\big), \qquad i = 1, \ldots, n \qquad (3)$$
The term $\hat{u}_{j|i} \cdot v_j$ indicates how strongly $v_j$ agrees with $\hat{u}_{j|i}$. The network is then run again with the new $b_{i,j}$. This is done for a fixed number of iterations, so that the final $v_j$ reflects the outcome of all the neurons communicating with each other to decide the final output vector.
The routing by agreement algorithm of a capsule network is given in Fig. 1.
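A compact NumPy sketch of the routing-by-agreement loop described above (Eqs. 1–3); the capsule dimensions and the iteration count are assumptions for illustration.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    sq = np.sum(np.square(s), axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * (s / np.sqrt(sq + eps))

def routing_by_agreement(u_hat, num_iters=3):
    # u_hat: prediction vectors of shape (n_in, n_out, dim), i.e. W_{i,j} u_i from Eq. (2).
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                               # routing logits, initialised equally
    for _ in range(num_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # Eq. (3): softmax over the b values
        s = np.einsum("ij,ijd->jd", c, u_hat)                 # weighted sum of predictions
        v = squash(s)                                         # Eq. (1)
        b = b + np.einsum("ijd,jd->ij", u_hat, v)             # update by agreement u_hat . v
    return v

v = routing_by_agreement(np.random.randn(1152, 10, 16))       # e.g. 1152 inputs, 10 output capsules
```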
5 Proposed Preprocessing
Before the image is fed to the neural network, it is always important to ensure that
it is in its best form and could be easily understood by the neural network. The
preprocessing in this paper has three main goals:
• To identify the edges in the image.
• To remove the background as much as possible with minimal loss of data.
• To highlight the edges of the wrist bone.
Before applying the various preprocessing techniques, the image was first resized to a standard size of 1000 × 1000 to ensure that the effect of any technique applied to the image was similar for every image.
The edges were then identified using adaptive thresholding. This method iden-
tifies the areas of interest based on the intensity of the neighboring pixels. In
order to strengthen the working of adaptive thresholding, contrast enhancement was
performed, so as to widen the intensity difference between the pixels. This was
followed by smoothing, using a Gaussian filter to remove noise and hence reduce
the chance of salt and pepper noise in the output of adaptive thresholding.
Once all the edges are obtained, the next aim is to ensure that background edges such as the frame of the X-ray are removed as much as possible from the image, so that the network can focus on the wrist bone features. This was done by applying a closing filter on the image using kernels with a long horizontal line and a long vertical line. To cope with real-world data, random white spots were added to these kernels. These kernels were applied 10 times on the image, each time with white spots at different places.
Following this, the image was converted to grayscale using a Gaussian filter, which can produce intermediate values depending on the surroundings. Color inversion was also performed for human convenience in evaluating the output quality, as humans are generally more accustomed to seeing X-rays as white bone on a black background; hence, one can readily judge whether the quality has improved. Following this, the contrast is enhanced to ensure a maximum difference between pixels.
The image was then sharpened two times to make the edges stand out. In between, the image was smoothed using an edge-preserving filter, a recent smoothing tool that smooths pixels by identifying edges instead of relying only on the surroundings; hence, it is ideal in this case.
Once the image runs through this pipeline, it is ready to be used by the neural
network (Fig. 2).
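A sketch approximating the preprocessing pipeline described above, using OpenCV; the kernel sizes, CLAHE clip limit, and sharpening kernel are assumptions, and the random-white-spot step applied to the closing kernels is omitted for brevity.

```python
import cv2
import numpy as np

def preprocess(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (1000, 1000))                         # standard size

    # Contrast enhancement and Gaussian smoothing before adaptive thresholding.
    img = cv2.createCLAHE(clipLimit=2.0).apply(img)
    img = cv2.GaussianBlur(img, (5, 5), 0)
    edges = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                  cv2.THRESH_BINARY, 11, 2)

    # Suppress frame-like background edges with closing on long line kernels.
    for kernel in (np.ones((1, 101), np.uint8), np.ones((101, 1), np.uint8)):
        edges = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)

    # Invert for readability (white bone on black), then sharpen twice with an
    # edge-preserving smoothing step in between.
    edges = cv2.bitwise_not(edges)
    sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float32)
    edges = cv2.filter2D(edges, -1, sharpen)
    smooth = cv2.edgePreservingFilter(cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR))
    edges = cv2.filter2D(cv2.cvtColor(smooth, cv2.COLOR_BGR2GRAY), -1, sharpen)
    return edges
```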
6 Neural Network
The neural network is based on capsule network architecture. Capsule network [2]
is a modification of convolutional neural network [8] that imparts the property of
considering the spatial orientation of the image in addition to the properties provided
by a CNN [8].
However, capsule network has been designed to classify discrete data. This paper
uses it to predict continuous data. It tries to make sure that the accuracy is not biased
to a particular age range as is usually the case when a network that classifies discrete
data is applied on continuous data. The original capsule network architecture has
two loss functions—margin loss and reconstruction loss.
The margin loss is given by the following formulae
$$L_k = T_k \,\max\!\big(0,\; m^+ - \|v_k\|\big)^2 \;+\; \lambda\,(1 - T_k)\,\max\!\big(0,\; \|v_k\| - m^-\big)^2 \qquad (6)$$
where
$L_k$ is the margin loss for class $k$,
$T_k = 1$ if $k$ is the correct class and $0$ otherwise,
$m^+ = 0.9$, $m^- = 0.1$, and
$\lambda$ is a constant.
The reconstruction loss is a function that indicates how well it has coded the image
to represent the original image. It is based on the output obtained from the decoder.
The final loss function is given by
$$L = L_{\text{margin}} + \alpha\, L_{\text{reconstruction}} \qquad (7)$$
where alpha is the learning constant and was taken as 0.0005 in the original capsule network architecture [2].
In the network, the margin loss function tries to ensure that the network gets as
close as possible to the original distribution, while the reconstruction loss tries to
ensure that the final layer represents as much information of the original image as
possible.
In discrete data, like handwriting recognition, when an image of digit 3 is given,
then recognizing the digit as 4 is equally wrong as recognizing the digit as 5. However,
in continuous data, if the original age is 15 months, predicting it as 18 months is much
better than predicting it as 30 months. It is hence clear that the goal of the network in
a continuous data is to get as close to the value as possible, while in case of discrete
data if it cannot reach the exact value, it does not matter what value is predicted.
Let us now examine the margin loss function. Let the correct value be k. Consider three points k − α, k − (α − γ), and k + α, where 0 < γ < α < k. Most backpropagation algorithms propagate through the network based on the known loss function. Hence, for a particular iteration, when all other coefficients are constant, if the loss function varies, then backpropagation takes different steps. However, if the loss function is the same, the neural network propagates by the same step in the same direction.
Case 1—The Predicted Value for the Iteration is (k − α)
$T_k = 0$, as the considered class is not the correct one.
From Eq. 6,
$$L_k = T_k \,\max\!\big(0,\; m^+ - \|v_k\|\big)^2 + \lambda\,(1 - T_k)\,\max\!\big(0,\; \|v_k\| - m^-\big)^2$$
$$\;= 0 \cdot \max\!\big(0,\; m^+ - \|v_k\|\big)^2 + \lambda\,(1 - 0)\,\max\!\big(0,\; \|v_k\| - m^-\big)^2 = \lambda\,\max\!\big(0,\; \|v_k\| - m^-\big)^2$$
From Eqs. (8), (9), and (10), it is evident that the step is taken in the same direction and with the same magnitude irrespective of how far away or in which direction the data lies. Hence, the convergence to any minimum is dependent on the order in which the data is fed to the network. The problem here is evident. Although the margin loss
function indicates if the value is correct or incorrect, it does not indicate how close
it is to the actual value. Hence, the capsule network as proposed by Hinton et al. [2]
is not suitable for continuous data and needs modifications for usage on continuous
data.
For this purpose, this paper uses the mean squared error as the loss function, with the ages scaled to the range [0, 3] using the following formulae:
$$\text{y\_norm}_k = \frac{y_k}{228}\times 3, \qquad \text{y\_pred\_norm}_k = \frac{\text{y\_pred}_k}{228}\times 3$$
$$L_k = \big(\text{y\_norm}_k - \text{y\_pred\_norm}_k\big)^2 \qquad (11)$$
where
$y_k$ is the actual age,
$\text{y\_pred}_k$ is the predicted age, and
228 is the highest expected age.
This is then added to reconstruction loss using Eq. (7).
There are 228 output layers called digit capsules in the network, with each layer
representing the confidence value of the respective output from 1 to 228 months.
These were made into probabilities by passing them through a softmax layer. From here, the 20 highest probabilities were taken and scaled such that they add up to one. This was done by dividing each value by the sum of the 20 probabilities, which can be denoted as
$$P'_i = \frac{P_i}{\sum_{j=1}^{20} P_{(j)}} \qquad (12)$$
where
$P'_i$ is the updated probability at age $i$,
$P_i$ is the initial probability at age $i$,
$P_{(j)}$, $j = 1, \ldots, 20$, are the 20 highest probabilities sorted in descending order, and
$i$ ranges over the ages corresponding to these top 20 probabilities.
These probability values were then multiplied by the ages they represent and added together to obtain the predicted age.
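A NumPy sketch of Eq. (12) and the expected-age computation, together with the scaled MSE of Eq. (11); the input logits are random placeholders.

```python
import numpy as np

def predicted_age(logits, k=20, max_age=228):
    # Softmax over the 228 digit-capsule outputs.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Keep the k highest probabilities and rescale them to sum to 1 (Eq. 12).
    top_idx = np.argsort(probs)[-k:]
    top_probs = probs[top_idx] / probs[top_idx].sum()

    # Expected age: probabilities multiplied by the ages they represent (1..228 months).
    ages = np.arange(1, max_age + 1)
    return float(np.sum(top_probs * ages[top_idx]))

def scaled_mse(y_true, y_pred, max_age=228):
    # Eq. (11): squared error on ages rescaled to the range [0, 3].
    return ((y_true / max_age * 3) - (y_pred / max_age * 3)) ** 2

age = predicted_age(np.random.randn(228))
print(age, scaled_mse(120.0, age))
```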
This paper proposes to take the top 20 outputs instead of all the probabilities
in order to address the problem of vanishing gradients during the initial phase of
training that eventually leads to network collapse. At the beginning of the training,
due to a large number of output neurons with Gaussian initialization, the probabilities
are almost equal to each other. Hence, it outputs the same value for every image.
Later, when the neural network begins to alter weights during backpropagation, it
still continues to output the same values as there are too many weights to alter. In
the end, because multiple alterations do not affect the result, the network collapses
and stops learning.
In order to make sure that these values were not subjected to excessive variance,
each batch of size 3 was given 3 consecutive tries, which tried to make the image
get as close as possible to the actual distribution. The value 3 was obtained using
parameter tuning.
When the top 20 probabilities are taken, it is ensured that different digit capsules can be selected each time, thus resulting in different values based on the image. The top 20 probabilities represent the 20 most significant outputs of the distribution X of the neural network and should effectively represent most of X. It is expected that the neural network learns such that the top one or two values are much higher than the rest. Hence, this setup is expected to work well in the long run too.
Another modification made to the network was to change ReLU [9] activations
in the convolution layers to Leaky ReLU [10]. This helped to solve the problem of
“dying ReLU”, where if a neuron reaches the value of zero, ReLU [9] never provides
a way that the weight could be altered again, which implies the neuron has effectively
died.
On using four layers in the network, as proposed by Hinton et al. [2], there are too many weights in the neural network. This introduces the possibility of exploding
gradients. Hence, this paper proposes to use only two layers in order to address this
problem.
To summarize, the following modifications are proposed to the capsule network
to make sure it handles continuous data
1. The backpropagation was done using mean squared error (MSE) scaled to three in
place of margin loss. This makes the model try to minimize the distance between
the predicted value and the actual value instead of trying to focus on getting the
exact value. The reconstruction loss was still added to this network.
2. The values from the 228 digit capsules were passed through the softmax layer
and the probabilities were obtained. Following this, top 20 values were taken and
were scaled such that these 20 probabilities add up to 1. The top 20 probabilities
were multiplied with their respective values.
3. In order to make sure that these values were not subjected to excessive variance,
each batch of size 3 was given 3 consecutive tries, for the neural network to get
as close as possible to the actual distribution.
4. The ReLU [9] was changed to Leaky ReLU [10] to address the problem of “dying
ReLU”
5. To address the problem of exploding gradients, only two layers were used.
The specifications of the filter size are given in Fig. 3.
7 Convergence
$$f(x + \Delta x) \;\geq\; f(x) + f'(x)\,\Delta x \qquad (13)$$
Theorem 2 A composite function f(g(x)) is convex if f(x) and g(x) are convex and the range of g(x) is within the domain of f(x).
$$0 < y < z \qquad (14)$$
$$L(x, x + y) = y^2 \qquad (15)$$
$$L(x, x + z) = z^2 \qquad (16)$$
Hence, when the top k values are taken, X has been effectively sampled. X is still close to X + μ, and the changes made to the distribution X by backpropagation are not major, as they should rightfully be.
Case 2 (When X is far away from X + μ)
There could be two sub-cases here:
1. When the top k neurons taken do not change in the next iteration, the probability of the appropriate neurons is still reduced, as X is far away from X + μ.
2. When the top k neurons taken do change in the next iteration:
• If the top k significant neurons of X obtained are such that the output is closer to X + μ, it is converging closer.
• If the next highest probability is farther from X + μ, then the neuron is propagated with a large loss function in the next iteration. Hence, the most significant outputs of X are propagated with a bigger loss function in the next trial or when a similar sample is taken again.
8 Dataset Used
The dataset used was RSNA Pediatric Bone Age Challenge (2017) [1]. It was devel-
oped by Stanford University and the University of Colorado, which was annotated
by multiple expert observers. The dataset contains images of wrist bone X-ray in
multiple orientations using multiple X-ray scanners, each resulting with a different
texture of the X-ray.
9 Results
The experiments were conducted on Google Cloud on a TensorFlow VM. The system was configured with two virtual CPUs and 96 GB RAM on a Tesla P100 GPU to support the extensive computations.
The training set was split into 9000 images for training and 3000 images for
validation. The results of the model were analyzed on the validation dataset.
Figure 4 was plotted from the results obtained when all 228 output layers were taken into consideration instead of the top 20 probabilities. From Fig. 4, one can observe that the algorithm outputs a constant value within a very narrow range of 113–114 months for random samples. This happens because of vanishing gradients, as a large number of weights are learned. Hence, it is justified why the top 20 probabilities are taken and scaled to sum to 1 instead of taking all 228 probabilities.
Figure 5 depicts the parameter tuning used to identify the optimal number of trials for the network to come close to the actual distribution with a batch of three images. The lowest point in the graph is obtained at three trials, which is hence best for training. In the same graph, the "not a number (NaN)" values obtained for 1 and 2 trials also show why the network needs to be given a few trials to get close to the actual distribution.
Fig. 4 Scatterplot of the predicted bone ages of a random sample for the first 100 iterations when all 228 probabilities were taken
Fig. 5 Average MSE for number of trials with a batch size of three images
Figures 6 and 7 and the images following it are the results plotted on the validation
set after training with the proposed methodology. One can observe in Figs. 6 and 7
that the deviation is unbiased to any particular age range in general. In other words,
it could be observed that the ratio of the number of patients with age deviation >15
and the number of patients with age deviation <15 is somewhat constant across most
age ranges.
One can observe from Fig. 8 that the actual age and the predicted age have a
positive correlation. This further confirms the fact that the bone age has effectively
been deployed for continuous data.
Fig. 6 Actual age in months of individuals having an age deviation of less than or equal to 15 months
Fig. 7 Actual age in months of individuals having an age deviation greater than 15 months
10 Inference
The above results show that the predictions are not biased toward any particular age range in general. Hence, this network could be more suitable for continuous data than the capsule network proposed by Hinton et al. [2] [11–15].
References
1. Stanford Medicine (2017) RSNA pediatric bone age challenge. Dataset from https://www.kag
gle.com/kmader/rsna-bone-age
2. Hinton GE, Sabour S, Frosst N (2017) Dynamic routing between capsules. Code from https://
github.com/ageron/handson-ml/blob/master/extra_capsnets.ipynb
3. Chui A, Patnaik A, Ramesh K, Wang L. Capsule Networks and Face Recognition
4. Labelled faces in the wild (LFW). University of Massachusetts, https://vis-www.cs.umass.edu/
lfw/
5. Iglovikov V, Rakhlin A et al (2018) Pediatric bone age assessment using deep convolutional neural networks. Published by DLMIA, 19 June 2018
6. Kumar V, Venkataraghavan K, Krishnan R (2013) The relationship between dental age, bone
age and chronological age in underweight children. US National Library of Medicine National
Institutes of Health
7. Dallora AL, Anderberg P, Kvist O, Mendes E, Ruiz SD, Berglund JS (2019) Bone age
assessment with various machine learning techniques: a systematic literature review and
meta-analysis. Published by Plos one, 25 July 2019
8. Krizhevsky A, Sutskever I, Hinton G (2012) Imagenet classification with deep convolutional
neural networks, 1097–1105
9. Agarap AFM (2019) Deep learning using rectified linear units (ReLU)
10. Xu B, Wang N, Chen T, Li M (2015) Empirical evaluation of rectified activations in convolutional network
11. Kingma DP, Ba JL (2017) Adam: a method for stochastic optimization
12. Vijayakumar T (2019) Comparative study of capsule neural network in various applications. J Artif Intell 1(01):19–27
13. Patrick MK, Adekoya AF, Mighty AA, Edward BY (2019) Capsule networks—a survey
14. Wang Y, Huang L, Jiang S, Wang Y, Zou J, Fu H, Yang S (2020) Capsule networks showed
excellent performance in the classification of hERG blockers/nonblockers
15. Mughal AM, Hassan N, Ahmed A (2014) Bone age assessment methods: a critical review
Design High-Frequency and Low-Power
2-D DWT Based on 9/7 and 5/3
Coefficient Using Complex Multiplier
Abstract The various 1-D and 2-D discrete wavelet transform (DWT) architectures that exist in the literature include, among others, parallel filter, folded, flipping, and iterative structures. These schemes differ with respect to the computation and hardware requirements and the memory required to store the input image and intermediate coefficients. The main focus of this investigation is to design efficient VLSI structures for the hardware implementation of the 9/7 and 5/3 DWT using a complex multiplier (CM), improving the speed and reducing the hardware complexity of existing designs.
1 Introduction
with a couple of assumptions such as the scanner geometry and the raw data, for instance, knowledge of the projections and noise estimations, etc. [2, 3].
To achieve better picture quality from the same raw data, more realistic assumptions about the scanner geometry and noise statistics must be made. This is done in the more computationally complex iterative reconstruction methods. Such iterative reconstruction techniques may result in longer reconstruction times but also in considerably less image noise from the same raw data, through more detailed modeling of the detector response and the statistical behavior of the measurements. Iterative reconstruction is far more capable than analytical computation, and iterative reconstruction is therefore considered in this assessment. Nowadays, iterative reconstruction plays a major role in computed tomography to improve the image quality and reduce artifacts. Accordingly, a lot of research work has been done to improve the reconstructed image in both visual and error assessment [4].
The DWT is the leading methodology in the field of image compression and image coding. The Joint Photographic Experts Group (JPEG) standard is the basic method for image compression. The coding efficiency and picture quality of the DWT are better than those of the standard DCT. JPEG has adopted the irreversible form of the discrete wavelet transform for efficient image compression. Digital imaging is one of the standard requirements for both real-time applications and research. The demand for image compression is generally high because of the traffic generated by media sources. The one-dimensional and two-dimensional discrete wavelet transforms are key functions for image processing. Multiresolution signal analysis is achieved in both the time and frequency domains in the DWT. The DWT is extensively used for image compression in JPEG 2000 because of its time and frequency properties [5].
Image reconstruction is described as converting a set of two-dimensional samples into an image by examining the state of the image. Image reconstruction is widely used in various applications such as medicine, robotics, and gaming. In the DWT, a set of wavelet filters is used for compression, noise reduction, and the reconstruction process. In general, all communication channels carry random noise because of their characteristics, and these channels are affected by adverse interference from the source of the channel. The image reconstruction is carried out by up-sampling followed by digital filtering [6].
Multiresolution wavelet reconstruction is the traditional technique of reconstruction. The main drawback of the conventional system is the very high hardware requirement to store the intermediate values. The computational delay of the conventional approach is also considerable. To overcome these concerns, the multiband wavelet transform is used for the image reconstruction process. By using the proposed multiband wavelet transform, the frequency aliasing of the hardware is reduced. Summation filters are used to build the reconstruction block. The image contrast and intensity are better in the multiband wavelet transform compared to the standard multiresolution wavelet transform [7, 8].
2 CM
Let $R = R_r + jR_i$ and $I = I_r + jI_i$. Multiplying $R$ by $I$ gives
$$O = R \times I = (R_r I_r - R_i I_i) + j\,(R_r I_i + R_i I_r)$$
$$O_r = R_r I_r - R_i I_i, \qquad O_i = R_r I_i + R_i I_r$$
Fig. 1 Structure of the N × N complex multiplier (CM): four N × N multipliers operating on R_r, R_i, I_r, and I_i feed a KSA adder (producing O_i) and a subtractor (producing O_r)
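A minimal sketch of the complex-multiplier output computation above (four real multiplications, one addition, one subtraction, matching the structure of Fig. 1); purely illustrative.

```python
def complex_multiply(rr, ri, ir, ii):
    # O = R * I with R = rr + j*ri and I = ir + j*ii.
    o_real = rr * ir - ri * ii   # subtractor path -> O_r
    o_imag = rr * ii + ri * ir   # adder (KSA) path -> O_i
    return o_real, o_imag

print(complex_multiply(3, 2, 1, 4))   # (3*1 - 2*4, 3*4 + 2*1) = (-5, 14)
```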
3 DWT
The multiresolution analysis and time-frequency localization properties of the DWT have established it as a powerful tool for different applications, for instance, signal analysis, image compression, and numerical analysis, as stated by Mallat. This has driven different research groups to develop algorithms and hardware models to implement the DWT.
In the standard convolution method for the DWT, several finite impulse response (FIR) filters are applied in parallel to compute the high-pass and low-pass filter coefficients. Mallat's pyramid algorithm can be used to compute the wavelet coefficients of a sample in several spatial directions.
The architectures are generally folded and can be broadly categorized into serial and parallel structures, as discussed in [7]. The architecture discussed implements the filter bank structure efficiently, using digit-serial pipelining. This architecture forms the basis for the hardware implementation of subband decomposition, using the convolution-based DWT for JPEG 2000. A typical scheme in which the DWT decomposes the input image is shown below in Fig. 2.
Each decomposition level shown in Fig. 2 consists of two stages: stage 1 performs horizontal filtering, and stage 2 performs vertical filtering. In the first decomposition level, the input image is of size N by N and is divided into four subbands LL, HH, LH, and HL, where L denotes low frequency and H denotes high frequency. The four subbands are of size N/2 by N/2. The LL subband carries more information than the other subbands, because the L subband is the average value of the pixels whereas the H subband is the difference value of the pixels; the HH subband carries the least information. The subbands are derived as follows:
$$x_{LL}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1}\sum_{i_2=0}^{K-1} h(i_1)\,h(i_2)\; x_{LL}^{J-1}(2n_1 - i_1,\, 2n_2 - i_2)$$
$$x_{LH}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1}\sum_{i_2=0}^{K-1} h(i_1)\,g(i_2)\; x_{LL}^{J-1}(2n_1 - i_1,\, 2n_2 - i_2)$$
$$x_{HL}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1}\sum_{i_2=0}^{K-1} g(i_1)\,h(i_2)\; x_{LL}^{J-1}(2n_1 - i_1,\, 2n_2 - i_2)$$
$$x_{HH}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1}\sum_{i_2=0}^{K-1} g(i_1)\,g(i_2)\; x_{LL}^{J-1}(2n_1 - i_1,\, 2n_2 - i_2)$$
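A NumPy sketch of one separable 2-D decomposition level following the four subband equations above; the filter coefficients h and g are Haar-like placeholders, not the 5/3 or 9/7 taps, and the border handling is simplified.

```python
import numpy as np

def analysis_1d(x, h, g):
    # Filter along the last axis and downsample by 2.
    low = np.apply_along_axis(lambda r: np.convolve(r, h, mode="same"), -1, x)[..., ::2]
    high = np.apply_along_axis(lambda r: np.convolve(r, g, mode="same"), -1, x)[..., ::2]
    return low, high

def dwt2_level(image, h, g):
    # Stage 1: horizontal filtering; stage 2: vertical filtering (rows, then columns).
    L, H = analysis_1d(image, h, g)
    LL, LH = analysis_1d(L.T, h, g)
    HL, HH = analysis_1d(H.T, h, g)
    return LL.T, LH.T, HL.T, HH.T      # each subband is N/2 x N/2

h = np.array([0.5, 0.5])               # placeholder low-pass taps
g = np.array([0.5, -0.5])              # placeholder high-pass taps
LL, LH, HL, HH = dwt2_level(np.random.rand(8, 8), h, g)
print(LL.shape)                        # (4, 4)
```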
4 Proposed Methodology
In the DWT, the biorthogonal wavelets are realized by using the lifting strategy. The spatial domain and the lifting structure are used to build the lifting scheme. In the lifting scheme, three main steps are generally performed, namely split, predict, and update. The input image samples x(n) are partitioned into odd and even samples in the split block. Filtering of the odd and even samples is required to avoid undesirable aliasing. The lifting scheme is carried out according to the type of filter used. The scaling step is used to find the low-pass subbands of the odd and even samples. The filter operation is converted into a cascade of matrix multiplications in the lifting scheme (Fig. 3).
Image compression is performed effectively by using the lifting scheme, and the hardware usage is considerably reduced by using these filters.
Inner product computation can be carried out by the complex multiplier. The DWT formulation using the convolution scheme can be expressed as an inner product, whereas the 1-D DWT formulation given in (1) and (2) cannot be expressed as an inner product.
Although the convolution DWT demands more arithmetic resources than the lifting DWT, the convolution DWT is expected to exploit the benefits of the CM-based design. The CM formulation of the convolution-based DWT using the 5/3 and 9/7 biorthogonal filters is introduced here (Fig. 4).
As per (5) and (6), the 5/3 wavelet filter computation in convolution form is expressed as
$$Y_L = \sum_{i=0}^{4} h(i)\, X_n(i)$$
$$Y_H = \sum_{i=0}^{2} g(i)\, X_n(i)$$
Here, h(i) and g(i) denote the low- and high-pass 5/3 filter coefficients, and Y_L and Y_H denote the low- and high-pass filter outputs. The index i varies from 0 to 4 for the low-pass filter and from 0 to 2 for the high-pass filter.
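A sketch of the low- and high-pass outputs above; the coefficient values used are the standard LeGall 5/3 analysis taps, stated here as an assumption since the paper does not list them explicitly.

```python
import numpy as np

# Standard LeGall 5/3 analysis coefficients (assumed; not listed in the paper).
h_53 = np.array([-1/8, 2/8, 6/8, 2/8, -1/8])   # low-pass h(i), i = 0..4
g_53 = np.array([-1/2, 1.0, -1/2])             # high-pass g(i), i = 0..2

def filter_outputs(window5):
    # window5: five consecutive input samples X_n(0..4).
    y_low = float(np.dot(h_53, window5))        # Y_L = sum h(i) X_n(i)
    y_high = float(np.dot(g_53, window5[:3]))   # Y_H = sum g(i) X_n(i), i = 0..2
    return y_low, y_high

print(filter_outputs(np.array([10, 12, 14, 13, 11], dtype=float)))
```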
5 Simulation Result
The CM-based 5/3 and 9/7 2-D DWT is designed in the Xilinx software, version 14.2i. The Xilinx flow works in two steps, i.e., primary and secondary design. The primary design defines the I/O part of the system, and the secondary part defines the relation between the I/O parts. The primary and secondary designs of the 5/3 1-D DWT are shown in Figs. 5, 6 and 7, and those of the 9/7 1-D DWT are shown in Figs. 8 and 9 (Table 1).
(Block diagram: DMUX-based structure with input X(n) and low- and high-pass outputs Y_L and Y_H)
6 Conclusion
It is concluded that the CM-based 2-D DWT provides better results compared to previous designs. The comparison is based on delay, adder count, frequency, and net power (Figs. 10, 11 and 12).
References
1. Gardezi SEI, Aziz F, Javed S, Younis CJ, Alam M, Massoud Y (2019) Design and VLSI imple-
mentation of CSD based DA architecture for 5/3 DWT. 978-1-5386-7729-2/19/$31.00©2019
IEEE
2. Mohamed Asan Basiri M, Noor Mahammad S (2018) An efficient VLSI architecture for convo-
lution based DWT Using MAC. In: 31th international conference on VLSI design and 2018,
17th international conference on embedded systems. IEEE
3. Chakraborty A, Chakraborty D, Banerjee A (2017) A memory efficient, high throughput
and fastest 1D/3D VLSI architecture for reconfigurable 9/7 & 5/3 DWT filters. In: Interna-
tional conference on current trends in computer, electrical, electronics and communication
(ICCTCEEC-2017)
4. Biswas R, Malreddy SR, Banerjee S (2017) A high precision-low area unified architecture for
lossy and lossless 3D multi-level discrete wavelet transform. Trans Circuits Syst Video Technol
45(5)
5. Bhairannawar SS, Kumar R (2016) FPGA implementation of face recognition system using
efficient 5/3 2D-lifting scheme. In: 2016 International conference on vlsi systems, architectures,
technology and applications (VLSI-SATA)
6. Martina M, Masera G, Roch MR, Piccinini G (2015) Result-biased distributed-arithmetic-based
filter architectures for approximately computing the DWT. IEEE Trans Circuits Syst I Regul
Pap 62(8)
7. Mallat SG (1989) A theory for multiresolution signal decomposition: the wavelet representa-
tion. IEEE Trans Pattern Anal Mach Intell 110:674–693
8. Alam M, Rahman CA, Jullian G (2003) Efficient distributed arithmetic based DWT architec-
tures for multimedia applications. In: Proceedings of IEEE workshop on SoC for real-time
applications, pp 333–336
9. Zhao X, Vi Y, Erdogan AT, Arslan T (2000) A high-efficiency reconfigurable 2-D discrete
wavelet transform engine for JPEG 2000 implementation on next generation digital cameras.
978-1-4244-6683-2/10/$26.00 ©2010 IEEE
10. Fan X, Pang Z, Chen D, Tan HZ (2010) A pipeline architecture for 2-D lifting-based discrete
wavelet transform of JPEG2000. 978-1-4244-7874-3/10/$26.00 ©2010 IEEE
11. Baviskar A, Ashtekar S, Chintawar A, Baviskar J, Mulla A (2014) Performance analysis of sub-
band replacement DWT based image compression technique. 978-1-4799-5364-6/14/$31.00
©2014 IEEE
12. Deergha Rao K, Muralikrishna PV, Gangadhar C (2018) FPGA implementation of 32 bit
complex floating point multiplier using vedic real multipliers with minimum path delay. https://
doi.org/10.1109/UPCON.2018.8597031@2018 IEEE
13. Lian C, Chen K, Chen H, Chen L (2001) Lifting based discrete wavelet transform architecture
for JPEG 2000. In: Proceedings of IEEE international symposium on circuits systems, vol 2,
Sydney, Australia, pp 445–448, May 2001
14. Aziz F, Javed S, Gardezi SEI, Younis CJ, Alam M (2018) Design and implementation of
efficient DA architecture for LeGall 5/3 DWT. In: IEEE international symposium on recent
advances in electrical engineering (RAEE)
Fuzzy Expert System-Based Node Trust
Estimation in Wireless Sensor Networks
Abstract Wireless sensor networks are used in most recent real-time systems globally. In WSNs, many researchers feel they have reached the destination in all aspects of security and trust, because trust is the significant factor in implementing a secure network by transmitting packets in a trusted way. Most models evaluate only the present trust value of the nodes and do not predict the upcoming changes in the trust factor of the nodes. This paper provides a new trust estimation model to evaluate the trust value of a node with a fuzzy expert system that predicts the changes that may occur in the future based on the inference mechanism. The proposed work also concentrates on energy efficiency, optimizing the node energy level even when the nodes undergo trustworthy data transmission. Further, the results obtained from the experiments show that the proposed model outperforms the existing ones in both parameters, trust and energy efficiency.
1 Introduction
WSNs comprise a huge number of sensor nodes that are tiny and have restricted processing capability and energy support [1]. A node plays multiple roles, such as server and router, during active mode, and in standby mode it also switches to processing mode when detecting events in the surrounding space, particularly in bursts of spurious packets generated by malicious nodes, which increase the network traffic as well as the energy consumption. Moreover, the presence of malicious
K. Selvakumar
Department of Computer Applications, NIT, Trichy, India
e-mail: kselvakumar@nitt.edu
L. Sai Ramesh (B)
Department of Information Science and Technology, CEG Campus, Anna University, Chennai,
India
e-mail: sairamesh.ist@gmail.com
nodes with a magnified tendency to malfunction would worsen the network performance. Designing an optimal path-based routing and energy consumption system in WSNs has been attempted previously both with and without a fuzzy expert system. These research works deal with packet routing and energy saving by establishing a trust model to measure the trust value of nodes in WSNs.
A fuzzy logic-based research model has been proposed to efficiently handle network traffic and reduce packet transmission loss for prioritized event-driven traffic approaches [2]. An energy consumption-based QoS packet routing algorithm for WSNs was developed to run effectively with best-effort traffic [3]. However, these models attempt to cut down the packet transmission overhead based on the trust value of the nodes and a stable environment, without considering the current trust values of the untrusted nodes, which try to increase the network traffic as well as the energy consumption.
The main aim of the trust estimation model is to predict trust values that are used to depict the trustworthiness, reliability, or competence of each node, with the aid of some management techniques [4]. Hence, the estimated trust information is used
for the top-level layer to perform packet routing [5–7], data accumulation [8], and
energy optimization process [9–11]. There are a variety of trust management schemes
which are planned for WSNs [12–18]; however, most of them did not establish an inexpensive trust management scheme to express the subjectivity, uncertainty, and transitivity of trust characteristics in WSNs. This research article proposes a fuzzy
expert system (FES)-based node trust estimation with the help of a trust estimation
model to optimize the packet transmission as well as energy consumption.
2 Related Works
Trust is defined as the belief that the trusting agent has in the trustworthy agent's willingness and capability to deliver a mutually agreed service in a given context and in a given timeslot [19]. It conveys the degree to which an object or a process is conceived to be true. In recent years, research work on trust management systems has been attracting more attention, and some peer-to-peer trust control models have been introduced.
Trust control models are used to manage network security issues and also incorporate the support of encryption techniques with decision-making authorities [20]. Even though much significant research work has been done on trust estimation models by various researchers, their applicability to mobile ad-hoc systems as well as wireless sensor networks has remained an open research scope.
The trustworthy protocols have merged ideas from two broad spectrums: trust estimation models and packet routing protocols for both mobile ad-hoc and wireless sensor networks. Beth et al. [21] suggested a trust control model, which introduced the idea of conveying and calculating trust; the corresponding trust equations were derived and combined. This model also splits trust into the following two major types, namely direct trust and recommendation trust, which
are the key measures to represent the trust relationship from subject to object and from recommendation object to subject. A trust control model projected by Josang [22] is based on the subjective logic model, which introduced the evidence and opinion space to explain and compute the construct of trust relationships. This model outlined a collection of subjective logic operators for the derivation and intensive computation of trust values. Nevertheless, mobile ad-hoc networks have some apparent underlying features, like restricted resources, easy deployment, lack of a centralized dedicated server, varying topology, etc. As a result, the authority delegation mechanism as well as public-key encryption does not seem to be appropriate for this environment. Therefore, the traditional trust management models are not suitable.
Hence, in this circumstance, many trust estimation models have been proposed in the networking area. In the mobile ad-hoc environment, trust is often thought of as the reliance of a network node on the ability of other nodes to forward packets or provide services in a timely, cooperative, and dependable manner [23]. In this research work, a novel adaptive and intelligent trust-based model has been built to estimate the trustworthiness of nodes in the network topology to achieve secure packet transmission between sender and receiver nodes. In addition, the fuzzy rule base used for classifying the nodes' trust reduces the malicious behavior of those nodes and thereby saves energy in the network. Hence, energy saving is achieved, and the model also predicts the optimal path between the sender and receiver nodes in WSNs.
values, the present trust value NTVij is evaluated as mentioned in Eq. 1. The trust
values are updated at specific time intervals which are mentioned as t 1 and t 2 .
The weights α and β (α, β ≥ 0, α > β, and α + β = 1) are assigned to BTij and
CTij. Now the basic trust is computed using the relation represented by SEm(i, j), as given in Eq. 2.
$$\mathrm{BT}(t)_{ij}^{t_k} = \frac{\sum_{m=1}^{N_{t_k}} \mathrm{SE}_m(i, j)}{N_{t_k}} \qquad (2)$$
The current trust (CT) value estimated in this model is the trust value of the node
in the time interval between t and t + 1. This proposed trust model from this research
work is to compute the node’s entire trust value based on the fuzzy expert system
approach. In this article, the term current trust (CT) represents the node’s current
trust value, as given in Eq. 3. Another factor, derived using a trust-based threshold value, is creditability. The node that is going to be part of the transmission path is selected based on its creditability value. The creditability value is high when it exceeds the specified threshold value; otherwise, it is considered medium or low. The creditability value changes dynamically based on the factors used to evaluate it.
In this work, current trust (CT) is computed using the mathematical representation
given in Eq. 4.
If n nodes are present in the communication, each having a current trust value, then CT(t) is computed from these n values as shown in Eq. 5.
$$\mathrm{CT}(t)_{ij} = \sum_{k=1}^{n} W_{P_k} \times \mathrm{CT}(t)_{i P_k j} \qquad (5)$$
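A small sketch of the basic-trust average in Eq. (2) and the weighted current-trust combination in Eq. (5); the variable names, the weight normalization, and the sample values are illustrative assumptions.

```python
import numpy as np

def basic_trust(satisfaction_evals):
    # Eq. (2): average of the satisfaction evaluations SE_m(i, j) observed in time slot t_k.
    return float(np.mean(satisfaction_evals))

def current_trust(weights, reported_trusts):
    # Eq. (5): weighted combination of trust values reported by recommender nodes P_k.
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()              # normalization is an assumption
    return float(np.dot(weights, reported_trusts))

bt = basic_trust([0.8, 0.7, 0.9])                          # direct interactions with node j
ct = current_trust([0.5, 0.3, 0.2], [0.6, 0.7, 0.5])       # recommendations about node j
print(bt, ct)
```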
This model estimates the trust value of node i on node j in time interval t + 1, represented as T_ij(t + 1), which is derived from both the basic trust of i on j at time t (BT_ij(t)) and the current trust on j reported to i by other nodes at time t (CT_ij(t)), as shown in Eq. 6.
Fig. 1 Fuzzy membership function representation of the node’s basic trust (BT)
In this research work, the proposed model incorporates Gaussian fuzzifiers for
estimating membership values of the number of packets transmitted by each node
using Eq. 7.
$$\mu_{\text{Trust-value}}(x) = e^{-\frac{(x - c)^2}{2\sigma^2}} \qquad (7)$$
Based on the knowledge of domain experts, the input parameters (low, low medium, medium, and high) as well as the output parameters (low, low medium, medium, and high) are selected. The range of fuzzy values for each linguistic variable of the trust-based parameter is shown in Table 1. The fuzzification process begins with the transformation of the given node-based trust parameters using the function represented in Eq. 7. The fuzzy membership representations of the node's basic and current trust are shown in Figs. 1 and 2, respectively.
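A sketch of the Gaussian membership function in Eq. (7); the centres and widths chosen for the four linguistic labels are placeholders, not the values from Table 1.

```python
import numpy as np

def gaussian_membership(x, c, sigma):
    # Eq. (7): mu(x) = exp(-(x - c)^2 / (2 * sigma^2))
    return np.exp(-((x - c) ** 2) / (2 * sigma ** 2))

# Placeholder centres and widths for the four linguistic labels.
labels = {"low": (0.2, 0.1), "low_medium": (0.4, 0.1),
          "medium": (0.6, 0.1), "high": (0.85, 0.1)}

trust_value = 0.55
memberships = {name: float(gaussian_membership(trust_value, c, s))
               for name, (c, s) in labels.items()}
print(memberships)
```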
The proposed model combines both global and local trust optimization and provides an acceptable and accurate prediction of malicious nodes as well as path recommendation. The environment for this experiment is created using NS2.3.5. The simulation environment considers 25 nodes in an area of 500 × 500 m². Nodes are static, and each node has an equal initial energy of 1 J. The membership values were determined from these values by using the Gaussian fuzzy membership function discussed in Sect. 3.
Fig. 2 Fuzzy membership function representation of the node's current trust (CT)
In the crisp set approach, the minimum threshold value is assumed to be 0.4. If the trust value is greater than the threshold value, it is represented as 0, i.e., a trusted node, and if it is less than the threshold value, it is represented as 1, i.e., an untrusted (malicious) node in the crisp set (Table 2). Even though the crisp set value is exact, it does not convey anything about the range of the trust value. To capture this dynamism of the trust value, a fuzzy expert system is adopted, with values corresponding to low, low medium, medium, and high.
Fuzzy expert system values are more accurate and informative than crisp set values, which do not convey anything about the range of the trust value. With the help of the fuzzy expert system, a trusted path is established for transferring data from source to destination. Hence, the fuzzy expert system-based trust evaluation model gives better, more accurate, and more reliable results than the existing approaches, as depicted in Fig. 3.
Artificial Neural Network-Based ECG
Signal Classification and the Cardiac
Arrhythmia Identification
1 Introduction
The heart triggers minute electrical impulses at the sinoatrial node and spreads them through the heart's conduction system to produce the rhythmic contraction. These impulses are recorded by the ECG instrument by sticking surface electrodes on the skin at various parts of the chest surrounding the cardiac muscle. The electrical tracing of the heart's activity is represented as the ECG waveform, and its spikes and dips indicate the condition of the cardiac muscle. The generation of a normal ECG waveform is shown in Fig. 1. An ECG waveform is a series of positive and negative waves that result from the deflections in each section of the cardiac beat.
The tracing of a typical ECG signal consists of the P wave, QRS complex, and T wave for each cycle of a cardiac beat. The ECG detects the ion transfer through the myocardium, which varies with each heartbeat. The isoelectric line denotes the ECG signal's baseline voltage, traced between the end of the T wave and the beginning of the successive P wave. The P wave is initiated by the heart's upper chambers: it is the first wave generated, produced by the contraction of the upper chambers and followed by a flat straight line as the electrical impulse travels to the lower chambers. The ventricular contraction produces the QRS complex, and the T wave is produced last as the ventricles return to rest. The periodic cycle of the heart's electrical activity is denoted by the sequence P-QRS-T. The normal values of the various ECG waveform components are given in Table 1.
Different data mining and machine learning methods have been formulated for improving the accuracy of ECG arrhythmia detection. Owing to the non-stationary and nonlinear nature of the ECG signal, nonlinear methods are the best candidates for extracting information from it [1]. Because the ANN is a pattern matching method based on mapping nonlinear input–output data, it can be efficiently utilized to detect morphological variations in nonlinear signals such as the ECG [2]. This study proposes the use of a backpropagation neural network trained with the Levenberg–Marquardt (LM) technique for ECG signal classification, with data acquired from the MIT-BIH arrhythmia database.
2 Review of Literature
A few studies have evaluated the performance of neural network systems when they are used for detecting and recognizing abnormal ECG signals [3]. Using neural network systems for analyzing the ECG signal offers several advantages over conventional techniques. The required transformations and clustering operations can be performed by the neural network simultaneously and automatically. The neural network is also capable of
recognizing nonlinear and complex groups in the hyperspace [4]. The ability to produce distinct classification results across various conventional applications gives neural network computational intelligence systems an advantage. However, little work has been dedicated to deriving better parameters for reducing the network size while maintaining good classification accuracy.
An artificial neural network model has been utilized for the prediction of coronary cardiac disease on the basis of risk factors comprising T wave amplitude variation and the ST segment [5]. A two-stage neural network has been adopted for classifying the acquired input ECG waveform into four different types of beats, which helps to improve the diagnostic accuracy [6]. The support vector machine (SVM) is a machine learning algorithm used for classification that performs pattern recognition based on statistical learning theory [7]. The KNN method is an instance-based learning process, a widely used data mining technique for pattern recognition and classification problems [8].
Mode, median, standard deviation, and mean are the first-order probabilistic features, while variance, skewness, and kurtosis are the higher-order probabilistic features [9]. The standard deviation quantifies the amount of dispersion or variation in a set of data values. Kurtosis measures whether the data are flat or peaked relative to the normal distribution; data with a high kurtosis value tend to have a distinct peak near the mean, decline rapidly, and possess heavy tails [10]. Skewness indicates the deviation and asymmetry of a distribution relative to the normal distribution.
(a) Mean:
When a set of values has a sufficiently strong central tendency, it can be characterized by its moments, which are the sums of integer powers of the values. The arithmetic mean of the set of values x_1, ..., x_N is given by the following equation.
\bar{x} = \frac{1}{N} \sum_{j=1}^{N} x_j \quad (1)
(b) Variance:
While the mean describes the location of a distribution, the variance captures the degree or scale to which the distribution is spread apart. The unit of the variance is the square of the unit of the original variable, and the positive square root of the variance is the standard deviation.
\mathrm{Var}(x_1, \ldots, x_N) = \frac{1}{N-1} \sum_{j=1}^{N} (x_j - \bar{x})^2 \quad (2)
(e) Kurtosis:
Kurtosis is defined as the fourth cumulant divided by the square of the second cumulant, which equals the fourth moment about the mean divided by the square of the variance of the statistical distribution, minus 3; this is the so-called excess kurtosis. The conventional expression for the kurtosis is given below.
\mathrm{Kurt}(x_1, \ldots, x_N) = \left\{ \frac{1}{N} \sum_{j=1}^{N} \left( \frac{x_j - \bar{x}}{\sigma} \right)^4 \right\} - 3 \quad (5)
where the (−3) term makes the kurtosis of the normal distribution equal to zero.
In practice, the 3rd moment (skewness) and the 4th moment (kurtosis) must be used with caution. Kurtosis is a non-dimensional quantity that measures the relative flatness or peakedness of a distribution. A distribution with positive kurtosis is called leptokurtic, a distribution with negative kurtosis is called platykurtic, and the middle case is
Fig. 2 Distributions whose 3rd and 4th moments differ significantly from the Gaussian (normal) distribution: a 3rd moment (skewness), b 4th moment (kurtosis)
termed mesokurtic [12]. Figure 2 depicts how distributions with nonzero skewness and excess kurtosis deviate from the Gaussian.
Skewness characterizes the asymmetry of a distribution. A distribution with an asymmetric tail extending toward the right is said to be positively skewed, and one with an asymmetric tail extending toward the left is said to be negatively skewed. In this study, skewness is mainly used to measure and verify the symmetry of the data describing the statistical variable distribution. Kurtosis determines the degree of flatness of the distribution, i.e., whether it is flattened or tapered in comparison with the normal pattern; higher kurtosis values indicate that more values lie far from the average. In this classification study, these observations guide the selection of adequate features to be processed by the ANN classifier.
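The statistical features listed above can be computed directly; the sketch below assumes NumPy/SciPy and a hypothetical heart-rate segment, and uses the excess-kurtosis convention of Eq. 5.

```python
import numpy as np
from scipy import stats

def statistical_features(x):
    """First- and higher-order probabilistic features used as ANN inputs."""
    x = np.asarray(x, dtype=float)
    return {
        "mean": np.mean(x),                 # Eq. 1
        "variance": np.var(x, ddof=1),      # Eq. 2 (N - 1 denominator)
        "std": np.std(x, ddof=1),
        "skewness": stats.skew(x),          # asymmetry of the distribution
        "kurtosis": stats.kurtosis(x),      # excess kurtosis (the "-3" form of Eq. 5)
    }

# Hypothetical beat-interval segment; in the paper the features come from
# MIT-BIH heart rate signals.
segment = np.array([0.82, 0.80, 0.85, 0.79, 0.91, 0.78, 0.84, 0.88])
print(statistical_features(segment))
```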
The proposed approach for classifying ECG cardiac arrhythmias involves ECG signal preprocessing, extraction of the distinguishing statistical and non-statistical features, and finally classification of the cardiac arrhythmias using an artificial neural network, namely the Levenberg–Marquardt backpropagation neural network (LM-BPNN). The schematic diagram for the classification of ECG arrhythmia using ANN is shown in Fig. 3.
Fig. 4 Block diagram of the neural network system in diagnosing cardiac arrhythmia (ANN classification of ECG, categorization of cardiac beats, and detection of arrhythmias)
which is expressed as the sum of squares of nonlinear real-valued functions [13, 14]. The LM algorithm can be considered as a combination of the Gauss–Newton method and the steepest descent algorithm. The LM algorithm is one of the most robust methods compared with the GN algorithm; most importantly, it finds a solution even if it is initialized far from the final minimum. During the iterations, the new weight configuration at the sequential step k + 1 is calculated as follows.
W(k + 1) = W(k) - \left( J^{T} J + \lambda I \right)^{-1} J^{T} \varepsilon(k) \quad (6)
where J denotes the Jacobian matrix, λ the adjustable parameter, and ε the error vector. The λ parameter is modified based on the development of the error function E: if the step reduces E, it is accepted; otherwise, the value of λ is changed, the original weights are restored, and W(k + 1) is recalculated.
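A minimal NumPy sketch of one LM weight update per Eq. 6, applied to a toy least-squares fit; the λ adaptation rule described above (accepting or rejecting steps based on E) is omitted for brevity.

```python
import numpy as np

def lm_step(w, jacobian, errors, lam):
    """One Levenberg-Marquardt update, Eq. 6:
    W(k+1) = W(k) - (J^T J + lambda * I)^(-1) J^T eps(k)."""
    J = np.asarray(jacobian)
    eps = np.asarray(errors)
    A = J.T @ J + lam * np.eye(J.shape[1])
    return w - np.linalg.solve(A, J.T @ eps)

# Toy illustration: fit y = a*x + b to noisy data; the Jacobian of the
# residuals with respect to (a, b) is [x, 1].
x = np.linspace(0, 1, 20)
y = 2.0 * x + 1.0 + 0.05 * np.random.randn(20)
w = np.array([0.0, 0.0])          # initial (a, b)
lam = 0.01                        # fixed here; the paper adapts it from E
for _ in range(20):
    residuals = (w[0] * x + w[1]) - y
    J = np.column_stack([x, np.ones_like(x)])
    w = lm_step(w, J, residuals, lam)
print(w)  # approaches (2.0, 1.0)
```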
(b) Preprocessing of data
Data preprocessing is the primary initial step in developing any model. Columns consisting entirely of zeros are deleted, along with columns containing missing values and columns in which most of the values are zero. The dataset contains 182 columns, of which 12 are categorical and the remaining 170 are numerical. As the next step, 32 rows containing missing values are deleted, and the remaining 37,500 samples are considered for the analysis. The datasets are completely randomized after deleting the unwanted records, so no outliers remain in the processed data. The dataset is partitioned into three parts: 68% training set, 16% validation set, and 16% testing set.
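A pandas sketch of the preprocessing steps just described; the "mostly zero" cut-off and the demo data are assumptions, not values from the paper.

```python
import numpy as np
import pandas as pd

def preprocess(df, seed=0):
    """Column/row pruning, shuffling and a 68/16/16 split as described above."""
    # Drop all-zero columns and columns that are mostly zero.
    df = df.loc[:, (df != 0).any(axis=0)]
    mostly_zero = (df == 0).mean() > 0.9          # assumed cut-off
    df = df.loc[:, ~mostly_zero]
    # Drop rows with missing values, then shuffle.
    df = df.dropna().sample(frac=1.0, random_state=seed).reset_index(drop=True)
    # 68% / 16% / 16% split for training, validation and testing.
    n = len(df)
    train = df.iloc[: int(0.68 * n)]
    val = df.iloc[int(0.68 * n): int(0.84 * n)]
    test = df.iloc[int(0.84 * n):]
    return train, val, test

demo = pd.DataFrame(np.random.rand(100, 5), columns=list("abcde"))
demo["zeros"] = 0.0
train, val, test = preprocess(demo)
print(len(train), len(val), len(test))  # 68 16 16
```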
(c) Classification of arrhythmia
In this study, following are the arrhythmias of ECG beats considered for classification
and it is made into 12 classes.
1. Normal beat
2. Left Bundle Branch Block (LBBB) beat
3. Right Bundle Branch Block (RBBB) beat
4. Atrial Escape (AE) beat
5. Nodal (Junctional) Escape (NE) beat
6. Atrial Premature (AP) beat
7. Premature ventricular contraction (PVC) beat
8. Fusion of ventricular and normal (FVN) beat
9. Ventricular escape (VE) beat
10. Paced beat
11. Supra-ventricular premature beat (SP) beat
12. Nodal (junctional) Premature (NP) beat
For classifying the ECG cardiac arrhythmias with the ANN, the mean, standard deviation, variance, skewness, and kurtosis are used as the input variables, and they are obtained from the heart rate signals. The suitable values for the various cardiac arrhythmias are chosen as provided in Table 1 [15, 16].
(d) Method of Performance Evaluation
The classification performance of the ANN has been evaluated using four performance metrics: sensitivity, specificity, positive predictivity, and accuracy. These metrics are determined using True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) counts [17, 18].
1. True Positive: An instance in which the detected cardiac arrhythmia coincides with the physician's diagnosis.
2. True Negative: An instance in which both the physician and the classifier output indicate the absence of arrhythmia.
3. False Positive: An instance in which the classification system wrongly classifies a healthy ECG as arrhythmia.
4. False Negative: An instance in which the classification system reports a healthy result instead of arrhythmia.
5. Classification Accuracy: It is the ratio of the number of correctly classified signals to the total number of signals and is given by the following equation.
\mathrm{Accuracy} = \frac{TP + TN}{N} \quad (7)
N denotes the total count of inputs.
6. Sensitivity: It denotes the rate of correctly classified positive samples and is also called the True Positive Rate. For a good system, the sensitivity should be high.

\mathrm{Sensitivity} = \frac{TP}{TP + FN} \quad (8)
7. Specificity: It denotes the rate at which negative samples are correctly detected and is also referred to as the True Negative Rate. For a good system, the specificity should be high.

\mathrm{Specificity} = \frac{TN}{TN + FP} \quad (9)
8. Positive Predictivity: It denotes the ratio of the number of correctly detected events (TP) to the total number of events detected by the analyzer.

\mathrm{Positive\ Predictivity} = \frac{TP}{TP + FP} \quad (10)
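A small helper implementing the four metrics from TP/TN/FP/FN counts, using the standard denominators (TP + FN, TN + FP, TP + FP); the counts in the example are hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Per-class metrics built from TP, TN, FP and FN counts (Eqs. 7-10)."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),        # true positive rate
        "specificity": tn / (tn + fp),        # true negative rate
        "positive_predictivity": tp / (tp + fp),
    }

# Hypothetical counts for one arrhythmia class.
print(classification_metrics(tp=180, tn=240, fp=12, fn=8))
```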
The neural network is trained with the backpropagation algorithm, i.e., variable learning rate backpropagation with Levenberg–Marquardt. The neural network training window is shown in Fig. 6, and the neural network fitting function is shown in Fig. 5. The neural network processes 37,500 samples for the training and testing processes. Among these samples, 68% are used for training the network, 16% for testing, and the remaining 16% for validating the network. The results are compared over periodically repeated iterations through the adaptive mechanism and by shuffling the sample values during training. The error histogram is the plot of the error value against the number of instances in which that error occurred. The 20-bin error histogram, with the number of instances on the y-axis and the error value (target − output) on the x-axis, is depicted in Fig. 7. The middle of the histogram has minimum error, and the error value increases as one moves away from the center.
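A sketch of how such a 20-bin error histogram could be produced from target and output arrays; the synthetic data below merely stand in for the network's real outputs.

```python
import numpy as np

# Errors are (target - output) values collected over all samples; here the
# arrays are synthetic stand-ins for the trained network's predictions.
rng = np.random.default_rng(0)
targets = rng.integers(0, 2, size=1000).astype(float)
outputs = targets + 0.05 * rng.standard_normal(1000)   # small prediction error

errors = targets - outputs
counts, bin_edges = np.histogram(errors, bins=20)
for lo, hi, c in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"[{lo:+.3f}, {hi:+.3f}): {c}")   # most instances cluster near zero error
```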
Figure 8 depicts the neural network training regression plot, which shows the relation between the target and the output. In training the NN for classifying the cardiac arrhythmias, 50 iterations are taken to complete the cycle, and the regression window is depicted at the 17th epoch. The neural network training state is shown in Fig. 9, which shows the relation between the gradient and the epochs at the 14th iteration. The best validation performance is obtained at the 17th epoch with a value of 0.0016349, and its window is shown in Fig. 10. The output response of the classification has also been analyzed, with each element characterized by its target, output, and error with respect to time; its window is shown in Fig. 11. The output and the target are related by the regression value: the regression plot indicates how closely the output matches the target values. The network output has a strong linear relationship with the desired targets if the regression coefficient is close to unity, whereas if the regression coefficient approaches zero, the output cannot be predicted from the target through a linear relation.
The performance plot is denoted as the plot across the mean square error and the
total count of epochs. Mean square error is denoted as the squared average difference
between the output data and the target data. An MSE of zero indicates that there is no error. As the training progresses, the error is reduced; when the mean square error reaches its minimum, the training is stopped and the network is validated with the validation samples. In the validation phase, if the behavior of the network is identified properly, the training ends and the network is ready for the testing process. The LM method shows better performance than the other methods in terms of the computed MSE. Table 2 presents the classification of the 12 different types of arrhythmias along with their accuracy, sensitivity, positive predictivity, and specificity values.
The error plot is obtained from the error values acquired during the training, validation, and testing stages using the individual cycle data. The data partitioning combined with the statistical feature selection technique attains high performance with minimum error. Hence,
in the 20-bin histogram plot, the peak lies in the middle region at an error of 0.03687, and the minimum error is held only in this middle region. It also indicates that moving away from the middle region of the error histogram results in larger errors. This shows that the ANN system classifies with high accuracy.
The regression plot relating the output and the target is shown in Fig. 8. With regression values R of 0.99387, 0.99431, and 0.9951 for training, validation, and testing, respectively, the output is almost linear with a coefficient near unity. This linearity indicates the best realization of the relationship between the training, validation, and testing values; as a whole, the processed data attain a regression value of 0.99412. The neural network training state at the 17th epoch is shown in Fig. 9, which indicates that the training process is highly efficient and results in a gradient value of 0.00052892. Evaluating this performance confirms the accuracy of the ANN system. The relationship between the MSE and the number of iterations results in the best validation performance of 0.0014894 at the 17th iteration. The best realization of the error analysis is obtained from the training, validation, and testing data. The
response characteristics of the output element are also shown in Fig. 11, depicting its error, target, and output responses.
From the above classification results, it can be inferred that the highest accuracy of 98.8% is obtained for classifying normal beats, the highest sensitivity of 97.64% for the fusion of ventricular and normal (FVN) beat, the highest specificity of 96.68% for the left bundle branch block (LBBB) beat, and the highest positive predictivity of 97.63% for the fusion of ventricular and normal (FVN) beat.
5 Conclusion
and False Negative (FN). The experimental results show that the classification accuracy ranges from 91.18 to 98.8% for the 12 classes of ECG arrhythmias.
Table 2 Probabilistic results of classifying ECG signals by LVQ NN showing the performance metrics of the 12 classes of arrhythmias

S. No. | ECG arrhythmia beat | Accuracy (%) | Sensitivity (%) | Specificity (%) | Positive predictivity (%)
1 | Normal beat | 98.8 | 96.48 | 55.48 | 96.55
2 | Left bundle branch block (LBBB) beat | 94.48 | 93.34 | 96.68 | 93.47
3 | Right bundle branch block (RBBB) beat | 92.68 | 90.84 | 92.24 | 90.89
4 | Atrial escape (AE) beat | 91.25 | 90.97 | 91.18 | 90.92
5 | Nodal (junctional) escape (NE) beat | 95.62 | 91.68 | 39.14 | 91.67
6 | Atrial premature (AP) beat | 96.24 | 91.43 | 84.49 | 91.45
7 | Premature ventricular contraction (PVC) beat | 94.64 | 95.45 | 85.68 | 95.41
8 | Fusion of ventricular and normal (FVN) beat | 95.54 | 97.64 | 92.46 | 97.63
9 | Ventricular escape (VE) beat | 91.18 | 92.21 | 95.57 | 92.23
10 | Paced beat | 97.68 | 97.04 | 91.28 | 97.09
11 | Supra-ventricular premature (SP) beat | 94.14 | 96.67 | 85.66 | 96.68
12 | Nodal (junctional) premature (NP) beat | 96.66 | 90.01 | 78.24 | 90.09
References
1. Turakhia MP, Hoang DD, Zimetbaum P et al. (2013) Diagnostic utility of a novel leadless
arrhythmia monitoring device. Am J Cardiol 112(4):520–524
2. Perez de Isla L, Lennie V, Quezada M et al (2011) New generation dynamic, wireless and
remote cardiac monitorization platform: a feasibility study. Int J Cardiol 153(1):83–85
3. Olmos C, Franco E, Suárez-Barrientos A et al (2014) Wearable wireless remote moni-
toring system: An alternative for prolonged electrocardiographic monitoring. Int J Cardiol
1(172):e43–e44
4. Huang C, Ye S, Chen H et al (2011) A novel method for detection of the transition between
atrial fibrillation and sinus rhythm. IEEE Trans Biomed Eng 58(4):1113–1119
5. Niranjana Murthy H, Meenakshi M (2013) ANN model to predict coronary heart disease based
on risk factors. Bonfring Int J Man Mach Interface 3(2):13–18
6. Ceylan R, Özbay Y (2007) Comparison of FCM, PCA and WT techniques for classification
ECG arrhythmias using artificial neural network. Expert Syst Appl 33(2):286–295
7. Dubey V, Richariya V (2013) A neural network approach for ECG classification. Int J Emerg
Technol Adv Eng 3
8. Zadeh AE, Khazaee A, Ranaee V (2010) Classification of the electrocardiogram signals using
supervised classifiers and efficient features. Comput Methods Prog Biomed 99(2):179–194
9. Jadhav SM, Nalbalwar SL, Ghatol AA (2010) ECG arrhythmia classification using modular
neural network model. In: IEEE EMBS conference on biomedical engineering and sciences
10. Sreedevi G, Anuradha B (2017) ECG feature extraction and parameter evaluation for detection of heart arrhythmias. i-manager's J Dig Signal Process 5(1):29–38
11. Acharya UR, Subbanna Bhat P, Iyengar SS, Rao A, Dua S (2003) Classification of heart rate using artificial neural network and fuzzy equivalence relation. Pattern Recognit 36:61–68
12. Kannathal N, Puthusserypady SK, Choo Min L, Acharya UR, Laxminarayan S (2005) Cardiac
state diagnosis using adaptive neuro-fuzzy technique. In: Proceedings of the IEEE engineering
in medicine and biology, 27th annual conference Shanghai, China, 1–4 Sept 2005
13. Acarya R, Kumar A, Bhat PS, Lim CM, Iyengar SS, Kannathal N, Krishnan SM (2004)
Classification of cardiac abnormalities using heart rate signals. Med Biol Eng Comput
42:288–293
14. Shah Atman P, Rubin SA (2007) Errors in the computerized electrocardiogram interpretation
of cardiac rhythm. J Electrocardiol 40(5):385–390
15. Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple
way to prevent neural networks from overfitting. J Mach Learn Res 15(1): 1929–1958
16. Turakhia MP, Hoang DD, Zimetbaum P, Miller JD, Froelicher VF, Kumar UN, Xu X, Yang
F, Heidenreich PA (2013) Diagnostic utility of a novel leadless arrhythmia monitoring device.
Am J Cardiol 112(4):520–524
17. Xiong W, Droppo J, Huang X, Seide F, Seltzer M, Stolcke A, Yu D, Zweig G (2016) Achieving
human parity in conversational speech recognition. arXiv preprint arXiv:1610.05256
18. Melo SL, Caloba LP, Nadal J (2000) Arrhythmia analysis using artificial neural network and
decimated electrocardiographic data. In: Computers in cardiology 2000, pp. 73–76. IEEE
CDS-Based Routing in MANET Using Q
Learning with Extended Episodic Length
1 Introduction
Routing in MANETs is challenging due to the dynamic nature of the network. The routing information in a MANET needs to be updated at regular intervals due to the frequently changing topology.
Connected dominating set (CDS) is a resilient technique used for the formation of a
backbone in the MANET. CDS is a graph theory concept in which every node in the
graph either will be in the dominating set of the graph or will be a one-hop neighbour
to the dominating node. The concept of CDS is used in MANET routing, as the CDS
will act as a backbone for all communications. MANET routing is usually done by broadcasting, and hence, if a node transmits data, all the neighbouring nodes receive that message. By routing all communications through the CDS, this flooding can be avoided.
Most CDS construction techniques follow a centralized approach, which needs node and edge weight information for all the nodes in the graph. Centralized CDS formation approaches are hard to implement in a MANET scenario, as they need information about the entire MANET. The CDS is constructed using nodes with better network metrics such as link stability and residual energy. Although these network parameters are considered, the construction of the CDS in a MANET is done by greedy approaches, which are easy to implement but might result in a less efficient CDS with an increased number of CDS nodes.
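For illustration, a generic greedy CDS construction on a toy topology (using networkx); this is the conventional greedy baseline discussed here, not the Q-learning-based scheme proposed later.

```python
import networkx as nx

def greedy_cds(graph):
    """Naive greedy connected-dominating-set sketch: repeatedly add the
    neighbour of the current CDS that covers the most uncovered nodes."""
    start = max(graph.nodes, key=graph.degree)        # highest-degree seed
    cds = {start}
    covered = {start} | set(graph.neighbors(start))
    while covered != set(graph.nodes):
        # candidates adjacent to the CDS keep the set connected
        candidates = {n for c in cds for n in graph.neighbors(c)} - cds
        best = max(candidates,
                   key=lambda n: len(set(graph.neighbors(n)) - covered))
        cds.add(best)
        covered |= {best} | set(graph.neighbors(best))
    return cds

g = nx.connected_watts_strogatz_graph(25, 4, 0.3, seed=1)   # toy topology
print(greedy_cds(g))
```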
2 Literature Survey
A Q learning algorithm for MANET routing was proposed in [2], which uses
residual energy, node mobility and link quality for finding the efficient intermediate
nodes in a data transfer. These parameters are used in calculating the Q values of
the nodes. The nodes in the MANET are bootstrapped with these parameters before
the learning begins. This algorithm suffers a setback when the size of the MANET increases: it requires the nodes to know the hop count between any pair of nodes, and hence topological information must be obtained from all the nodes in the MANET, which is challenging in large networks. A performance evaluation of machine learning-based routing algorithms for MANETs was presented in [3].
A Partially Observable Markov Decision Process (POMDP) was modelled for the entire MANET scenario in [4]. Here the MANET is considered a partially observable entity, since nodes cannot obtain topological information for the entire network. The nodes are treated as agents, and packet delivery operations such as unicasting, multicasting, and broadcasting are considered as actions. The nodes interact with the environment by sending packets into the network. The status of a node during packet transfer, such as packet sent, packet received, or packet dropped, is considered as the state of the node. If a node successfully transfers a packet to its neighbours, it receives a reward value, and if the transmission fails, it receives a penalty. The reward/penalty model accumulates the value of a node.
A MANET path discovery scheme using the on-policy first-visit Monte Carlo method was proposed in [5]. This algorithm combines a ticket-based path discovery method with reinforcement learning: the nodes send probing tickets towards the destination node and learn about the intermediate nodes and their available resources, and each node maintains a table of routes to the other nodes in the MANET. Though the algorithm minimizes resource utilization, it is hard for energy-constrained nodes to probe routes in a dynamic environment, and the algorithm also suffers a setback when the size of the MANET grows.
In [6], an algorithm was built to construct a connected dominating set using nodes that have a high packet delivery ratio and residual energy. This algorithm, called RDBR, follows the conventional beacon method for exchanging network parameter information such as packet delivery ratio and residual energy. Machine learning algorithms that can learn under partially observable conditions, as in a MANET, are not used in that work.
Several algorithms with a centralized CDS formation approach and optimizations were proposed in [7] and [8], which form a maximal independent set (MIS) and then connect the MIS nodes to each other to form a CDS. Though these algorithms compute a minimum CDS, they follow a centralized approach and are only suitable for stable networks. They have high time complexity and may not be suitable for dynamic networks, where observing the topological changes of the entire network is not possible within the stipulated time.
In [9], a CDS reconstruction algorithm was proposed to reconstruct the CDS locally with minimum changes by exchanging tiny control message packets. The algorithm reduces the control overhead considerably, and hence the performance of the MANET is increased. This work focuses mostly on the reconstruction of the CDS
and contributes less to the initial establishment of CDS. Adopting CDS construc-
tion strategies based on network performance metrics will further improve the
performance of this algorithm [9].
A CDS construction based on node Q values estimated using Q learning was proposed in [10]. In this algorithm, the CDS is constructed in a greedy fashion using the Q values of the nodes in the MANET. The algorithm suffers from this greedy approach, as a CDS node might have only neighbours with low Q values, in which case the algorithm has no option but to add one of the low Q-valued nodes to the CDS.
From the literature survey, it can be inferred that many MANET routing algorithms are formulated using reinforcement learning techniques, each with its own pros and cons. Existing CDS constructions using reinforcement learning rely on a greedy approach. Therefore, a Q learning-based CDS construction algorithm for MANETs is proposed, and to avoid the greedy nature of the algorithm, the length of the reinforcement learning episode is extended from one hop to two hops.
This paper aims to achieve an efficient route between the source and the destination by proposing an algorithm to construct the CDS using Q learning with an extended episode length. Nodes in the MANET interact with their neighbours by sending messages to them. Each successful transaction earns a node a reward, and every failed transaction incurs a penalty. Every node in the MANET develops cumulative Q values of its neighbours by sending messages at various points of time. This scenario of a node sending a message and calculating the reward/penalty is called an episode.
In a MANET, a node can assess network parameters such as link quality and residual energy only for its one-hop neighbours. In conventional routing techniques, collecting parameter values beyond one-hop neighbours needs extra control message exchanges. In the proposed algorithm, the Q value of a node is estimated using its signal stability, its residual energy, and the Q value of its best neighbour. Hence, the Q value of any node reflects its own quality as well as the quality of its best neighbour. When this Q value is used for CDS construction, nodes with both higher Q values and high-quality neighbours are selected as CDS members. Through this, the visibility of a node is increased from one hop to two hops, and the obtained CDS is more efficient and stable. The conceptual diagram and workflow of the proposed extended Q CDS algorithm are shown in Fig. 2.
During the CDS exploration process, the algorithm always adds the immediate next node with the maximum Q value as the next CDS node. This technique is greedy and sometimes results in a longer and inefficient CDS. Figure 3 illustrates this scenario, where the greedy approach constructs a sub-optimal solution. For illustrative purposes, the initial Q values are assumed according to the residual energy and signal stability ratio of the individual nodes in the network. The Q values are learned and estimated during the exploration phase and are updated during the exploitation phase based on signal stability, residual energy, and the assigned reward/penalty values. Here node n2 chooses node n5 as its next CDS node because it has the highest Q value. After including node n5 in the CDS, the only possible next
Fig. 3 MANET with a CDS comprising the nodes n2, n5, n3, n7, n11 and n10
CDS node is n3, which has a very low Q value. Constructing the CDS through this path results only in a sub-optimal CDS, which may require frequent re-computation. Moreover, the number of nodes in the CDS is also increased by this technique.
To solve the above issue, it is prudent to increase the episode length in the RL learning phase. Nodes learn about their one-hop neighbouring nodes through interaction; in this technique, a node will also learn about the neighbours of its one-hop neighbours. When a node sends a message to a neighbour node, the receiving node acknowledges the message and appends the highest Q value among all of its own one-hop neighbours. In the example scenario shown in Fig. 4, node n2 sends a message to node n7, and node n7 acknowledges it along with the highest Q value among n7's one-hop neighbours, which in this case belongs to n11. Node n2 then incorporates the Q value of node n11 into the Q value of node n7. In this way, nodes that have both a high Q value and high-quality neighbour nodes are selected to form the CDS. The obtained CDS is found to be optimal in terms of CDS size and versatile in terms of CDS lifetime.
Fig. 4 MANET with a CDS comprising the nodes n2, n7, n11 and n10
The Q values of the nodes are calculated by sending beacon packets to their neighbours and verifying the status of the packet delivery. The node sending the beacon packet is called the assessor node, and the node receiving the packet is called the assessed node. The assessor node sets a reward of one for every assessed node that responds to the initial beacon with an acknowledgement. This reward value is further weighted by the learning rate parameter, calculated from the residual energy and the signal stability obtained from the received acknowledgement. The Q values calculated for every packet transmission are accumulated to form a single cumulative Q value for the particular node.
The Q value of the nodes will be estimated using the generic Q learning formula
personalized by learning rate and decay constant.
Q(S, a_t) \leftarrow Q(S, a_t) + \rho \left[ R_{t+1} + \max_{S \in \{D, F\}} \left( \beta \cdot Q(S, a_{t+1}) \right) - Q(S, a_t) \right] \quad (1)
In Eq. (1), 'S' refers to the set of states a node can be in at any point of time, and 'a_t' refers to the action taken at time 't' that can change the state of the node from one state to another.
S = {D, F}, where the terms D and F refer to the states 'Delivered' and 'Failed', respectively.
a_t = {T, B}, where the terms T and B refer to the actions 'Transmit' and 'Buffer', respectively. When the node transmits the data, the action is referred to as 'Transmit'. If the node stalls the transmission or the transmission fails, the data remain in the buffer and the action is referred to as 'Buffer'.
If a node successfully delivers the packet to its neighbour, the state of the node is Delivered; if the packet is not delivered, the node is in the Failed state.
Q(S, a_t) refers to the existing Q value of the node.
max_{S∈{D,F}} (β · Q(S, a_{t+1})) refers to the policy of selecting the action that yields the maximum reward; in our case, 'D' is the desired state, which can be obtained by taking the action of data transmission. If the data transmission fails, the node attracts a negative reward.
R_{t+1} refers to the reward allotted for taking an action.
The learning rate ρ and the decay factor β are elaborated in the following sections.
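A minimal sketch of the Q update assumed in Eq. (1), with states {D, F}, actions {T, B}, learning rate ρ, and decay β; since the printed equation is not fully legible, the generic Q-learning form is used, with the maximisation taken over the states as in the paper's notation.

```python
def q_update(q, state, action, reward, next_action, rho, beta):
    """One step of the generic Q-learning rule assumed for Eq. (1): the
    existing Q value is nudged towards the reward plus the decayed best
    Q value reachable at the next step (max over the states D/F)."""
    best_next = max(beta * q[(s, next_action)] for s in ("D", "F"))
    q[(state, action)] += rho * (reward + best_next - q[(state, action)])
    return q

# States: 'D' (Delivered), 'F' (Failed); actions: 'T' (Transmit), 'B' (Buffer).
q_table = {(s, a): 0.0 for s in ("D", "F") for a in ("T", "B")}
# A successful transmission earns a reward of +1; a failure would use a penalty.
q_table = q_update(q_table, state="D", action="T", reward=1.0,
                   next_action="T", rho=0.5, beta=0.9)
print(q_table)
```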
The decay parameter is used to estimate the staleness of the learned Q value in Q learning. The Q value is reduced as time passes because an estimation, once made, might change over a period of time. In the proposed algorithm, the residual energy of the nodes is considered as the decay parameter. The energy levels of the nodes reduce as they take part in data transfer: the more a node is involved in data transfer, the more it drains its residual energy. The decay factor calculated using the residual energy adds a standard deduction to the node's Q value whenever the node transmits data. When a CDS node's Q value goes below the threshold, the CDS re-computation is triggered. The decay factor is represented by the symbol 'β'.
\beta = r_0 \, e^{-0.001\, t_r} \quad (2)
The learning rate is a parameter that controls the algorithm's speed of learning. When a node receives acknowledgements from more than one neighbour through beaconing, not all of them can be assigned the same Q value; Q learning has to ensure that the best neighbour receives the highest Q value. The proposed algorithm identifies the best neighbour node by using the residual energy of the node and the signal stability between the nodes as learning rate parameters. Through the learning rate, the node with better residual energy and link stability receives a higher Q value than the other nodes.
Estimation of the Link Stability Learning Rate ρ_ss. The signal stability between two nodes is estimated based on the bit error rate (BER).

\rho_{ss} = LS_{ij} \propto \frac{1}{BER_{ij}} \quad (3)

LS_{ij} = k \cdot \frac{1}{BER_{ij}} \quad (4)

where k is the proportionality constant, which can be assigned an integer value, and LS_{ij} is the link stability between nodes i and j.
Estimation of the Residual Energy Learning Rate ρ_re. The residual energy of the node is evaluated using the initial energy of the node and the energy consumed.
\rho_{re} = 1 - E_c \quad (5)

E_c = E_t + E_r + E_p \quad (6)

where E_c is the energy consumed, E_t the energy spent to transmit a packet, E_r the energy spent to receive a packet, and E_p the energy spent to process a packet.

\rho_{re} = E_{AR} + E_{PR} \quad (7)
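A small sketch combining the decay factor of Eq. 2 with the two learning-rate components of Eqs. 3-7; the interpretation of r0 and t, the BER value, and the energy fractions are assumptions, not values from the paper.

```python
import math

def link_stability_rate(ber, k=1.0):
    """Eqs. 3-4: link stability (and hence rho_ss) is proportional to 1/BER."""
    return k / ber

def residual_energy_rate(e_tx, e_rx, e_proc):
    """Eqs. 5-6: rho_re = 1 - E_c, with E_c = E_t + E_r + E_p (energies
    expressed here as fractions of the node's initial energy)."""
    return 1.0 - (e_tx + e_rx + e_proc)

def decay_factor(r0, t):
    """Eq. 2: beta = r0 * exp(-0.001 * t); r0 is taken here as the node's
    residual-energy ratio and t as the elapsed time (both assumptions)."""
    return r0 * math.exp(-0.001 * t)

# Hypothetical per-node values.
rho_ss = link_stability_rate(ber=1e-3, k=1e-4)     # scaled into [0, 1]
rho_re = residual_energy_rate(0.02, 0.01, 0.005)
beta = decay_factor(r0=rho_re, t=120)
print(rho_ss, rho_re, beta)
```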
In the extended Q learning algorithm, the Q value of a node is computed from its own Q value as well as the Q value of its best neighbour. Hence,
Q_x(S, a_t) = w_1 \, Q_i(S, a_t) + w_2 \max_{j=1,\ldots,n} Q_j(S, a_t) \quad (9)

where Q_x(S, a_t) is the extended Q value of node 'i' incorporating the Q value of its neighbour, Q_i(S, a_t) is the Q value of node 'i' estimated through Eq. (1), and max_{j=1,...,n} Q_j(S, a_t) is the highest Q value found among the neighbouring nodes of node 'i'. The direct Q value of the node and the maximum Q value of the neighbour nodes are given the weightages w_1 and w_2, set to 0.6 and 0.4, respectively, with w_1 > w_2, so that the Q value of any node reflects its own quality as well as that of its best one-hop neighbour.
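A one-line helper for the extended Q value of Eq. 9 with the stated weights; the neighbour Q values in the example are hypothetical.

```python
def extended_q(q_own, neighbour_qs, w1=0.6, w2=0.4):
    """Eq. 9: blend a node's own Q value with the best Q value among its
    one-hop neighbours (w1 > w2)."""
    return w1 * q_own + w2 * max(neighbour_qs)

# Node n7's own Q value plus the Q values reported by its neighbours
# (hypothetical numbers).
print(extended_q(q_own=0.70, neighbour_qs=[0.55, 0.92, 0.61]))  # 0.788
```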
The process of deducing the CDS using the estimated Q values of the nodes is called the exploration process. CDS exploration happens only during the initial phase of CDS establishment and when the Q value of any CDS node goes below the threshold value. During the exploration process, the node that initiates the CDS construction selects the neighbour node with the highest extended Q value as the next CDS node, and all its other one-hop neighbour nodes are declared as covered nodes. This incremental addition of the nodes with the highest extended Q values to the CDS continues until all the nodes in the MANET are covered by the CDS. Once the CDS is established, all communications go through this backbone, and the process is called exploitation. During the exploitation process, the Q value is recalculated on every transaction, and if any CDS node's Q value goes below the threshold, the CDS exploration process is triggered again.
Step 1: Initialize the MANET by placing nodes randomly with equal energy and
specified terrain dimension.
Step 2: Bootstrap the nodes with distance, BER and residual energy.
Step 3: Estimate signal stability learning rate using BER and distance.
Step 4: Estimate residual energy learning rate.
Step 5: Estimate the overall learning rate using signal stability learning rate and
residual energy learning rate.
Step 6: Assign reward and penalty values for nodes based on packet transitions.
Step 7: Calculate Q value of the neighbouring nodes and incorporate the Q value of
the two hop nodes obtained from neighbouring nodes.
Step 8: Explore the next best neighbour based on highest Q value and include it in
the CDS. All the immediate neighbours will act as covered nodes.
Step 9: Repeat step 8 to form CDS until all nodes in the network are covered. Each
and every node will update its Q value table about their neighbours.
Step 10: If the Q value of any one of the nodes decays below the threshold then
reinitiate exploration again.
Figure 5 illustrates the flowchart of the extended Q CDS.
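A compact sketch of the exploration phase (Steps 7-9): repeatedly add the frontier node with the highest extended Q value until every node is covered. The adjacency list and Q values are hypothetical, loosely following the Fig. 4 example.

```python
def explore_cds(neighbours, extended_q):
    """Starting from the node with the highest extended Q value, repeatedly
    add the neighbouring node with the highest extended Q value to the CDS
    until every node is covered."""
    nodes = set(neighbours)
    start = max(nodes, key=extended_q.get)
    cds, covered = [start], {start} | set(neighbours[start])
    while covered != nodes:
        frontier = {n for c in cds for n in neighbours[c] if n not in cds}
        nxt = max(frontier, key=extended_q.get)
        cds.append(nxt)
        covered |= {nxt} | set(neighbours[nxt])
    return cds

# Toy topology and extended Q values (hypothetical).
adj = {"n2": ["n5", "n7"], "n5": ["n2", "n3"], "n3": ["n5", "n7"],
       "n7": ["n2", "n3", "n11"], "n11": ["n7", "n10"], "n10": ["n11"]}
q = {"n2": 0.9, "n5": 0.55, "n3": 0.2, "n7": 0.82, "n11": 0.85, "n10": 0.6}
print(explore_cds(adj, q))   # ['n2', 'n7', 'n11'] covers every node
```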
The extended Q CDS is implemented using NS2, and the simulation parameters are provided in Table 1. Figures 6 and 7 show screenshots of the NS2 simulation of the algorithm. The experiments have been carried out for different seed values, and the average is used for the results and analysis.
The algorithm is evaluated by varying the number of nodes, and metrics such as packet delivery ratio, end-to-end delay, residual energy, and size of the CDS are measured.
Table 1 Simulation parameters in NS2

Parameter | Value
Number of nodes | 100
Speed | Up to 20 m/s
Mobility model | Random waypoint
Node placement | Random
Initial energy of the node | 400 J
Simulation area | 1000 m × 1000 m
Simulation time | 10 min
The extended Q CDS algorithm is compared with the reliable CDS (RCDS) [4], the cognitive CDS (CDSCR) [2], and Q CDS [10], and it performs considerably better.
Figure 8 illustrates that the extended QCDS performs better than the other algorithms with respect to packet delivery ratio. The QCDS and extended QCDS algorithms construct almost the same CDS when the number of nodes is small, but the extended QCDS gains an advantage as the number of nodes increases.
Figures 8-11 plot the measured metrics (including packet delivery ratio and control overhead) against the number of nodes for RDBR, CDSCR, QCDS, and Ex QCDS.
5 Conclusion
References
4. Nurmi P (2007) Reinforcement learning for routing in ad hoc networks. In: 2007 5th interna-
tional symposium on modeling and optimization in mobile, ad hoc and wireless networks and
workshops
5. Usaha W, Barria J (2004) A reinforcement learning ticket-based probing path discovery scheme
for MANETs. AdHoc Netw
6. Preetha K, Unnikrishnan (2017) Enhanced domination set based routing in mobile ad hoc networks with reliable nodes. Comput Electr Eng 64:595–604
7. Tran TN, Nguyen T-V, An B (2019) An efficient connected dominating set clustering based
routing protocol with dynamic channel selection in cognitive mobile ad hoc networks. Comput
Electr Eng
8. Hedar AR, Ismail R, El-Sayed GA, Khayyat KMJ (2018) Two meta-heuristics designed to solve
the minimum connected dominating set problem for wireless networks design and management.
J Netw Syst Manage 27(3):647–687
9. Smys S, Bala GJ, Raj JS (2010) Self-organizing hierarchical structure for wireless networks. In: 2010 international conference on advances in computer engineering. https://doi.org/10.1109/ace
10. John Deva Prasanna DS, John Aravindhar D, Sivasankar P (2019) Reinforcement learning
based virtual backbone construction in Manet using connected dominating sets. J Crit Rev
A Graphical User Interface Based Heart
Rate Monitoring Process and Detection
of PQRST Peaks from ECG Signal
the current work proposes a very dependable and simple method for detecting the P, Q, R, S, and T peak values of an ECG waveform. The technique is based on determining the mathematical relationship between the ECG signal's peak values and the time sequence. The methodology focuses on designing a graphical user interface (GUI) in MATLAB for exhibiting the detection of the PQRST peaks and plotting these peak values on the ECG waveform at their respective time instants. Such ECG signal processing techniques are intended to aid scientific research rather than to provide a medical diagnosis.
1 Introduction
The ECG, or electrocardiogram, is one of the simplest and most accurate tests for estimating the heart's clinical condition. ECG analysis is the most widely used technique since it is capable of screening different abnormalities of the heart. ECG tools and devices used for measuring the heart rate and plotting the waveform based on the heartbeat frequency are commonly available in medical centers, and they are inexpensive and risk-free with respect to the patient. From the tracing of the ECG waveform, the following information can be identified [1]:
1. Heart rhythm
2. Heart rate
3. Thickening of cardiac muscle
4. Any physical symptoms for the possibilities of heart attack
5. Detection and the estimation of coronary artery disease
6. Detection of conduction abnormal state (an abnormal state with which the
spreading of electrical impulse results across the cardiac muscle).
The above-mentioned traits are among the most significant degradation factors affecting cardiac functionality. The results of ECG signal tracing allow doctors to determine the exact clinical condition of the heart, and researchers in this field work on enhancing non-invasive detection techniques for estimating cardiac abnormalities. Additional ECG testing is normally carried out to determine serious problems and to guide the treatment; in hospitals, ECG-related tests most commonly include the stress test and cardiac catheterization. Subjects who present with severe chest pain, increased blood pressure, variation in the velocity of blood flow through the arteries and veins, or unbalanced cholesterol levels in the blood are candidates for the ECG diagnostic test to check whether there is any thickening of the valves of the cardiac muscle, which causes arrhythmia or another abnormal condition. It is easy to identify, through the variations in the
electrical flow of the ECG signal pattern, the reason for the abnormal condition of the heart. Hence, for every cardiac cycle, the ECG is the graphical pattern of the bioelectrical signal generated within the human body [2]. Relevant and useful information can be acquired from the ECG chart or graphical waveform, which relates to the heart's function through the waves and the baseline denoting the voltage variations of the cardiac muscle at every instant of time [3]. An ECG is essential, possesses adequate amplitude values at every instant of time [4], and aids in the determination of the following clinical conditions after diagnosis:
1. Heart attack (myocardial infarction)
2. Electrolytic transformations
3. Ventricular and auricular hypertrophy
4. Pericarditis
5. Cardiac arrhythmias
6. Effects of medicines on the cardiac muscle, specifically quinidine and digitalis
7. Abnormal blood pressure and inadequate velocity of blood flow.
This study proposes a prototype software package developed mainly for scientific and technological research on medical diagnosis in a clinical setting. In this proposed study, MATLAB is used to develop the software package and design the graphical user interface for analyzing the P, Q, R, S, and T peak values of an ECG signal. All the peak value parameters are computed from the recordings of the ECG signal, which can be acquired as input data in binary, text-formatted, or simple Excel files. The structure of this paper is as follows. The second section presents the literature review on ECG feature extraction techniques. The third section summarizes the nature of the continuous ECG signal patterns and the placement of the leads. The fourth section describes the developed software technique that analyzes the ECG signal to obtain the P, Q, R, S, and T peak amplitudes. The fifth section focuses on the results and discussion obtained from the simulations, and the sixth section ends with the conclusion and the future scope of work.
The recording of the continuous ECG signal plays a vital role in the initial stage of diagnosing cardiac disease; the signal is later processed and analyzed with the assistance of a signal conditioning device. Even though ECG analysis is based on determining heartbeats and heart rates, abnormalities in heart functioning result from physical changes in the position and size of the chambers or from the continuous consumption of drugs prescribed by physicians. Acquiring the ECG signals from
method for performing the classification of ECG images on the basis of extracted features. The features are extracted by the wavelet decomposition method from the intensity of the ECG images and then processed further using artificial neural networks; the essential features are the median, mean, maxima, minima, standard deviation, mean absolute deviation, and variance [16]. Another technique detects the PQRST peaks using the derivative of the ECG wave component by searching for its maxima and minima: the R peak, the highest peak, must lie at the zero crossing between the minimum and the maximum of the derivative; the Q peak must exist before the zero crossing preceding the maximum, and the S peak relies on the zero crossing after the minimum.
The P and T peaks are detected similarly by focusing on the local maxima in the original signal and then utilizing the derivative to identify the peak and end points [17]. This proposed study presents a dependable and simple methodology for detecting the P peak, QRS complex, and T peak values of an ECG signal. The method proceeds by determining the mathematical relationship between the maximum peak and valley values of the ECG signal with respect to time. In this study, the GUI has been designed using the MATLAB software for the detection of the PQRST peaks through a simple mathematical algorithm that acquires the PQRST waveform and plots these values over the ECG signal with respect to time. In addition, a denoising process has been applied to extract a noise-free signal.
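A rough Python sketch of extremum-based PQRST localisation in the spirit of the methods above (the paper's own implementation is a MATLAB GUI); the detection threshold and search-window lengths are assumptions, and scipy.signal.find_peaks is used for the R peaks.

```python
import numpy as np
from scipy.signal import find_peaks

def detect_pqrst(ecg, fs):
    """R peaks are taken as dominant maxima; Q and S as nearest minima
    before/after each R; P and T as local maxima in assumed windows."""
    r_peaks, _ = find_peaks(ecg, height=np.mean(ecg) + 2 * np.std(ecg),
                            distance=int(0.4 * fs))
    beats = []
    for r in r_peaks:
        q_start = max(r - int(0.05 * fs), 0)
        q = q_start + int(np.argmin(ecg[q_start:r + 1]))
        s_end = min(r + int(0.05 * fs), len(ecg))
        s = r + int(np.argmin(ecg[r:s_end]))
        p_start = max(q - int(0.20 * fs), 0)
        t_end = min(s + int(0.40 * fs), len(ecg))
        p = p_start + int(np.argmax(ecg[p_start:q])) if q > p_start else None
        t = s + int(np.argmax(ecg[s:t_end])) if t_end > s else None
        beats.append({"P": p, "Q": q, "R": int(r), "S": s, "T": t})
    return beats

# Tiny synthetic demonstration: a few Gaussian "R spikes" on a flat baseline.
fs = 360
t = np.arange(0, 3, 1 / fs)
ecg = sum(np.exp(-((t - c) ** 2) / (2 * 0.01 ** 2)) for c in (0.5, 1.3, 2.1))
print(detect_pqrst(ecg, fs))
```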
An ECG signal is a waveform plot, traditionally printed as a paper trace, that records the electrical impulse activity of the human cardiac muscle. The normal ECG signal consists of a series of negative and positive waveform cycles, namely the P wave, the QRS complex, and the T wave. The existence of the
The P wave amplitude and the QRS complex exhibit linear relationships that help distinguish different irregularities of the cardiac muscle. The typical ECG waveform is depicted in Fig. 2, in which the P wave, the initial upward deflection, denotes the state of atrial depolarization. The QRS complex comprises three peaks, namely the Q, R, and S peaks, which represent the state of ventricular depolarization, while the T peak of the waveform corresponds to ventricular repolarization and marks the termination of ventricular systole [18].
In the typical ECG waveform depicted in Fig. 2, the horizontal axis of the plot denotes time, whereas the vertical axis denotes the depth and height of the wave, whose amplitude is measured in volts. The first timing interval on the horizontal axis is termed the P-R interval, which denotes the period from the onset of the P peak to the initial position of the QRS complex; it represents the time between the onset of atrial depolarization and the onset of ventricular depolarization. The QRS complex is followed by the S-T segment, the section between the terminal point of the S peak, denoted as the J point, and the initial position of the T peak, which represents the time between the depolarization and the repolarization of the ventricles. The Q-T interval is the time between the onset of the Q peak and the terminal end of the T peak within the electrical impulse cycle of the cardiac muscle; it denotes the entire duration of the electrical activity of ventricular depolarization and repolarization. Table 1 lists the normal amplitude levels of an ECG signal.
muscle with different states of response over its function. The standard 12 ECG leads are partitioned into two groups. The first group, the limb leads, comprises three bipolar limb leads (1, 2, and 3). Lead 1 is acquired between a negative electrode located on the right forearm and a positive electrode located on the left forearm. Lead 2 is acquired with the negative electrode on the right forearm and the positive electrode on the left foot. Lead 3 is acquired with the negative electrode on the left forearm and the positive electrode on the left foot. The limb group also includes the augmented leads denoted aVR, aVL, and aVF. The second group of leads, the chest leads, comprises the V leads (V1, V2, V3, V4, V5, and V6), also called the precordial leads. The 12 ECG leads are thus described, and the schematic representation of the electrode mapping positions is depicted in Fig. 3.
Heart rate is the speed of the heartbeat, measured as the total count of heartbeats in a specific interval of time and normally expressed in bpm (beats per minute). The normal heart rate of a healthy human ranges from 60 to 100 beats per minute, and its value varies with sex, age, and other relevant factors. When the heart rate is lower than 60 beats per minute, the condition is termed bradycardia, whereas when it is higher than 100 beats per minute, the condition is termed tachycardia [24, 25]. There are several techniques to determine the heart rate from the ECG signal using the R-R interval, as follows. The first depends on counting the total number of R peaks in a 6 s strip of the cardiac rhythm and multiplying the value by a factor of 10. The second counts the total number of small boxes spanned by a typical R-R interval, expressed in mm, and divides 1500 by this count to determine the heart rate. The third counts the number of large boxes between successive R peaks for a typical R-R interval and divides 300 by the resulting number to determine the heart rate [22].
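The three counting rules, together with the equivalent sample-based computation, are illustrated in the short MATLAB sketch below. All numeric values (box counts, sampling rate, R-R interval) are assumed examples and are not measurements from this study.

% Heart rate from the R-R interval: the three paper-strip rules plus the
% digital equivalent (1 small box = 0.04 s, 1 large box = 0.2 s at 25 mm/s).
rPeaksIn6s = 8;                    % R peaks counted on a 6 s rhythm strip
hr1 = rPeaksIn6s * 10;             % rule 1: beats in 6 s multiplied by 10

smallBoxes = 19;                   % small boxes spanned by one R-R interval
hr2 = 1500 / smallBoxes;           % rule 2: 1500 divided by the small-box count

largeBoxes = 3.8;                  % large boxes spanned by one R-R interval
hr3 = 300 / largeBoxes;            % rule 3: 300 divided by the large-box count

fs = 360;                          % assumed sampling rate (Hz)
rrSamples = 277;                   % samples between two successive R peaks
hr4 = 60 * fs / rrSamples;         % digital form: beats per minute

fprintf('HR estimates: %.0f, %.0f, %.0f, %.0f bpm\n', hr1, hr2, hr3, hr4);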
In this study, the results have been simulated using the MATLAB software, with which the recordings of the electrocardiogram signal have been analyzed to acquire the P peak, Q peak, R peak, S peak, and T peak and finally to determine the heart rate.
Fig. 3 Precordial chest electrodes, usually located on the left side of the chest
The graphical user interface (GUI) makes it very simple to obtain the PQRST peak values, which are plotted from the analysis of the ECG. Anyone testing this source code must select the total sample count for a single ECG cardiac cycle used to detect the PQRST peaks, and this count must equal 400 samples. The software tool provides the following essential features for processing and analyzing the ECG signal.
1. Preliminary recordings of the ECG signal are loaded from any informational source in the form of Excel, binary, or text files.
2. The loaded ECG recordings are plotted for every lead.
3. The PQRST peaks are detected as unique values and made to appear on the plot.
4. The graph can be exported as bmp, png, or fig files.
5. The data can be saved as mat, txt, or xlsx files.
Figure 5 depicts the design of the graphical user interface (GUI) for plotting the ECG wave and marking the PQRST peaks. The following sequence of steps is followed to frame the algorithm using the MATLAB tool; a minimal sketch of the core processing steps is given after the list.
1. Determine the sampling rate of the ECG waveform and estimate the heart rate.
2. Detect the heart rate using the window filtering technique.
3. Obtain the plot and save it in (.png), (.bmp), or (.fig) format.
4. Save the data in txt, mat, or xlsx format.
5. Analyze the acquired ECG data and estimate the values after detecting the PQRST peaks.
6. After the plot of the PQRST peaks is created, extract the plot with the marked peak values.
7. Save the marked plot in any one of the supported formats, (.png), (.bmp), or (.fig).
8. Select, based on the requirement, whether to print all samples or only specific samples.
9. Finally save the graph, proceed with the program for heart rate detection, and acquire the ECG plot again in txt, mat, or xlsx format.
10. Based on the selected lead, the corresponding ECG plot is shown so that the heart rate can be read from the waveform.
11. On entering the sampling data range for analysis, the ECG is imported and the PQRST peaks are acquired.
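The MATLAB sketch below illustrates the core of this chain: load a recording, filter it, detect the R peaks, compute the heart rate, and export the plot and data. The file names, sampling rate, smoothing filter, and fixed threshold are assumptions made for illustration; the GUI itself uses its own filter coefficients and detection logic.

% Minimal end-to-end sketch of the listed steps (assumed inputs).
fs   = 360;                                  % assumed sampling rate (Hz)
ecg  = load('ecg_record.txt');               % load recording (txt format assumed)
ecg  = ecg(:);                               % force column vector

b    = ones(1, 5)/5;                         % simple 5-point smoothing filter
ecgF = filter(b, 1, ecg - mean(ecg));        % remove offset, then smooth

thr  = 0.6 * max(ecgF);                      % crude fixed threshold for R peaks
isPk = [false; diff(sign(diff(ecgF))) < 0; false];   % local maxima
rIdx = find(isPk & ecgF > thr);              % R-peak sample indices

hr = 60 * fs / mean(diff(rIdx));             % heart rate from the mean R-R interval

plot((0:numel(ecgF)-1)/fs, ecgF); hold on;
plot((rIdx-1)/fs, ecgF(rIdx), 'ro');
xlabel('time (s)'); ylabel('amplitude');
saveas(gcf, 'ecg_peaks.png');                % export the marked plot
save('ecg_peaks.mat', 'rIdx', 'hr');         % save the detected data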
Figure 6 illustrates the plot of the raw ECG signal, and the plot of the filtered signal is shown in Fig. 7. The detection of the R peak points used to estimate the heart rate of the ECG signal is depicted in Fig. 8. It demonstrates how to acquire the P, Q, R, S, and T peak values for an approximate range of about 400 samples and establish the process of heart rate
detection. Figure 9 depicts the filtered QRS signal along with the identified pulse train obtained by adaptive threshold detection. From the above-mentioned steps, the graphical user interface can be designed to acquire the ECG signal for further processing. The main benefit of designing a GUI is that the ECG can be acquired directly from a database such as the MIT-BIH arrhythmia database for the detection of cardiac abnormalities. In this proposed study, the acquired ECG signal is processed only after the PQRST peaks have been detected from the ECG, after which the mapping can be directly interpreted to measure the heart rate and to monitor it continuously. Figure 5 shows the GUI for acquiring the ECG. A virtual key acquires the input, which is stored in text format. Once the ECG
Fig. 9 Filtered QRS signal with the noise level (black), signal level (red), and adaptive threshold (green), together with the pulse train of the detected QRS on the ECG signal
signal acquisition is completed, the abnormality state of the cardiac muscle can be diagnosed from the PQRST peak values and the durations of the R-R, R-T, Q-T, and S-T intervals by processing them with computational intelligence techniques. For the detection of cardiac arrhythmias from ECG signals acquired from the MIT-BIH arrhythmia database, machine learning algorithms such as artificial neural networks, genetic algorithms, and fuzzy logic techniques can be developed; this will serve as a non-invasive technique for detecting abnormal states of the cardiac muscle that may otherwise lead to sudden death. The processing of the raw acquired ECG signal includes preprocessing, comprising denoising, dimensionality reduction, and baseline wander removal, followed by feature extraction, feature selection, and classification of the ECG into the normal or abnormal category.
The essential point in designing the GUI is that, once a clear plot has been made with the defined peaks and the intervals between them, further enhancement can be made by developing computational intelligence techniques for classifying cardiac arrhythmias, which will guide doctors toward the right path of treatment. The test has been carried out with data acquired from the MIT-BIH PhysioNet database. The normal ECG signal is acquired from the PhysioNet database, and the raw and filtered components of the ECG are plotted. As simulated through the GUI, the plot of the raw ECG signal is depicted in Fig. 6. As an initial step immediately after ECG signal acquisition, denoising is applied, which yields a noise-free ECG signal; the plot of the noise-free filtered ECG signal is depicted in Fig. 7. For the noise-free ECG signal
Table 2 Comparison of PQRST peaks and the heart rate of the normal acquired ECG signal

Parameters of ECG | Standard PQRST values | Detected PQRST values
P | 0.25 mV | 0.054 mV
Q | 25% of R wave | −0.435 mV
R | 1.60 mV | 1.495 mV
T | 0.1–0.5 mV | 0.114 mV
Heart rate | 60–100 bpm | 78 bpm
This study has proposed a technique for monitoring the heart rate and detecting the PQRST peaks from the acquired ECG signal by designing a GUI in the MATLAB software. This detection process can be used by clinical analysts as well as researchers in diagnosing abnormalities of the ECG signal. Earlier techniques that analyze the ECG signal to estimate the PQRST peaks on the basis of digital signal processing and artificial neural networks can be enhanced in accuracy by using this matrix laboratory software. The prediction of the optimal heart rate value can be accomplished by the proposed method of extraction from the GUI. It can also be used for the prediction of different cardiac diseases designated as cardiac arrhythmias. As future work, cardiac abnormality classification algorithms based on computational intelligence techniques could be implemented to diagnose arrhythmias in a non-invasive manner. In a manner similar to the MIT-BIH arrhythmia database, the ECG signal could be processed through real-time acquisition, for which the graphical user interface could be developed by integrating machine learning algorithms to diagnose abnormal state conditions.
References
1. Bronzino JD (2000) The biomedical engineering handbook, vol 1, 2nd edn. CRC Press LLC
2. Goldshlager N (1989) Principles of clinical electrocardiography, 13th edn. Appleton & Lange, Connecticut, USA
3. Singh N, Mishra R (2012) Microcontroller based wireless transmission on biomedical signal
and simulation in Matlab. IOSR J Eng 2(12)
4. Acharya RU, Kumar A, Bhat PS, Lim CM, Iyengar SS, Kannathal N, Krishnan SM (2004)
Classification of cardiac abnormalities using heart rate signals. Med Biol Eng Comput 42:172–
182
5. Babak M, Setarehdan SK (2006) Neural network based arrhythmia classification using heart
rate variability signal. In: Signal Processing Issue: EUSIPCO-2006, Sept 2006
6. Benitez D, Gaydecki PA, Zaidi A, Fitzpatrick AP (2001) The use of the Hilbert transform
in ECG signal analysis. Comput Biol Med 31:399–406
7. De Chazal P, O’Dwyer M, Reilly RB (2004) Automatic classification of heartbeats using ECG
morphology and heartbeat interval features. IEEE Trans Biomed Eng 51(7):1196–1206
8. Dewangan NK, Shukla SP (2015) A survey on ECG signal feature extraction and analysis
techniques. Int J Innov Res Electr Electron Instrum Control Eng 3(6):12–19
9. Dima SM, Panagiotou C, Mazomenos EB, Rosengarten JA, Maharatna K, Gialelis JV, Curzen
N, Morgan J (2013) On the detection of myocardial scar-based on ECG/VCG Analysis. IEEE
Trans Biomed Eng 60(12):3399–3409
10. Ebrahimi A, Addeh J (2015) Classification of ECG arrhythmias using adaptive neuro-fuzzy
inference system and Cuckoo optimization algorithm. CRPASE 01(04):134–140. ISSN 2423-
4591
11. Burhan E (2013) Comparison of wavelet types and thresholding methods on wavelet-based
denoising of heart sounds. J Signal Inf Process JSIP-2013 4:164–167
12. Ingole MD, Alaspure SV, Ingole DT (2014) Electrocardiogram (ECG) signals feature extraction
and classification using various signal analysis techniques. Int J Eng Sci Res Technol 3(1):39–44
13. Jeba J (2015) Classification of arrhythmias using support vector machine. In: National
conference on research advances in communication, computation, electrical science and
structures-2015, pp 1–4
14. Kar A, Das L (2011) A technical review on statistical feature extraction of ECG signal. In:
IJCA special issue on 2nd national conference computing, communication, and sensor network,
CCSN, 2011, pp 35–40
15. Kelwade JP, Salankar SS (2015) Prediction of cardiac arrhythmia using artificial neural network.
Int J Comput Appl 115(20):30–35. ISSN 0975-8887.
16. Kohler B, Hennig C, Orglmeister R (2002) The principles of software QRS detection: reviewing
and comparing algorithms for detecting this important ECG waveform. IEEE Eng Med Biol
42–57
17. Kutlu Y, Kuntalp D (2012) Feature extraction for ECG heartbeats using higher order statistics
of WPD coefficients. Comput Methods Progr Biomed 105(3):257–267
18. Li Q, Rajagopalan C, Clifford GD (2014) Ventricular fibrillation and tachycardia classification
using a machine learning approach. IEEE Trans Biomed Eng 61(6):1607–1613
19. Luz EJDS, Nunes TM, Albuquerque VHCD, Papa JP, Menotti D (2013) ECG arrhythmia
classification based on optimum-path forest. Expert Syst Appl 40(9):3561–3573
20. Malviya N, Rao TVKH (2013) De-noising ECG signals using adaptive filtering algorithms. Int
J Technol Res Eng 1(1):75–79. ISSN 2347-4718
21. Markowska-Kaczmar U, Kordas B (2005) Mining of an electrocardiogram. In: Conference
proceedings, pp 169–175
22. Masethe HD, Masethe MA (2014) Prediction of heart disease using classification algorithms.
In: Proceedings of the world congress on engineering and computer science WCECS 2014, vol
2, pp 22–24
23. Moavenian M, Khorrami H (2010) A qualitative comparison of artificial neural networks and
support vector machines in ECG arrhythmias classification. Expert Syst Appl 37(4):3088–3093.
https://doi.org/10.1016/j.eswa.2009.09.021
24. Muthuchudar A, Baboo SS (2013) A study of the processes involved in ECG signal analysis.
Int J Sci Res Publ 3(3):1–5
25. Narayana KVL, Rao AB (2011) Wavelet-based QRS detection in ECG using MATLAB. Innov
Syst Des Eng 2(7):60–70
Performance Analysis of Self Adaptive
Equalizers Using Nature Inspired
Algorithm
Abstract Through the communication channel, a sender transmits a message to a receiver. However, due to noise in the channel, the received message is not identical to the transmitted message. Likewise, in a digital communication channel, the transmitted signal may undergo dispersion, so that the transmitted and received information differ. Inter-symbol interference (ISI) and additive noise cause this dispersion of the signal. If the channel were known exactly, the ISI could be reduced; in practice, however, preliminary information about the channel attributes is rarely available, and there is an obvious issue of inaccuracy in physical deployments of the filters. Equalization is utilized to counteract the resulting residual distortion. This article studies the implementation of an adaptive equalizer for data transfer through a channel that introduces ISI. One way to decrease the impact of this problem is to utilize a channel equalizer at the receiver. The role of the equalizer is to create a reconstructed version of the transmitted signal that is as near to it as possible. The equalizer is utilized to decrease the bit error rate (BER), the proportion of received bits in error to the overall transferred bits. In this article, a hybrid approach based on the least mean square (LMS) and EPLMS algorithms is utilized to obtain the minimum mean square error (MSE) and an optimum convergence rate, which improves the efficiency of the communication system.
N. Shwetha (B)
Department of ECE, Dr. Ambedkar Institute of Technology, Bangalore, Karnataka 560056, India
e-mail: shwethaec48@gmail.com
M. Priyatham
Department of ECE, APS College of Engineering, Bangalore, Karnataka 560082, India
e-mail: manojpriyatham2k4@yahoo.co.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 497
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_37
1 Introduction
With the arrival of digital technology, digital signal communication has become essential in a wide range of applications. Such applications have driven the development of several modulation systems and their subsequent refinements [1]. However, those schemes and their refinements are highly affected by noise. Basically, two fundamental problems occur in traditional digital transmission methods: inter-symbol interference (ISI) and noise have a high impact on those methods. These errors are caused by the characteristics of the channel linking the transmitter and receiver and by the spreading of the transmitted pulse. The noise impact on the communication is determined by the channel features and may be diminished with an appropriate selection of the channel [2–5]. Even if the channel is still noisy, the signal that the user receives can be less impacted if the SNR is maintained at the transmitter by increasing the transmitted signal power [6]. Because of ISI, the electrical power of one symbol spreads into adjacent symbol durations, which affects the interaction and diffuses the symbol. An efficient method of decreasing this effect is to use an adaptive channel equalizer. In digital communication systems, adaptive equalization is critical to diminish the impact of ISI, where an adaptive algorithm such as the LMS algorithm adjusts the coefficients of the equalizer. When everything is ideal at the receiver, there is no interaction among consecutive symbols; each symbol arrives and is decoded independently of all others [7, 8]. However, once the symbols interact with each other, the waveform of a single symbol corrupts the value of an adjacent symbol, and the received signal becomes distorted. It is then hard to recover the message, because the transmitted pulses in the received signal are smeared so that the signals related to the various symbols are no longer distinguishable. This impairment is known as inter-symbol interference (ISI). This effect can be reduced by utilizing a channel equalizer at the receiver. Two of the most intensively emerging fields of digital transmission, cellular communications and digital subscriber lines, are heavily reliant on the implementation of a trusted channel equalizer (Fig. 1).
In a digital communication system, if everything is right at the receiver side, there is no interaction among successive symbols: each arriving symbol is decoded independently of the others. But when the symbols do interact, one waveform corrupts the values of the nearby symbols, and the received signal becomes distorted. Because of this, it is difficult to recover the messages from such a received signal. This impairment is known as inter-symbol interference (ISI). The purpose of an equalizer is to reduce the ISI so that a reconstructed version of the transmitted signal can be obtained; this also reduces the bit error rate of the transmitted signal. The assumption of an ideal all-pass AWGN channel is impractical. Owing to the scarcity of frequency spectrum, the signal is filtered to limit its bandwidth so that frequency-division structuring can be obtained. Many band-pass channels are available in practice, but their response varies with respect to the different frequency components. Therefore, the simplest AWGN model cannot represent practical channels very accurately. A commonly used refinement is the dispersive channel model shown in Fig. 2. In this model, u(t) is the transmitted signal, h_c(t) is the impulse response of the channel, and n(t) is AWGN with power spectral density N_0/2. The dispersive characteristic of the channel is modeled by the linear filter h_c(t). This dispersive channel model acts as a low-pass filter, which spreads the transmitted signal in time and makes the symbols difficult to separate in a practical case while transmitting signals from the transmitter. Due to this, the ISI degrades the error performance of the communication system. Two main methods are commonly used to eradicate the ISI degradation effect. In the first method, the
band-limited transmission pulses are used to minimize the ISI. The resulting ISI-free pulses are known as Nyquist pulses. In the second method, the received signal is filtered to cancel the ISI introduced by the channel impulse response; this is known as equalization.
3 Channel Equalization
Figure 1 shows the structure of channel equalization. Equalization is the procedure of adapting the coefficients of the equalizer to the channel over which the information is transferred. The channel can be equalized by many algorithms, and the equalizing receiver filter reduces the impact of ISI introduced by the channel. At present, adaptive algorithms attract the attention of most investigators [12]. The channel can also be equalized by several deterministic algorithms. Among these, the most powerful algorithm utilized to eliminate ISI is LMS, where the channel response is estimated on the basis of a maximum likelihood criterion. The coefficients of the receive filter are adjusted to correspond to the channel [13–16]. The ISI and noise are reduced by modifying the coefficients at the output, and an error signal then drives the adaptation of the equalizer. Contemporary data transmission methods exploit progressively more physical phenomena to increase the flow of data. In recent years, a huge amount of effort has been devoted to the advancement of transmission. To fit within the limitations expressed in global radio guidelines, many speed-ups and improvements have been applied to earlier forms of data modulation and coding. As the volume of transmitted data grows, the harmful transmission impacts become increasingly significant with the increase of data intensity in the channel [17]. To counteract the distortion, modern radio communication equipment uses additional measures that include digital signal processing and channel analysis. Equalizers can recreate the transmitted pulse from its distorted version; however, an equalization procedure that amplifies the noise will not attain improved performance [18]. This article concentrates on methods of adaptive channel equalization, with the intent to replicate real communication circumstances and
Figure 4 illustrates an adaptive filter structure, and its working is specified in the following four stages:
1. The received signal is processed by the filter.
2. The response of the filter characterizes the relation between the received and the generated pulse.
the contrast between the reference input and the output. In Fig. 4, the channel works in an ordinary manner in which an input pulse is processed by the channel and sent to the output. Figure 4 thus illustrates a streamlined model of an adaptive filter [24, 25].
4 Problem Formulation
If the step size is large, the convergence rate of the LMS algorithm will be fast, yet the steady-state mean square error (MSE) will increase. On the other hand, if the step size is small, the steady-state MSE will be small, yet the convergence rate will be slow. In this way, the step size gives a trade-off between the convergence rate and the steady-state MSE of the LMS algorithm. One way to increase the efficiency of the LMS algorithm is to make the step size variable as opposed to fixed, which leads to VSSLMS algorithms. By using this methodology, both a fast convergence rate and a small steady-state MSE can be achieved. The step size µ should satisfy the condition

0 < µ < 1/λmax,

where λmax is the maximum eigenvalue of the input autocorrelation matrix. For fast convergence, the step size is set close to its maximum allowed value.
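As a small numerical illustration of this bound, the MATLAB sketch below estimates the input autocorrelation matrix from a bipolar training sequence and derives the maximum stable step size; the 11-tap length and the input statistics are assumptions.

% Step-size bound from the input autocorrelation matrix (illustrative).
N = 11;                                      % equalizer length (assumed)
x = sign(randn(1, 2000));                    % bipolar training input
X = zeros(N, numel(x) - N + 1);
for k = 1:size(X, 2)
    X(:, k) = x(k+N-1:-1:k).';               % tap-delay-line input vectors
end
R     = (X * X.') / size(X, 2);              % autocorrelation matrix estimate
muMax = 1 / max(eig(R));                     % bound: 0 < mu < 1/lambda_max
mu    = 0.5 * muMax;                         % practical choice near the bound
fprintf('lambda_max = %.3f, mu_max = %.3f, mu = %.3f\n', max(eig(R)), muMax, mu);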
5 Performance Criteria
The performance of the LMS adaptive filter is characterized in three ways: the adequacy of the FIR filter, the speed of convergence of the system, and the misadjustment in the steady state.
The rate at which the coefficients approach their ideal values is known as the speed of convergence. The speed of convergence improves as the step size is increased, up to step sizes close to one-half of the maximum value required for stable operation of the system. This result can be obtained from a careful examination of various kinds of input signals and correlation statistics. For typical signal situations, it is seen that the speed of convergence
of the excess MSE diminishes for sufficiently large step-size values. The speed of convergence declines as the length of the filter is increased. The maximum possible speed of convergence is limited by the largest step size that can be selected for stability, which is less than one-half of the maximum value when the input signal is strongly correlated rather than only moderately correlated.
The least mean squares (LMS) algorithm is one of the most famous algorithms in adaptive signal processing. Because of its robustness and simplicity, it has been the focal point of much research, prompting its use in numerous applications. The LMS algorithm is a linear adaptive filtering algorithm that fundamentally comprises two processes: a filtering process, which computes the output of a transversal filter produced by a set of tap inputs and generates an estimation error by comparing this output with a desired response, and an adaptive process, which involves the automatic adjustment of the tap weights of the filter according to the estimation error. The LMS algorithm is additionally utilized for updating the channel coefficients. The benefits of the LMS algorithm are low computational complexity, good statistical stability, a straightforward structure, and simplicity of hardware implementation. The LMS algorithm, however, experiences problems regarding the choice of step size; to overcome these, evolutionary programming (EP) is utilized. Figure 5 shows the block diagram of a typical adaptive filter
where
x(n) is the input signal to a linear filter
y(n) is the corresponding output signal
d(n) is an additional input signal to the adaptive filter
e(n) is the error signal that denotes the difference between d(n) and y(n)
The linear filter can be of different types, namely FIR or IIR. The coefficients of the linear filter are adjusted iteratively by the adaptive algorithm to minimize the power of e(n); algorithms that also adjust the coefficients of an FIR filter include the recursive least squares algorithm. The LMS algorithm performs the following operations to estimate the coefficients of an adaptive FIR filter.
1. Calculate the output signal y(n) of the FIR filter: y(n) = w^T(n) · u(n), where u(n) is the filter input vector, u(n) = [x(n), x(n − 1), . . . , x(n − N + 1)]^T, and w(n) is the filter coefficient vector, w(n) = [w0(n), w1(n), . . . , wN−1(n)]^T.
2. Calculate the error signal e(n) by using the following equation: e(n) = d(n) − y(n).
3. Update the filter coefficients by using the following equation:
w(n + 1) = (1 − µc) · w(n) + µ · e(n) · u(n),
where µ is the step size and c is the leakage constant.
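A compact MATLAB sketch of these three steps applied to channel equalization is given below. The channel, the training delay, the noise level, and the value of the leakage constant c are illustrative assumptions rather than parameters taken from this paper (c = 0 reduces the update to standard LMS).

% Minimal LMS equalizer following steps 1-3 above (assumed setup).
N  = 11;  mu = 0.01;  c = 0;                 % taps, step size, leakage constant
hch = [0.3 1 0.3];                           % assumed dispersive channel
s   = sign(randn(1, 2000));                  % bipolar training symbols
x   = filter(hch, 1, s) + 0.05*randn(size(s));   % equalizer input
d   = [zeros(1, 5), s(1:end-5)];             % desired: delayed training symbols

w = zeros(N, 1);  e = zeros(1, numel(x));
for n = N:numel(x)
    u    = x(n:-1:n-N+1).';                  % tap-input vector u(n)
    y    = w.' * u;                          % step 1: filter output y(n)
    e(n) = d(n) - y;                         % step 2: error e(n) = d(n) - y(n)
    w    = (1 - mu*c)*w + mu*e(n)*u;         % step 3: coefficient update
end
plot(filter(ones(1,50)/50, 1, e.^2));        % smoothed squared-error learning curve
xlabel('iteration'); ylabel('smoothed squared error');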
7 Evolutionary Programming
Evolutionary algorithms are stochastic search methods rather than deterministic ones. In 1960, Lawrence J. Fogel in the US used evolutionary programming to employ simulated evolution as a learning procedure, seeking to create artificial intelligence. Previously existing methods such as linear programming and calculus-based methods, for example Newton's method, have difficulties in delivering the global solution; they tend to get stuck in local solutions. To overcome this problem, nature-inspired computation can be applied. In this approach, some characteristic available in nature is taken as a reference to develop a mathematical model, and this mathematical model is utilized to discover the solution to the problem. In this paper, the chosen natural characteristic is evolution. This is one of the most successful characteristics available in nature, where things evolve (change) over time to adapt to the environment, resulting in better fitness values and hence higher chances of survival, for example, the transformation from ape to human. A mathematical model based on evolution is referred to as evolutionary computation.
8 EPLMS Algorithms
where e(n) is the deviation error, d(n) is the expected output value, x(n) is the input vector at sampling time n, and W(n) is the coefficient vector.
3. Select the step size having the minimum error with respect to the current sample point.
4. Apply LMS with the selected step size to obtain the coefficient values.
5. When a new input sample arrives, a new population of step sizes is created in EP from the previous generation, and the procedure is repeated; a minimal sketch of this loop is given below.
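The MATLAB sketch below shows one plausible reading of this loop: a small population of candidate step sizes is mutated by EP at every sample, the candidate yielding the minimum instantaneous error is selected, and the LMS update is applied with it. The population size, mutation scale, channel, and training delay are assumptions rather than the authors' settings.

% EPLMS sketch: evolutionary selection of the LMS step size (assumed setup).
N = 11;  P = 6;                              % equalizer taps, population size
muPop = 0.2*rand(1, P);                      % initial population of step sizes
hch = [0.3 1 0.3];
s = sign(randn(1, 2000));
x = filter(hch, 1, s) + 0.05*randn(size(s));
d = [zeros(1, 5), s(1:end-5)];

w = zeros(N, 1);
for n = N:numel(x)
    u  = x(n:-1:n-N+1).';
    e0 = d(n) - w.' * u;                     % error at the current sample
    eTry = zeros(1, P);                      % one-step-ahead error per candidate
    for p = 1:P
        wp      = w + muPop(p)*e0*u;
        eTry(p) = abs(d(n) - wp.' * u);
    end
    [~, best] = min(eTry);                   % step 3: step size with minimum error
    w = w + muPop(best)*e0*u;                % step 4: LMS update with that step size
    muPop = abs(muPop(best) + 0.01*randn(1, P));  % step 5: mutated next generation
end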
9 Simulated Results
MATLAB 2014b was utilized to implement the modelling, and the subsequent results are shown in this section. The ability of the recommended structures was determined in accordance with their convergence behaviour, as described in the figures. The GUI model showing the steps involved in implementing the EPLMS algorithm is described in Fig. 6.
To evaluate the performance of the evolutionary programming LMS algorithm (EPLMS) for an arbitrary channel, 11 taps are selected for the equalizer. The input signal consists of 500 samples whose values are generated randomly from a uniform distribution, as shown in Fig. 7. Gaussian noise with zero mean and a standard deviation of 0.01 is added to the input signal, and the channel characteristics are given by the vector:
[0.05 − 0.063 0.088 − 0.126 − 0.25 0.9047 0.25 0 0.126 0.038 0.088]
The randomly generated input signal consists of 500 samples and is transmitted in bipolar form (+1, −1). To make the system more complex, random information is generated between +1 and −1; this makes the information unpredictable at the receiver side.
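This setup can be reproduced in a few lines of MATLAB, as sketched below; the plotting style and the use of filter() for the channel are assumptions.

% Reproduction sketch of the simulation setup: 500 bipolar symbols, the
% quoted 11-tap channel, and zero-mean Gaussian noise of 0.01 std. dev.
nSamp = 500;
s     = 2*(rand(1, nSamp) > 0.5) - 1;        % bipolar (+1/-1) input signal
hch   = [0.05 -0.063 0.088 -0.126 -0.25 0.9047 0.25 0 0.126 0.038 0.088];
x     = filter(hch, 1, s) + 0.01*randn(1, nSamp);   % channel output plus noise

subplot(2,1,1); stem(s, 'filled'); title('Generated input signal');
subplot(2,1,2); plot(x);           title('Signal with noise from channel');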
Fig. 7 Generated input signal and signal with noise from channel
Figure 8 shows the MSE plot of the LMS algorithm with fixed step-size values of 0.11, 0.045, and 0.0088. It is clear from observation that the performance differs for the different step-size values, i.e., they have different convergence characteristics along with a different quality of convergence, and it is also very difficult to identify the optimum step-size value.
Figure 9 shows the comparative MSE plot of LMS and EPLMS, from which it can be observed that the error with the EPLMS algorithm is reduced when compared with the LMS algorithm.
Figure 10 shows the generated input signal, the signal from the channel after the addition of noise, the signal from the equalizer, and the signal recovered after the decision, respectively.
Fig. 8 Fixed step size performance of LMS with step size equal to 0.11, 0.045 and 0.0088
Fig. 10 Original signal, signal from channel with noise, signal from equalizer (EPLMS), the
recovered signal after a decision
These observations indicate that the EPLMS algorithm is very efficient in providing noise-free information with a reduced bit error rate and minimum mean square error, and the EPLMS algorithm has proven to be the best-performing algorithm among those considered for adaptive signal processing.
10 Conclusion
Bandwidth-efficient data transfer through radio and telephone channels has been made possible through the use of adaptive equalization to counteract the time dispersion introduced by the channel. Stimulated by practical applications, a steady research effort over the past two decades has produced a rich body of literature on adaptive equalization and the associated, more general disciplines of system identification, adaptive filtering, and digital signal processing. This article provides
References
1. Dey A, Banerjee S, Chattopadhyay S (2016) Design of improved adaptive equalizers using intel-
ligent computational techniques: extension to WiMAX system. In: 2016 IEEE Uttar Pradesh
section international conference on electrical, computer and electronics engineering (UPCON).
IEEE
2. Gupta S, Basit A, Banerjee S (2019) Adaptive equalizer: extension to communication system.
Int J Emerging Trends Electron Commun Eng 3(1). ISSN:2581-558X (online)
3. Ghosh S, Banerjee S (2018) Intelligent adaptive equalizer design using nature inspired algo-
rithms. In: 2018 second international conference on electronics, communication and aerospace
technology (ICECA). IEEE
4. Dorigo M, Birattari M, Stutzle T (2006) Ant colony optimization. IEEE Comput Intell Mag
1(4):28–39
5. Shin H-C, Sayed AH, Song W-J (2004) Variable step-size NLMS and affine projection
algorithms. IEEE Signal Process Lett 11(2):132–135
6. Pradhan AK, Routray A, Basak A (2005) Power system frequency estimation using least
mean square technique. IEEE Trans Power Deliv 20(3):1812–1816
7. Banerjee S, Chattopadhyay S (2016) Equalizer optimization using flower pollination algorithm.
In: 2016 IEEE 1st international conference on power electronics, intelligent control and energy
systems (ICPEICES). IEEE
8. Praliya S (2016) Intelligent algorithm based adaptive control of nonlinear system. Dissertation
9. Sun L, Bi G, Zhang L (2005) Blind adaptive multiuser detection based on linearly constrained
DSE-CMA. IEE Proceed Commun 152(5):737–742
10. Schniter P, Johnson CR (1999) Dithered signed-error CMA: robust, computationally efficient
blind adaptive equalization. IEEE Trans Signal Process 47(6):1592–1603
11. Xiao Y, Huang B, Wei H (2013) Adaptive Fourier analysis using a variable step-size LMS
algorithm. In: Proceedings of 9th international conference on information, communications &
signal processing. IEEE, Dec 2013, pp 1–5
12. Wang Y-L, Bao M (2010) A variable step-size LMS algorithm of harmonic current detec-
tion based on fuzzy inference. In: 2010 The 2nd international conference on computer and
automation engineering (ICCAE), vol 2. IEEE
13. Xiao Y, Huang B, Wei H (2013) Adaptive Fourier analysis using a variable step-size LMS
algorithm. In: 2013 9th International conference on information, communications & signal
processing. IEEE
14. Eweda E (1990) Analysis and design of a signed regressor LMS algorithm for stationary
and nonstationary adaptive filtering with correlated Gaussian data. IEEE Trans Circ Syst
37(11):1367–1374
15. Sethares WA, Johnson CR (1989) A comparison of two quantized state adaptive algorithms.
IEEE Trans Acoust Speech Signal Process 37(1):138–143
16. Hadhoud MM, Thomas DW (1988) The two-dimensional adaptive LMS (TDLMS) algorithm.
IEEE Trans Circ Syst 35(5):485–494
17. Haykin SS (2005) Adaptive filter theory. Pearson Education India
18. Rao RV, Savsani VJ, Vakharia DP (2011) Teaching–learning-based optimization: a novel
method for constrained mechanical design optimization problems. Comput-Aided Des
43(3):303–315
19. Kennedy J (2006) Swarm intelligence. In: Handbook of nature-inspired and innovative
computing. Springer, Boston, MA, pp 187–219
20. Meng H, Guan YL, Chen S (2005) Modeling and analysis of noise effects on broadband
power-line communications. IEEE Trans Power Deliv 20(2):630–637
21. Varma DS, Kanvitha P, Subhashini KR (2019) Adaptive channel equalization using teaching
learning based optimization. In: 2019 International conference on communication and signal
processing (ICCSP). IEEE
22. Gibson JD (ed) (2012) Mobile communications handbook. CRC Press
23. Garg V (2010) Wireless communications & networking. Elsevier
24. Palanisamy R, Verville J (2015) Factors enabling communication based collaboration in inter
professional healthcare practice: a case study. Int J e-Collab 11(2):8–27
25. Hassan N, Fernando X (2019) Interference mitigation and dynamic user association for load
balancing in heterogeneous networks. IEEE Trans Veh Technol 68(8):7578–7592
Obstacle-Aware Radio Propagation
and Environmental Model for Hybrid
Vehicular Ad hoc Network
1 Introduction
With the significant growth of, and requirement for, provisioning smart transportation systems, present-day vehicles are embedded with various hardware and smart devices such as sensors and cameras. Building a smart intelligent transport system aids in providing seamless mobility, a safe journey, a more enjoyable ride, and improved user
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 513
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_38
The state-of-the-art radio propagation models are divided into two classes. A few approaches have focused on addressing the delay constraint by increasing the propagation speed, while the rest have focused on establishing a reliable propagation route in the vehicular ad hoc network. However, the majority of these approaches assume that vehicles within association range can communicate with each other and that vehicles outside this range cannot. Further, the presence of a bigger vehicle in the line of sight (LOS) between transmitter and receiver significantly reduces the effective coverage for transmitting information, because the receiver experiences decreased received signal power. As a result, the receiver cannot decode the information successfully [14], since these approaches do not take the vehicle obstructing effect (VOE) into consideration. Thus, state-of-the-art protocols suffer from the broadcasting hole (BH) issue: a few vehicles within association range cannot receive the broadcast message from the nearest device (i.e., either the source or a hop device) with enough signal power. A communicating device inside the BH zone will fail to decode the information and will not have any knowledge of the current dynamic traffic condition. Consequently, these devices become potential casualties of vehicle collisions.
To address the above-discussed problems, this work describes transmission efficiency (TE) (i.e., the additional attenuation power required) for estimating the influence of the vehicle obstructing effect under different channel and environment conditions. The TE can be established in a heuristic manner by taking the ratio of the vehicle density that obtains the information with no error to the overall vehicle density within the communication range of the source vehicle, considering movement at a certain speed and density. This paper presents an obstacle-aware radio propagation model considering VOE for different environmental conditions. Further, the presented distributed MAC design maximizes the vehicular ad hoc network throughput with minimal collisions under a multichannel environment.
The highlights of the work are discussed below:
• An obstacle-aware radio propagation model is presented for different environment conditions such as urban, rural, and highway.
• A distributed MAC is modeled that maximizes the system throughput of the vehicular ad hoc network.
• The experimental outcome shows that the proposed distributed MAC design achieves a better outcome, with higher throughput and fewer collisions, when compared with the existing MAC design.
The organization of the research work is as follows. In Sect. 2, a literature survey of various existing radio propagation and environmental models is presented. In Sect. 3, the obstacle-aware radio propagation and environmental model is proposed. The results and discussion are given in Sect. 4. Lastly, the research is concluded and the future direction of the research work is discussed.
2 Literature Survey
This section discusses the various existing radio propagation models presented for improving the communication performance of vehicular ad hoc networks under different environment and network conditions. In [15], the author focused on highlighting physical obstacles, and it is observed that vehicles have a large impact on the optimized propagation of safety information in VANETs through the continuous obstruction of links between two communicating devices. Moreover, the obstructing effect has various impacts on road safety and diminishes the effective coverage of safety-related information; however, so far it has not been addressed in an efficient manner. Here, broadcast efficiency is first defined as a metric, the problem of mitigating its loss for safety-related information is extensively investigated, and a graph-theoretic optimization technique is adopted to overcome the issue. A maximum broadcast-efficient relaying (MBER) algorithm is developed for distributed optimization in VANETs; MBER helps in maximizing the operative information coverage and also tries to meet certain requirements such as the reliability constraint by incorporating propagation distance and broadcast efficiency into the relay contention. Furthermore, the algorithm is evaluated, and it is observed that MBER promotes the effective coverage of safety-related information across varying vehicular distributions in vehicular ad hoc networks. In [16], the author focused on the V2V radio channel characteristics of ramp conditions with different structures; the ramps are divided into various construction structures. The first structure is a bridge ramp with soundproof walls in an urban area, and the second is a general ramp without soundproof walls in a sub-urban region. Moreover,
the whole propagation process of the radio signal is divided into different propagation zones while considering the line of sight (LOS); further, the propagation characteristics, including shadow fading, propagation path loss, RMS delay spread, average fade duration, level crossing rate, fading distribution, and fading depth, are estimated. Furthermore, in accordance with these different characteristics, various ramp conditions are compared and the following observations are made. (1) In the urban bridge ramp condition, an abrupt fluctuation indicates the significance of soundproof walls in the radio channel of vehicle-to-vehicle communication. (2) Frequent changes in the received signal strength and in various fading parameters across different propagation environments are observed in the ramp scenario of the sub-urban environment. Moreover, the statistical features are optimized and fitted subject to a certain generalization error; hence, the propagation path loss is exhibited through the demonstration of path loss parameter differences in a given operating environment.
In [17], the author tries to achieve reliable communication; hence, it is observed that the features of the wireless channel need to be analyzed properly. Here, the author mainly focuses on the V2V radio channel characteristics at 5.9 GHz under an overtaking scenario; further, they are analyzed through empirical results under four environment and network conditions. However, the primary concern is the difference in channel characteristics between non-overtaking and overtaking scenarios; hence, the non-overtaking and overtaking points are divided based
on the small-scale fading distribution, and it is further observed that the average fade duration and root-mean-square delay spread are significantly higher than in non-overtaking scenarios, whereas the level crossing rate and root-mean-square Doppler spread are lower than in non-overtaking conditions. Moreover, [18] considered variation in the velocity of the communicating vehicles; a generic model was designed considering various parameters such as path powers, path delays, arrival angles, departure angles, and Doppler frequencies, and these parameters are analyzed and simplified through a Taylor series. The aim was a modeling mechanism that can be applied to real-time vehicle-to-vehicle communication and that explicitly reveals the impact of velocity variation on the channels. In [19], the author designed a 3D geometry-based stochastic model of irregular shape for vehicle-to-vehicle communication scenarios; here, a multiple-input multiple-output mechanism was used at the transmitting device. Further, a time-variant geometric path length is developed for capturing the non-stationarity caused by the transmitting and receiving devices. Moreover, it is observed that the author focuses on investigating the impact of the relative movement time and directions on the respective channel state information. Similarly, [20] observed that the multipath components in dynamic clusters are not modeled ideally in existing models; hence, multipath component cluster distributions in both the horizontal and the vertical dimension are considered. An expectation-maximization algorithm is introduced for extracting the multipath components, and identification and tracking are carried out by developing clustering and tracking methodologies. Moreover, the MPC clusters are divided into two distinct categories, i.e., scatter clusters and global clusters; the cluster distribution is further categorized through various inter- and intra-cluster parameters. It is observed that the elevation spread and azimuth spread both follow a lognormal distribution.
From the survey, it is seen that a number of radio propagation models have been presented considering different scenarios, the presence of obstacles, and environmental conditions. The 3D geometric models are very effective in modeling the VOE; however, they induce higher computation overhead. Further, a number of 2-way and 3-way knife-edge models have been presented addressing large-scale fading issues, but they do not address small-scale fading under varied environment scenarios. In addition, very limited work has been carried out on designing a distributed MAC employing VOE. To overcome these research issues in modeling VOE under different environmental conditions, this work presents, in the next section, a radio propagation and distributed MAC model for different environment conditions such as urban, rural, and highway.
This section presents the obstacle-aware radio propagation (OARP) model for dynamic environment conditions such as urban, rural, and highway. Let us consider a set of vehicles of different sizes moving in different regions, as shown in Fig. 1.
Let us assume that each vehicle has a homogeneous communication radius described by the notation S_y and that these vehicles can communicate with one RSU or with other vehicles at a given instant of time. Each vehicle transmits H packets of size N and passes through the radio propagation environment with a set of vehicles A = {1, . . . , A}. Let M describe the average number of vehicles (i.e., the average vehicle arrival rate within the coverage area) passing through the radio propagation environment, following a Poisson distribution. The vehicle speed and density are described by u and l, respectively. The vehicle speed for a certain vehicle density l can be obtained using the following equation
u = u_k (1 − l/l↑), (1)
where u_k depicts the speed of vehicles under the Poisson distribution and l↑ is the maximum feasible vehicle density in the radio propagation environment. Therefore, M can be estimated using the following equation
M = lu. (2)
The maximum number of vehicles P that can be admitted by a certain vehicle or RSU y can be obtained using the floor function in the following equation

P↑,y = ⌊2 S_y l↑⌋, ∀y ∈ A. (3)
The N-th slot time at which the vehicle will be in the communication region of a neighboring vehicle can be obtained using the following equation

V(y, N) = Σ_{x=0}^{y−1} N_x + N, ∀N ∈ {1, . . . , N_y}, (5)

where N_0 = 0. The timeline representation of the time slots of the y-th device is described using the following equation

N_y = {V(y, 1), . . . , V(y, N_y)}. (6)
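To make Eqs. (5) and (6) concrete, the short MATLAB sketch below computes the global slot indices of one device from assumed per-device slot counts; the counts themselves are placeholders.

% Slot indices of Eqs. (5)-(6): the global index of the N-th slot of device
% y is the sum of the slot counts of the previous devices plus N.
Ny = [0 4 3 5];                  % Ny(k) stores N_(k-1): N_0 = 0, then devices 1..3
V  = @(y, N) sum(Ny(1:y)) + N;   % Eq. (5): global index of slot N of device y

timelineDev2 = arrayfun(@(N) V(2, N), 1:Ny(3));   % Eq. (6) for device y = 2
disp(timelineDev2)               % prints 5 6 7: slots after device 1's four slots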
Further, to maximize the resource utilization of the VANET, the slots are selected so as to maximize the system throughput. Let the slot assignment decision be e_XN and the throughput attainable by each vehicle X in the vehicular ad hoc network be S_X. Here, e_XN is set to 1 provided slot N is assigned to vehicle X; otherwise, e_XN is set to 0. Therefore, the throughput maximization problem is described using the following equation
max_E Σ_{X=1}^{R} S_X, (7)

where R depicts the overall number of vehicles in the VANET. Further, the slot assignment constraint is described using the following equation

Σ_{X=1}^{R} e_XN = 1, ∀N. (8)
Thus, this paper computes the attainable throughput of vehicle X under a slot assignment as follows. Let V_X describe the slots allocated to vehicle X and let l_XN describe the probability that slot N is reachable by vehicle X. For simplicity, this paper assumes that the l_XN are mutually independent. As a result, S_X is estimated using the following equation
S_X = 1 − Π_{N∈V_X} l̄_XN = 1 − Π_{N=1}^{T} (l̄_XN)^{e_XN}, (9)

where 1 − Π_{N∈V_X} l̄_XN depicts the probability that at least one slot is reachable for vehicle X. The parameter l̄_XN, the probability that slot N is not reachable for vehicle X, is computed using the following equation

l̄_XN = 1 − l_XN. (10)
This is because every vehicle can use at least one of its assigned slots, so its maximum achievable throughput will be 1 under the different radio propagation environments for a given data rate.
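The sketch below evaluates Eqs. (7)-(10) in MATLAB for random reachability probabilities and assigns each slot with a simple greedy rule; the greedy rule and all probability values are illustrative assumptions, not the optimization procedure of the paper.

% Greedy evaluation of the slot-assignment throughput of Eqs. (7)-(10).
R = 4;  T = 8;                   % vehicles and time slots (assumed)
l    = rand(R, T);               % l(X,N): prob. slot N is reachable by vehicle X
lbar = 1 - l;                    % Eq. (10): prob. slot N is NOT reachable
e    = zeros(R, T);              % slot-assignment decisions e(X,N)
miss = ones(R, 1);               % running product of lbar over assigned slots

for N = 1:T
    gain = miss .* (1 - lbar(:, N));         % throughput increase if slot N is added
    [~, X] = max(gain);                      % give the slot to the best vehicle
    e(X, N) = 1;                             % Eq. (8): each slot to exactly one vehicle
    miss(X) = miss(X) * lbar(X, N);
end
S = 1 - miss;                    % Eq. (9): per-vehicle throughput
fprintf('Total throughput (objective of Eq. (7)): %.3f\n', sum(S));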
Different environments have different shading, path loss, and shadowing components. Thus, it is important to model such effects to improve the packet transmission performance. This work addresses the influence of the path loss component on the channel attenuation. For a given slot time n, the bandwidth can be estimated using Eq. (11), where C depicts the bandwidth of the vehicular ad hoc network, G depicts the communication power, P_0 the power spectral density of the zero-mean Gaussian noise, r_n the distance between the communicating vehicles at time slot n, and α the path loss component. For evaluating α in Eq. (11), the log-normal shadowing model described in [21] is used, and the path loss and the signal-to-noise ratio SNR(r) in dB at a receiver located a distance r from the sender are obtained accordingly, where P_t depicts the power required for processing the packet, PL(r_0) depicts the path loss at a reference distance r_0, X_σ depicts a zero-mean Gaussian random variable with standard deviation σ, and P_n depicts the noise level in decibel watts. Further, this work considers the VOE in the channel model [22, 23] to improve the log-normal shadowing model. This paper treats neighboring devices as the source of VOE when building the obstacle-aware radio propagation model. In the OARP model, first, the set of vehicles that would be affected by VOE between transmitting vehicle x and receiving vehicle y is described as obtProbAff(x, y); if the distance between vehicle x and vehicle y is larger than the distances to the vehicles lying between them, those intermediate vehicles are said to be probably obstructing. Second, the vehicles that actually obstruct the LOS between vehicle x and vehicle y are chosen from the probable candidate obstructing vehicles established in the previous step, described using the notation obtLOSaff([ProbableAff]). Further, it must be noted that the transmitted signal may get obstructed because of the obstructing effects within the first Fresnel zone ellipsoid
where W depicts the wavelength. Finding the heights of all possible obstructing vehicles plays a significant part before transmitting packets. Further, it is noted that a vehicle will obstruct the link between vehicle x and vehicle y provided its height is greater than z. Thus, the probability of VOE between vehicle x and vehicle y is estimated using the corresponding equation, where L depicts the probability of VOE caused by a vehicle (i.e., an obstacle) between the transmitter vehicle and the receiver vehicle, ϕ_z depicts the mean of the heights of the obstructing vehicles, ω_z depicts the standard deviation of the heights of the obstructing vehicles, and Q(·) depicts the Q-function.
Finally, the additional attenuation of the received signal power due to the VOE of the obstructing vehicles established in the prior step is estimated using the notation obtAttenuation([AffDevices]). This work uses the multiple knife edge (MKE) model: using MKE, the candidate VOE vehicles are obtained, and based on their distances and heights the attenuation is computed. The OARP model computation of the additional attenuation between transmitter vehicle x and receiving device y, considering the presence of multiple obstacles due to neighboring vehicles, is described in the flow diagram in Fig. 2.
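Since the exact expressions for the Fresnel clearance and the obstruction probability are not reproduced above, the MATLAB sketch below uses one common form of both: the clearance height z is taken from 60% of the first Fresnel zone, and the obstruction probability applies the Q-function to an assumed Gaussian distribution of neighbouring-vehicle heights. All numeric values and the criterion itself are illustrative assumptions rather than the paper's model.

% Hedged sketch of the VOE obstruction probability (assumed model).
f   = 5.9e9;                     % carrier frequency (Hz)
W   = 3e8 / f;                   % wavelength
d1  = 30;  d2 = 20;              % obstacle-to-transmitter / obstacle-to-receiver distance (m)
hTx = 1.5; hRx = 1.5;            % antenna heights (m)

r1   = sqrt(W * d1 * d2 / (d1 + d2));        % first Fresnel zone radius at the obstacle
hLos = hTx + (hRx - hTx) * d1 / (d1 + d2);   % LOS height at the obstacle position
z    = hLos - 0.6 * r1;          % clearance height keeping 60% of the zone free

phiZ = 1.8;  omegaZ = 0.4;       % assumed mean / std of obstructing-vehicle heights (m)
Qfun = @(v) 0.5 * erfc(v / sqrt(2));         % Q-function via the complementary error function
L    = Qfun((z - phiZ) / omegaZ);            % prob. an obstructing vehicle exceeds z
fprintf('Obstruction probability L = %.2f\n', L);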
The proposed obstacle-aware radio propagation and distributed MAC model attains significant performance improvement over the existing model under different environmental conditions, as shown experimentally in the next section.
[Figure: results plotted versus simulation time (s)]
Table 1 Simulation parameters used for experiment analysis

Simulation parameter | Configured value
Vehicular ad hoc network size | 50 m * 50 m
Number of vehicles | 20–60
Modulation scheme | Quadrature amplitude modulation-64
Mobility of devices | 3 m/s
Coding rate | 0.75
Bandwidth | 27 Mb/s
Number of channels | 7
Time slot | 8 µs
Message information size | 20 bytes
Medium access control type used | TECA, DMAC
Fig. 5 Throughput performance attained by the proposed distributed MAC under denser environmental conditions considering the varied number of vehicles (20, 40, and 80)
Fig. 6 Collision performance (number of packets collided) attained by the proposed distributed MAC under denser environmental conditions considering the varied number of vehicles (20, 40, and 80)
The proposed DMAC achieved a much better collision outcome than the existing MAC irrespective of vehicle density. The significant throughput gain and collision reduction achieved by the proposed DMAC model under dynamic environmental conditions arise because slots are assigned to the vehicle that maximizes throughput using Eq. (7), and the bandwidth is optimized in Eq. (11) based on the signal-to-noise ratio considering the obstructing effect among communicating devices. In the existing model, on the other hand, the slot is assigned to a vehicle based only on resource availability, so a vehicle cannot maximize system throughput. Further, the existing models consider a simple attenuation model without accounting for multiple obstructing devices in the LOS between statically associated devices; however, the obstructing effects in real environments vary significantly and therefore require a dynamic obstruction measurement model. Consequently, the results show that the existing models induce high packet loss. From the results achieved, the proposed DMAC can be concluded to be robust under varied vehicle densities and radio propagation environments, as it brings a good tradeoff between reducing collisions and improving throughput.
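For illustration only, a greedy slot-assignment loop in the spirit of the DMAC idea described above might look as follows; estimated_throughput() and its inputs are hypothetical placeholders, and the paper's Eq. (7) is not reproduced here.

```python
# Illustrative sketch only: give each slot to the vehicle that maximizes the
# estimated throughput. estimated_throughput() is a stand-in for Eq. (7).
from typing import Dict, List


def estimated_throughput(vehicle: int, slot: int, snr_db: Dict[int, float]) -> float:
    # A real implementation would evaluate Eq. (7) using the obstacle-aware SNR;
    # here a simple SNR-based proxy stands in for it.
    return max(snr_db.get(vehicle, 0.0), 0.0)


def assign_slots(vehicles: List[int], slots: List[int],
                 snr_db: Dict[int, float]) -> Dict[int, int]:
    # Greedily give each slot to the still-unassigned vehicle with the best estimate.
    assignment: Dict[int, int] = {}
    unassigned = set(vehicles)
    for slot in slots:
        if not unassigned:
            break
        best = max(unassigned, key=lambda v: estimated_throughput(v, slot, snr_db))
        assignment[slot] = best
        unassigned.remove(best)
    return assignment


print(assign_slots(vehicles=[1, 2, 3], slots=[0, 1], snr_db={1: 12.0, 2: 18.5, 3: 7.3}))
```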
3 Results and Discussion
This section discusses the results and significance of the proposed radio propagation environmental and distributed MAC model over the existing models [24, 26]. Table 2 shows the comparison of the proposed approach with state-of-the-art models. [15] presented MBER by considering the presence of multiple obstacles in the LOS of communicating vehicles; however, performance was evaluated under the highway environment only. Further, a number of radio propagation models have been presented for addressing the obstacle effect in the LOS among communicating devices [17–20]. However, these models aim at reducing propagation delay by adopting 3D geometry and, as a result, induce high computational overhead. Moreover, these models are not simulated under different environmental conditions; thus, they are less realistic. On the other side, this paper presented an efficient radio propagation model that considers the obstructing effect between communicating vehicles. Further, the experiments conducted in [24] show high packet collision under a multichannel environment. To address this, [26] presented a throughput-efficient channel access model. However, these models induce slightly higher collision and fail to maximize system throughput. To address this, this paper presented a
distributed MAC design that maximizes throughput with minimal collision overhead. From the overall results attained, it is seen that the proposed model achieved much higher throughput with fewer collisions when compared with the existing models. Thus, the proposed MAC model brings a good tradeoff between maximizing throughput and minimizing collisions.
5 Conclusion
First, this work analyzed various existing works recently presented for addressing the VOE among communicating vehicles. Various radio propagation methodologies considering obstacles in the line of sight of communicating vehicles have been presented with good results. However, these models are not applicable for simulation under practical or real-time environments. Further, the adoption of 3D geometry statistical models induces high computational complexity under dynamically changing vehicular environments. To address these research problems, this paper first presented an obstacle-aware radio propagation model. Further, the impact of the obstructing effect on communication is tested under different environmental conditions such as urban, rural, and highway; no prior work has considered such an evaluation. Further, this paper presented a distributed MAC model that maximizes throughput with minimal contention overhead. The OARP model is incorporated into DMAC and is thus able to optimize the slot time dynamically, aiding system performance. Experiments are conducted by varying the vehicle density.
References
16. Jiang H, Zhang Z, Wu L, Dang J (2018) Novel 3-D irregular-shaped geometry-based channel
modeling for semi-ellipsoid vehicle-to-vehicle scattering environments. IEEE Wirel Commun
Lett 7(5):836–839
17. Yang M et al (2019) A Cluster-based three-dimensional channel model for vehicle-to-vehicle
communications. IEEE Trans Veh Technol 68(6):5208–5220
18. Manzano M, Espinosa F, Lu N, Shen X, Mark JW, Liu F (2015) Cognitive self-scheduled mechanism for access control in noisy vehicular ad hoc networks. Math Probl Eng 2015, Article ID 354292
19. Hrizi F, Filali F (2010) simITS: an integrated and realistic simulation platform for vehicular
networks. In: 6th international wireless communications and mobile computing conference,
Caen, France, pp 32–36. https://doi.org/10.1145/1815396.1815404
20. Han Y, Ekici E, Kremo H, Altintas O (2017) Throughput-efficient channel allocation algorithms in multi-channel cognitive vehicular networks. IEEE Trans Wirel Commun 16(2):757–770
21. Huang R, Wu J, Long C, Zhu Y, Lin Y (2018) Mitigate the obstructing effect of vehicles
on the propagation of VANETs safety-related information. In: 2017 IEEE intelligent vehicles
symposium (IV), Los Angeles, CA, pp 1893–1898
22. Li C et al (2018) V2V radio channel performance based on measurements in ramp scenarios at 5.9 GHz. IEEE Access 6:7503–7514
23. Chang F, Chen W, Yu J, Li C, Li F, Yang K (2019) Vehicle-to-vehicle propagation channel performance for overtaking cases based on measurements. IEEE Access 7:150327–150338
24. Li W, Chen X, Zhu Q, Zhong W, Xu D, Bai F (2019) A novel segment-based model for non-stationary vehicle-to-vehicle channels with velocity variations. IEEE Access 7:133442–133451
25. Jiang H, Zhang Z, Wu L, Dang J (2018) Novel 3-D irregular-shaped geometry-based channel modeling for semi-ellipsoid vehicle-to-vehicle scattering environments. IEEE Wirel Commun Lett 7(5):836–839
26. Yang M et al (2019) A cluster-based three-dimensional channel model for vehicle-to-vehicle communications. IEEE Trans Veh Technol 68(6):5208–5220
27. ITU-R (2019) Propagation by diffraction. International Telecommunication Union Radiocommunication Sector
Decision Making Among Online Product
in E-Commerce Websites
Abstract In the present era, customers are mainly engrossed in product-based systems, and to make their effort easier, people increasingly trust Internet marketing. Exploiting this public interest, product-based systems engage in numerous activities, which may be legal or illegal. For this reason, decision making among products on e-commerce websites involves considerable ambiguity. From this perspective, this paper provides an analysis of how to evaluate customer reviews. It deals with deciding how to manage the customer experience in marketing and presents how to analyze online product reviews. The framework aims to distill large volumes of qualitative data into quantitative insights on product features so that designers can make more informed decisions. This paper sets out to identify customers' likes and dislikes found in reviews to guide product development.
1 Introduction
Data is gathered from different sources such as surveys, interviews, etc. The customer and their needs play the key role in designing a product, and the product must satisfy those needs. Nowadays, customers can review all aspects of products on e-commerce websites. Big data is therefore needed by product designers.
Figure 1 shows the customer's perception analysis of an online product; it illustrates how a customer visualizes and explores a product online. Generally, a customer first lists the relevant product websites and then considers the few interesting websites that they trust.
The user then enters the specific website URL and inspects the interface; if a login is required as a precondition, the customer fills it in and proceeds. Through this login, the user can see whether the website gives them an individual identity. After logging in, the user scans the items displayed on the first screen and searches for the key product in the search engine. The user then selects a product and looks into its specifications and features. If the user finds all the features satisfactory, the user decides whether to buy the product; if no item is found satisfactory, the user browses to the next website. In this way, the user travels from one online website to another.
vehicle age, vehicle price, and police report status, naive Bayes can provide probability-based classification of whether a claim is genuine [4]. Bayes' theorem provides a way to calculate the posterior probability p(a|b) from p(a), p(b), and p(b|a).
Probability Rule
The conditional probability of event a occurring, given that event b has already occurred, is written as p(a|b):

\[ P(a \mid b) = \frac{P(b \mid a)\, P(a)}{P(b)} \qquad (1) \]
The naïve Bayes classifier finds the prior probability and the likelihood for each feature given the class labels, and obtains the posterior probability from the above formula. The class label with the highest posterior probability is the result of the prediction [5]. Naive Bayes includes all predictors using Bayes' rule under the independence assumption between predictors [6].
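As a compact restatement of the decision rule just described (the standard naive Bayes form, stated here for clarity rather than quoted from the paper):

\[ \hat{c} = \arg\max_{c}\; P(c)\prod_{i=1}^{n} P(x_i \mid c), \qquad \text{since}\quad P(c \mid x_1,\dots,x_n) \propto P(c)\prod_{i=1}^{n} P(x_i \mid c). \]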
3 Visualization
The customer can analyze the occurrence counts of the product websites preferred by end-users. From Fig. 3, customers can see that Amazon is the most preferred website for that product, followed by Flipkart and so on (Fig. 4).
4 Implementation
1. Dataset collection: the dataset is collected through a Google form in which the different attributes of the product are rated with different values. The form mainly captures the analysis of customers who have already bought the product from some online website.
2. Data validation is the process of splitting the dataset with reference to the customer-specified products, removing all variables unnecessary for the specified product, and ordering the dataset according to the websites.
3. In the planning stage, the dataset is divided into two data frames: a training dataset and a testing dataset. The training dataset is used to train the algorithm, whereas the testing dataset is used to test the trained model.
4. In the modeling stage, the training dataset is fed to the classification algorithm together with the target variable of the dataset.
5. Prediction is performed by considering both the trained model and the testing dataset; the prediction is used to test the model against the testing data.
6. The confusion matrix displays the table of values that are correctly and incorrectly predicted by the model on the testing dataset. It is a 2 × 2 matrix with labels 0 and 1, where the [0, 0] and [1, 1] entries show correctly predicted values and the [0, 1] and [1, 0] entries show incorrect predictions (a minimal sketch of steps 3–6 follows this list).
7. If a new customer wants to buy the product, the result analysis shows the result in deployment.
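A minimal sketch of steps 3–6 with scikit-learn is given below; the CSV file name, the "buys_product" column, and the 25% test ratio are hypothetical stand-ins, not the authors' actual dataset or settings.

```python
# Hypothetical end-to-end sketch: split, train naive Bayes, build confusion matrix.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix, accuracy_score

df = pd.read_csv("product_reviews.csv")                        # exported Google-form data (assumed)
X = df.drop(columns=["buys_product"]).select_dtypes("number")  # rated product attributes
y = df["buys_product"]                                         # target: 1 = buys, 0 = does not

# Step 3: split into training and testing data frames.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Step 4: train the classifier on the training dataset.
model = GaussianNB().fit(X_train, y_train)

# Steps 5-6: predict on the test set and build the 2 x 2 confusion matrix;
# the diagonal entries [0, 0] and [1, 1] count the correct predictions.
y_pred = model.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print("accuracy:", accuracy_score(y_test, y_pred))
```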
Figure 5 shows the result for a specific product using the naive Bayes algorithm, and Fig. 6 shows the result for the same product using the SVM algorithm.
By applying the naive Bayes and SVM algorithms to the dataset, an accuracy value related to the Buys-product target is obtained. This accuracy is calculated from the confusion matrix as the total number of correct predictions on the testing dataset divided by the total number of rows in the dataset. The Buys-product target depends on all the variables present in the dataset. The result indicates the best online product website based on the accuracy obtained for a product of a specific company. Of the two algorithms, naïve Bayes gives the more accurate result. Figure 5 shows Flipkart having the highest review rating of 38%, and Fig. 6 shows Club Factory having the highest review rating of 35%. A new consumer can therefore use this analysis when buying products on an e-commerce website. The analysis is based on independent online product specification reviews given individually by customers. The main application of the paper is to predict the online product website that is best reviewed by end-users.
5 Conclusion
This paper reviewed current online review summarization methods for products, with the goal of knowing how clients buy an item and what their emotional state is when using it. The paper applied naive Bayes and SVM models to produce the target result: the SVM algorithm finds the best hyperplane in the given dataset but does not account for the frequency of occurrences in the data. To overcome this problem, the naive Bayes algorithm is used, since it works with the probability of occurrences over the dataset. Hence, the naïve Bayes algorithm can predict results with better accuracy than SVM.
References
Anuj Kumar
Abstract The Internet of things (IoT) is an expanding field whose participation in different areas such as e-health, retail, and smart transportation grows day by day; devices communicate with each other and with persons to provide different facilities for users and for the overall human community. In this paradigm, communication technologies are combined with modern wireless communications. Communication between devices, and between devices and humans, is made possible by the sensors and wireless sensor networks provided by IoT. Along with these capabilities, IoT brings various challenges. This paper presents an overview of IoT and its application scenarios, the IoT contribution to the health sector, and IoT e-healthcare architecture, and also points out and discusses the various security concerns and objections in IoT-enabled e-health.
1 Introduction
In IoT, a wide area of research is now available, which attracts research scholars. IoT has changed the way humans live. In this paradigm, different types of devices and gadgets are connected in such a manner that they can communicate and transfer information with each other, with the Internet as the medium for this interaction. The research and innovation community on the Internet of things describes IoT as a network infrastructure spread out worldwide with self-configuring capabilities, based on standard rules for exchanging and using information in a large heterogeneous network, where both physical and virtual objects have identities, physical attributes, and virtual characters, use smart interfaces, and are seamlessly integrated into the information network. The ability to exchange and use information in a large heterogeneous network is a special feature of IoT which drives its rapid growth in popularity. IoT has the ability to collect data from connected smart devices or objects and share it with other devices and infrastructure. Through the analysis and processing of this data, little or no human interaction is needed while devices perform their actions. Nowadays, the Internet of things (IoT) has completely changed connectivity from "anytime, anywhere" for "anyone" into "anytime, anywhere" for "anything" [1]. Forming smart cities and smart homes, enabling environmental monitoring, providing a new direction for smart healthcare systems, and adding specific features to transportation, etc., are the objectives of IoT.
A. Kumar (B)
Department of Computer Engineering and Applications, GLA University, Mathura, India
e-mail: anujkumar.gla@gla.ac.in
IoT Applications Some of the IoT applications are given in Table 1 below. Many areas can be seen where IoT provides applications and creates dynamic changes in our lives. In the next section, the smart health concept enabled with IoT is discussed.
1. IoT in e-health—Before the existence of the Internet of things, the traditional healthcare system had some limitations: (1) patients could interact with doctors only by visiting hospitals or sending text messages; (2) no options were available for doctors to take care of and monitor a patient's health 24 hours a day and provide treatment accordingly. IoT solves these problems by providing IoT-enabled medical equipment which makes remote monitoring of patients possible, and meetings with doctors have become very systematic. Presently, IoT is changing the scenario of the medical field by reshaping the space of devices. Many IoT-enabled health applications in the medical field benefit patients, families, doctors, and medical institutions (Fig. 1).
1.1 IoT for patients—IoT provides wearable devices such as fitness bands, CGM, sensors, coagulation testing, etc., which make a patient feel as if a doctor is attending to his/her case personally. IoT brings a big change to the lives of elderly people, since these devices continuously track their health status. The devices contain techniques that provide an alert signal to relatives and the concerned medical practitioners who follow up on people living alone.
1.2 IoT for physicians—Physicians use wearables and other embedded IoT devices. With their help, they can monitor and track their patients' health and any medical needs of the patients. IoT creates a strong and tightly bound relationship between physicians and their patients. Patients' output data come from these devices, and it is very helpful for doctors in identifying diseases and providing the best treatment.
1.3 IoT for hospitals—In hospitals, IoT is used in devices such as defibrillators, tracking devices, etc. IoT-enabled devices also protect patients against infection. IoT devices further act as managers for information such as pharmacy inventory control and environmental monitoring, and also handle humidity and temperature control.
IoT architecture contains four steps, where the output of every step is the input to the next step. These steps are combined into one process, and the final values of the process are used according to the users' needs and for different application areas (an illustrative sketch of this flow is given after the steps below).
Step 1 In the initial step, interconnected devices embedded with sensors, actuators, monitors, detectors, etc., are formed, and this equipment is used for data collection.
Step 2 In this step, the sensors provide data; the data is in analog form, so it must be collected and transformed from analog to digital form for further processing.
Step 3 The digitized and aggregated data of Step 2 is stored in a data center or cloud.
Step 4 In the final step, advanced analytics are applied to this data to manage and structure it so that users can take the right decisions based on it.
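For illustration, a toy sketch of this four-step flow is given below; the sensor reading, digitization rule, and alert threshold are invented for the example and are not taken from any surveyed system.

```python
# Toy sketch of the four-step IoT e-health flow: sense -> digitize -> store -> analyze.
import statistics

def read_sensor() -> float:
    # Step 1: an interconnected device produces a raw (analog) reading.
    return 98.6  # e.g., a body-temperature sample

def digitize(raw: float, resolution: float = 0.1) -> float:
    # Step 2: convert the analog reading into a quantized digital value.
    return round(raw / resolution) * resolution

def store(value: float, datastore: list) -> None:
    # Step 3: aggregate digitized data in a data center / cloud store.
    datastore.append(value)

def analyze(datastore: list, alert_threshold: float = 100.4) -> str:
    # Step 4: run analytics so the user (or clinician) can act on the data.
    avg = statistics.mean(datastore)
    return "ALERT" if avg >= alert_threshold else f"normal (avg={avg:.1f})"

cloud_store: list = []
for _ in range(3):
    store(digitize(read_sensor()), cloud_store)
print(analyze(cloud_store))
```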
Health care with IoT involves various issues and challenges in terms of data security. IoT-enabled connected devices gather a lot of data, which also contains very sensitive information. Data security should therefore be a major concern in this field, and several security and privacy issues have been observed [1].
2 Literature Survey
Tao et al. [4] discussed a healthcare data acquisition technique and studied its security and privacy concerns. The authors proposed a technique that secures the collected data in an IoT-based healthcare system. The secure data scheme was composed of four layers, but the authors contributed to the first three. A secret cipher based on the KATAN algorithm was used for the initial phase and implemented on an FPGA hardware platform, with the secret cipher used to achieve privacy and protection of patients' data. A distributed database technique was applied at the cloud computing layer to achieve privacy of patients' data, and FPGA simulations were used to measure the performance of the secure data scheme in terms of different algorithm parameters. Fan et al. [5] proposed a scheme
which can solve the problems of medical data security and privacy. RFID was used for this purpose because of its data exchange and collection features, and a back-end server was used for the execution. Ciphertext (encoded text) was used in this information exchange process, which makes the process more secure. Tang et al. [6] proposed a secure health data collection scheme in which data is collected from various sources; signature techniques were used to guarantee a fair incentive for contributing patients, and a combination of two cryptosystems was used to preserve data obliviousness, security, and fault tolerance. Key properties of the scheme, such as resistance to attacks and toleration of healthcare center failure, were also discussed. Puthal [7] proposed a static model
for data privacy. Basically, the model restricts the flow of information over huge amounts of streaming data. Two types of lattices were used: a sensor lattice for wearable sensors and a user lattice for users. The static lattices aim to make the model execute as fast as possible. The results show that the model can handle the huge amount of data that arrives in the form of streams with minimal latency and storage requirements. Deebak et al. [8] proposed a secure communication scheme for healthcare applications in which a biometric-based authentication scheme is used for users; when implemented in the NS3 simulator, it gives better results than existing techniques in terms of packet delivery ratio, end-to-end delay, throughput, and routing overhead, which makes the smart healthcare application system more secure.
Minoli et al. [9] proposed a novel IoT protocol architecture and inspected security tools and techniques that could be applied as part of IoT deployments; the authors note that these techniques are most important in e-health and special-care facilities such as nursing homes. Tamizharasi et al. [10] discussed various architectural models and access control algorithms for IoT-enabled e-health systems. They further presented a comparative analysis of different architecture segments and their security measures, and finally advised the most appropriate techniques for IoT-enabled e-medical care systems. Koutli et al. [11] first surveyed the field of e-health Internet of things and identified the security requirements and challenges in IoT applications. They then proposed an architecture based on the VICINITY framework, which also incorporates General Data Protection Regulation (GDPR) compliance features intended to provide secure e-medical facilities to old- and middle-aged people. Finally, they highlighted the design aspects of this architecture and the security and privacy needs of the system. Rauscher and Bauer [12] proposed
a safety and security analysis approach consisting of a standardized meta-model and an IoT safety and security framework embracing a customized analysis language. Boussada et al. [13] proposed a new privacy-preserving e-health solution over NDN. All privacy and security requirements were achieved by this architecture, focusing on the dependency named AND_aNA; a security analysis was carried out to prove the robustness of the proposal, and a performance evaluation shows its effectiveness. Finally, the simulation results disclose that the technique has an acceptable transmission delay and involves negligible overhead. Almulhim and Zaman [14] proposed a secure authentication scheme, where
a group-based lightweight credential scheme is used for IoT-based e-health applications; the proposed model contains various specific features such as mutual authentication, energy efficiency, and low computation for healthcare IoT-based applications. To achieve these features, the elliptic curve cryptography (ECC) concept was used. Savola et al. [15] proposed a new set of rules for security objective decomposition aimed at security metrics definition; systematically defined and managed security metrics provide a higher level of effectiveness in security controls, permitting informed, risk-driven security decision making. Islam et al. [16] proposed an intelligent collaborative security model to minimize security risk; they also discussed how different new technologies such as big data, ambient intelligence, and wearables are being adopted in a healthcare context. Various relations between IoT and e-health policies and their control across the world are addressed, and a new path and new areas for future research on IoT-based health care, based on a set of open issues and challenges, are provided. Suo et al. [17] discussed the aspects of data security in each layer of the IoT architecture (perception, network, support, and application layers) and identified gaps in power and storage as well as other issues such as DDoS attacks, authentication, confidentiality, and privacy protection in the IoT architecture; all these issues and challenges are elaborated briefly. Qiang et al. [18] focused on issues such as network transmission of information security, wireless communication and information security, RFID tag information security, and privacy protection, and found challenges such as RFID identification, communication channel, and RFID reader security issues. Chenthara et al. [19] discussed a security model which works for electronic
health health records (EHR). They also focused on the points identified after studying the research on EHR approaches published over the previous two decades, and further explained the techniques that can maintain the integrity and other basic data security measures of any patient's EHR. Chiuchisan et al. [20] discussed major data security concerns such as confidentiality and integrity in healthcare systems, and also surveyed information protection in terms of the different security measures and communication techniques used. Some security issues that arise when monitoring and other services are performed for patients with specific diseases such as Parkinson's were further explained. Abbas and Khan [21] focused on cloud facilities for health information records, such as data storage centers. The authors described the state of the art in cloud support for health records and explained points such as classification and taxonomy, which were derived after surveying different privacy-preserving techniques. Further, the strengths and weaknesses of these techniques were examined, and some new challenges were also given for new research scholars. Ma et al. [22] proposed a new technique for e-health applications based on compression, combining two techniques: adaptive Fourier decomposition (AFD) and symbol substitution (SS). In the initial step, AFD performs lossy compression of the data; SS then performs lossless compression. The hybridization of both techniques was very effective in terms of CR and PRD and gave more valuable results. Idoga et al. [23] highlighted the points that affect healthcare consumers. For identification, the data was applied to various measures to find the structural path model, and the development of healthcare centers using the cloud was also studied. After applying the data to various models, it was analyzed with specific measures such as social science tools and LISREL. Pazos et al. [24] proposed a new programmable framework that addresses the fragmentation process. The overall process flows with the help of communicating agents, which use different sets of rules for communication between devices. In this framework, the communication agent is developed according to a given specification and is feasible in all terms of security and expandability. Maurin et al. [25] discussed the objects that are exchanged and communicate on the Internet, focusing on the security features of these objects and on threats such as cyber risk and vulnerabilities that break their security shield. An overview of solutions to these problems is further explained, together with the requirements that must be adopted from business and market perspectives. Karmakar et al. [26] proposed an SDN-based architecture for the Internet of things. It is based on an authentication scheme, meaning that only authenticated devices are allowed to access the specific network; a lightweight protocol was used for authentication. It also works for secure flows in the network, and the combination of both concepts makes the system more secure against malicious attacks and threats. Chung et al. [27] described IoT issues and explored some challenges, and further proposed a new system for security needs. The authors state that security features can be added to existing systems without regenerating them: old features of the system are exchanged with newly arriving features without any renewal of the system. Borgia [28] explored new security techniques in M2M communication, routing, end-to-end reliability, device management, data management, and IoT security, which make the overall process more secure. The author further found some challenges in this field arising when IoT-enabled devices and objects communicate, with privacy and security issues at the time of data transmission. Xiaohui [1] explained the concepts of IoT and then the security and privacy issues and challenges faced in the IoT field. At transmission time, the author found two types of security issues, namely wireless sensor network security problems and information transmission and processing security, and highlighted other threats such as counterfeit attacks and malicious code attacks. Keoh et al. [29] described the four nodes on which IoT devices are based, described standard security rules, and mainly focused on communication security for IoT. Some challenges such as interoperable security and datagram layer security are also explained.
See Table 2.
4 Motivation
Health is the subject of greatest concern to human beings, and e-health with IoT is a notable area of the future Internet with a vast effect on community life and trade. IoT applications and services belong to the health sector as well as to other sectors. There are security issues in applications of this field, which are elaborated in Table 2. To secure the IoT environment against those issues, a new architecture or mechanism is needed for these application areas; with the help of such a mechanism, the small holes that arise in security terms such as authentication, confidentiality, and data integrity in IoT-embedded fields can be closed. The main motivation behind this survey is to provide a detailed study of the e-health system with IoT and other related IoT applications, and to find the security issues and challenges in the field of e-health with IoT.
5 Conclusion
IoT has brought big changes in the usage of the Internet and has also opened new opportunities for research scholars in the real world. Although a lot of research has been done on IoT, its IoT-based application areas are still open. In this paper, a study of data security issues in the e-health system with IoT has been carried out.
Table 2 (continued)
Author | Description | Issues and challenges
Boussada et al. [13] | Discussed named data networking (NDN) nodes exchanging; identity-based cryptography (IBC); e-health solutions | Privacy issues over NDN; comparison with IP solutions; simulation conduction
Almulhim and Zaman [14] | Lightweight authentication scheme; ECC principles; comparable level of security; group-based authentication scheme/model | Middle attack; unknown key sharing attacks; increased number of users' access points; security issues
Savola et al. [15] | Explored security risk; discussed heuristics for security objective decomposition; systematically defined and managed security metrics | Hierarchy of security metrics; more detailed security objectives for the target system
Boussada et al. [30] | A novel cryptographic scheme, PKE-IBE, based on identity-based cryptography (IBC); tackles the key escrow issue and ensures blind partial private key generation | Contextual privacy requirements; sensibility of exchanged data; secure session key transmission
Islam et al. [16] | Surveys advances in IoT-based healthcare technologies; analyzes distinct IoT security and privacy features; discussed security requirements, threat models, and attack taxonomies | Standardization; IoT healthcare platforms; cost analysis; the app development process; data protection; network type; scalability
Suo et al. [17] | Explained the security issues arising in all four layers of the IoT architecture | Storage issues; attacks like DDoS; basic security needs like authentication, confidentiality, access control, etc.
Qiang et al. [18] | Discussed RFID tag information security; wireless communication and information security; network transmission of information security; privacy protection | RFID identification, communication channel, and RFID reader security issues; radio signal attacks; Internet information security; private information security
(continued)
Table 2 (continued)
Author | Description | Issues and challenges
Chenthara et al. [19] | Discussed EHR security and privacy; security and privacy requirements of e-health data in the cloud; EHR cloud architecture; diverse EHR cryptographic and non-cryptographic approaches | Integrity; confidentiality; availability; privacy
Chiuchisan et al. [20] | Explored data security, communication techniques, strategic management, rehabilitation and monitoring with a specific disease | Security issues in communication techniques
Abbas and Khan [21] | Discussed facilities of the cloud for health information records, classification, and taxonomy; reviewed further research work | Secure transfer of the data; attacks like DoS; authentication issues
Ma et al. [22] | Explained the combination of two techniques, adaptive Fourier decomposition (AFD) and symbol substitution (SS), evaluated in terms of CR and PRD | Physical alteration can be possible; there is no access control in the transmission of data
Wang [31] | Worked on securing outsourced data and users' data in data sharing; ensures the privacy of the data owner | The unique working features of IoT-enabled devices create data security issues; mobility, scalability, and the multiplicity of devices
Idoga et al. [23] | Identification; structural path model; data statistics like effort expectancy, performance expectancy, and information sharing | Integration of different techniques creates a challenge for security; secure transfer of the data
Pazos et al. [24] | Discussed a program-enabled framework for fragmentation; flexible communication agents | Fragmentation in terms of communication; protocols and data formats; security and scalability aspects
Maurin et al. [25] | Discussed the objects; threats like cyber risk and vulnerabilities; the solution in terms of business and market perspectives | Communication between IoT objects/machines; compromise of basic security aspects of data; device tampering, information disclosure, privacy breach
(continued)
Table 2 (continued)
Author | Description | Issues and challenges
Karmakar et al. [26] | Explained an SDN-based architecture using an authentication scheme and a lightweight protocol | Malicious attacks and threats; security challenges
Chung et al. [27] | Discussed an on-demand security configuration system; worked on unexperienced challenges in security issues | No proper pre-preparation for handling security threats; no techniques for authentication; compromise of data security and privacy
Borgia [28] | Explored security in terms of IoT device management and the security of data, network, and applications | Authentication; privacy; data security
Xiaohui [1] | Discussed wireless sensor network security problems and information transmission and processing security | Counterfeit attacks, malicious code attacks
Keoh et al. [29] | Discussed standardization, communication security, and transport layer security | Security issues at the time of exchanging and using information by devices
Lots of researchers have already provided data security solutions in IoT, but there is still a need for more security solutions in IoT application fields such as smart home, e-health, retail, etc. As an output of this survey, many issues and challenges have been found that leave small holes in data security in the field of e-health with IoT, such as denial of service, man-in-the-middle, identity and data theft, social engineering, advanced persistent threats, ransomware, and remote recording. Many researchers have given solutions, but they are not sufficient; day by day, new issues and challenges appear in front of researchers, so more research should be done in this field.
References
1. Xiaohui X (2013) Study on security problems and key technologies of the Internet of Things.
In: International conference on computational and Information Sciences, 2013, pp407–410
2. Mathuru GS, Upadhyay P, Chaudhary L (2014) The Internet of Things: challenges & security
issues. In: IEEE international conference on emerging technologies (ICET), 2014, pp 54–59
3. Atzori L, Iera A, Morabito G (2010) The Internet of Things: a survey. Elsevier Comput Netw 2787–2805
4. Tao H, Bhuiyan MZA, Abdalla AN, Hassan MM, Zain JM, Hayajneh T (2019) Secured data
collection with hardware-based ciphers for IoT-based healthcare. IEEE Internet of Things J
6(1):410–420. https://doi.org/10.1109/JIOT.2018.2854714
5. Fan K, Jiang W, Li H, Yang Y (2018) Lightweight RFID protocol for medical privacy protection
in IoT. IEEE Trans Ind Inf 14(4):1656–1665. https://doi.org/10.1109/TII.2018.2794996
6. Tang W, Ren J, Deng K, Zhang Y (2019) Secure data aggregation of lightweight E-healthcare
IoT devices with fair incentives. IEEE Internet of Things J 6(5):8714–8726. https://doi.org/10.
1109/JIOT.2019.2923261
7. Puthal D (2019) Lattice-modeled information flow control of big sensing data streams for smart
health application. IEEE Internet of Things J 6(2):1312–1320. https://doi.org/10.1109/JIOT.
2018.2805896
8. Deebak BD, Al-Turjman F, Aloqaily M, Alfandi O (2019) An authentic-based privacy preser-
vation protocol for smart e-healthcare systems in IoT. IEEE Access 7:135632–135649. https://
doi.org/10.1109/ACCESS.2019.2941575
9. Minoli D, Sohraby K, Occhiogrosso B (2017) IoT security (IoTSec) mechanisms for e-health and ambient assisted living applications. In: 2017 IEEE/ACM international conference on
connected health: applications, systems and engineering technologies (CHASE), Philadelphia,
PA, pp 13–18. https://doi.org/10.1109/CHASE.2017.53
10. Tamizharasi GS, Sultanah HP, Balamurugan B (2017) IoT-based E-health system security:
a vision archictecture elements and future directions. In: 2017 International conference of
electronics, communication and aerospace technology (ICECA), Coimbatore, 2017, pp 655–
661. https://doi.org/10.1109/ICECA.2017.8212747
11. Koutli M et al (2019) Secure IoT e-health applications using VICINITY framework and GDPR
guidelines. In: 2019 15th International conference on distributed computing in sensor systems
(DCOSS), Santorini Island, Greece, 2019, pp 263–270. https://doi.org/10.1109/DCOSS.2019.
00064
12. Rauscher J, Bauer B (2018) Safety and security architecture analyses framework for the
Internet of Things of medical devices. In: 2018 IEEE 20th international conference on e-
health networking, applications and services (Healthcom), Ostrava, 2018, pp 1–3. https://doi.
org/10.1109/HealthCom.2018.853112
13. Boussada R, Hamdaney B, Elhdhili ME, Argoubi S, Saidane LA (2018) A secure and privacy-
preserving solution for IoT over NDN applied to e-health. In: 2018 14th International wireless
communications & mobile computing conference (IWCMC), Limassol, 2018, pp 817–822.
https://doi.org/10.1109/IWCMC.2018.8450374
14. Almulhim M, Zaman N (2018) Proposing secure and lightweight authentication scheme for
IoT based E-health applications. In: 2018 20th International conference on advanced commu-
nication technology (ICACT), Chuncheon-si Gangwon-do, Korea (South), 2018, pp 481–487.
https://doi.org/10.23919/ICACT.2018.8323802
15. Savola RM, Savolainen P, Evesti A, Abie H, Sihvonen M (2015) Risk-driven security metrics
development for an e-health IoT application. In: 2015 Information security for South Africa
(ISSA) Johannesburg, 2015, pp 1–6 https://doi.org/10.1109/ISSA.2015.7335061
16. Islam SMR, Kwak D, Kabir MH, Hossain M, Kwak K (2015) The Internet of Things for
health care: a comprehensive survey. IEEE Access 3:678–708. https://doi.org/10.1109/ACC
ESS.2015.2437951
17. Suo H, Wan J, Zou C, Liu J (2012) Security in the Internet of Things: a review. In: International conference on computer science and electronics engineering, 2012, pp 649–651
18. Qiang C, Quan G, Yu B, Yang L (2013) Research on security issues on the Internet of Things.
Int J Future Gener Commun Netw 1–9
19. Chenthara S, Ahmed K, Wang H, Whittaker F (2019) Security and privacy-preserving
challenges of e-health solutions in cloud computing. IEEE Access 7:74361–74382
20. Chiuchisan D, Balan O, Geman IC, Gordin I (2017) A security approach for health care infor-
mation systems. In: 2017 E-health and bioengineering conference (EHB), Sinaia, 2017, pp
721–724
21. Abbas A, Khan SU (2014) A review on the state-of-the-art privacy-preserving approaches in the
e-health clouds. IEEE J Biomed Health Inf 18(4):1431–1441
22. Ma J, Zhang T, Dong M (2015) A novel ECG data compression method using adaptive Fourier
decomposition with security guarantee in e-health applications. IEEE J Biomed Health Inf
19(3):986–994
23. Idoga PE, Toycan M, Nadiri H, Çelebi E (2018) Factors Affecting the successful adoption
of e-health cloud based health system from healthcare consumers’ perspective. IEEE Access
6:71216–71228
24. Pazos N, Müller M, Aeberli M, Ouerhani N (2015) ConnectOpen—Automatic integration of
IoT devices. In: 2015 IEEE 2nd world forum on Internet of Things (WF-IoT), Milan, 2015,pp
640–644
25. Maurin T, Ducreux L, Caraiman G, Sissoko P (2018) IoT security assessment through the inter-
faces P-SCAN test bench platform. In: 2018 Design, automation & test in Europe conference
& exhibition (DATE), Dresden, 2018, pp 1007–1008
26. Karmakar KK, Varadharajan V, Nepal S, Tupakula U (2019) SDN enabled secure IoT archi-
tecture. In: 2019 IFIP/IEEE symposium on integrated network and service management (IM),
Arlington, VA, USA, 2019, pp 581–585
27. Chung B, Kim J, Jeon Y (2016) On-demand security configuration for IoT devices. In: 2016
International conference on information and communication technology convergence (ICTC),
Jeju, 2016, pp 1082–1084
28. Borgia E (2014) The Internet of Things vision: key features, applications and open issues.
Elsevier Comput Commun 1–31
29. Keoh SL, Kumar SS, Tschofenig H (2014) Securing the Internet of Things: a standardization
perspective. IEEE Internet of Things J 265–275
30. Boussada R, Elhdhili ME, Saidane LA (2018) A lightweight privacy-preserving solution for
IoT: the case of e-health. In: 2018 IEEE 20th international conference on high performance
computing and communications; IEEE 16th international conference on smart city; IEEE 4th
international conference on data science and systems (HPCC/SmartCity/DSS), Exeter, United
Kingdom, 2018, pp 555–562. https://doi.org/10.1109/HPCC/SmartCity/DSS.2018.00104
31. Wang H (2018) Anonymous data sharing scheme in public cloud and its application in e-health
record. IEEE Access 6:27818–27826
32. Sudarto F, Kristiadi DP, Warnars HLHS, Ricky MY, Hashimoto K (2018) Developing of Indone-
sian intelligent e-health model. In: 2018 Indonesian association for pattern recognition inter-
national conference (INAPR), Jakarta, Indonesia, 2018, pp 307–314. https://doi.org/10.1109/
INAPR.2018.8627038
33. Abomhara M, Koien GM (2014) Security and privacy in the internet of things: current status
and open issues. In: IEEE International conference on privacy and security in mobile systems
(PRISMS), 2014, pp1–8.
34. Gubbi J, Buyya R, Marusic S, Palaniswami M (2013) Internet of Things (IoT): a vision,
architectural elements, and future directions. Elsevier Future Gener Comput Syst 1645–1660
35. Al-Fuqaha A, Guizani MM, Aledhari M, Ayyash M (2015) Internet of Things: a survey on
enabling technologies, protocols and applications. IEEE Commun Surv Tutor 17(4):2347–2376
36. Said O, Masud M (2013) Towards Internet of Things: survey and future vision. Int J Comput
Netw (IJCN) 1(1):1–17
37. Matharu GS, Upadhyay P, Chaudhary L (2014) The Internet of Things: challenges & security
issues. In: IEEE, international conference on emerging technologies (ICET), 2014,pp 54–59
38. Granjal J, Monteiro E, Sa Silva J (2015) Security for the Internet of Things: a survey of existing
protocols and open research issues. IEEE Commun Surveys Tutor 17(3):1294–1312
39. Atamli AW, Martin A (2014) Threat-based security analysis for the Internet of Things. In:
International workshop on secure Internet of Things, 2014, pp 35–43
40. Mahmoud R, Yousuf T, Aloul F, Zualkernan I (2015) Internet of Things (IoT) security:
current status, challenges and prospective measures. In: International conference for internet
technology and secured transactions (ICITST), 2015, pp336–341
41. Vasilomanolakis E, Daubert J, Luthra M, Gazis V, Wiesmaier A, Kikiras P (2015) On the
security and privacy of Internet of Things architectures and systems. In: International workshop
on secure internet of things, 2015, pp 49–57
42. Zhang Z-K, Cho MCY, Wang C-W, Hsu C-W, Chen C-K, Shieh S (2014) IoT security: ongoing
challenges and research opportunities. In: IEEE international conference on service-oriented
computing and applications, 2014,pp 230–234
43. Jiang DU, Shi Wei CHAO (2010) A study of information security for M2M of IoT. In: IEEE international conference on advanced computer theory and engineering (ICACTE), 2010, pp
576–579
44. Basu SS, Tripathy S, Chowdhury AR (2015) Design challenges and security issues in the
Internet of Things. In: IEEE region 10 symposium, 2015, pp90–93
45. Miorandi D, Sicari S, De Pellegrini F, Chlamtac I (2012) Internet of Things: vision, applications
and research challenges. Elsevier Ad Hoc Netw 1497–1516
46. Asghar MH, Mohammadzadeh N, Negi A (2015) Principle Application and vision in Internet
of Things (IoT). In:International conference on computing, communication and automation,
2015, pp427–431
47. Chen X-Y, Jin Z-G (2012) Research on key technology and applications for Internet of
Things. In: Elsevier international conference on medical physics and biomedical engineering,
2012, pp561–566
48. Vermesan O, Friess P (eds) (2014) In: Internet of Things—From research and innovations to
market deployment. River Publishers Series in Communication
Applying Deep Learning Approach
for Wheat Rust Disease Detection Using
MosNet Classification Technique
1 Introduction
2 Literature Review
All over the world, wheat is considered one of the most common cereal grains; it comes from a grass genus (Triticum) and is grown in different varieties worldwide. Wheat contains gluten, which can trigger a harmful immune response in susceptible individuals. Nevertheless, people all over the world consume wheat as a staple food due to its rich antioxidants, minerals, vitamins, and fiber. Wheat is a prominent food security crop of Ethiopia, and it earns millions of dollars for the country [3].
Despite the importance of the crop in Ethiopia, wheat is the most commonly rust-infected grain, with three most common rusts: leaf rust, yellow (stripe) rust, and stem rust (Table 1).
This paper works in depth on how to detect the three types of wheat rust disease using the RGB values of 2113 images, combining RGB value segmentation and convolutional neural network approaches. These three types of wheat rust disease are severely harming wheat crops in Ethiopia.
Leaf rust (Puccinia triticina) is one of the common diseases that occur in plants due to fungus [4, 5]. It is also called brown rust and occurs mainly in wheat, barley, and other cultivated crops. By attacking the foliage, leaf rust turns the leaf surface dusty and reddish orange to brown.
Yellow rust (Puccinia striiformis) [4] is also known as wheat stripe rust. It is one of the rust diseases that occur in wheat grown in cool environments, mostly in northern latitudes with a temperature range of 2 to 15 °C. Even though leaf and yellow rust are both categorized under the leaf rust family, they are different races; they are sometimes difficult to distinguish by normal visual inspection and need to be tested in a laboratory.
Stem rust is caused by a fungus like the other rust types (Puccinia graminis) [4] and is a significant disease that affects bread wheat, barley, durum wheat, and triticale.
Deep learning is a subfield of machine learning [6] that studies statistical models called deep neural networks, which can learn hierarchical representations from raw data. The aim is to create a feasible neural network for the detection and recognition of images; the network type used for this purpose is the convolutional neural network (ConvNet) [7, 8]. A ConvNet is widely used for feature extraction from the input image through its processing layers.
First, the research in [9] used 15,000 RGB images manually cropped to a single leaf to detect only the infected area of the crop. These images were used to classify three types of cassava leaf disease, applying different train/validate/test splits in which 10 percent was used for validating the model and the remainder was split between training and testing as 10/80, 20/70, 40/50, and 50/40 percent, respectively. They used Google InceptionV3 and achieved 98% accuracy, but the study could not achieve good performance on random images captured under uncontrolled conditions, which prevents the model from being applied in real-world settings. In the research of [10], GoogLeNet and AlexNet models were used to train on 54,306 images from the PlantVillage Web site, in which GoogLeNet performed better and more consistently with a training accuracy of 99.35%; however, when tested on images from outside the training conditions, the accuracy degraded to 31.4%. In that study, three train–test split distributions of 75/25, 60/40, and 70/30 percent were used with three image types: RGB color images, grayscale images, and segmented images. In the third work [11], automated pattern recognition using CNNs was applied to detect three types of plants and their diseases based on simple leaf images, using 5 basic pre-trained CNN models. The study used 70,300 images for training and another 17,458 images for testing, with a standard size of 256 × 256 pixels. The models fine-tuned [10, 12, 13] in these studies were AlexNet, AlexNetOWTBn, GoogLeNet, OverFeat, and VGG, with the highest accuracy achieved by the VGG model: 100% on the training set and 99.48% on the testing set.
Table 2 shows the existing research on wheat rust detection with significantly improved accuracy, as well as the limitations of these works compared with the current scenario. The four major techniques listed are taken for comparison from [9–12].
In this work, the model has been built from scratch and is named 'MosNet' after the author, without using any transfer learning method; the model classifies images into two categories, infected and not infected.
• The third layer contains a convolution layer with 32 feature maps of size 3 × 3, a rectifier activation function (ReLU), and a pooling layer with a size of 2 × 2.
• The next layer contains a convolutional layer with 64 feature maps of size 3 × 3 and a rectifier activation function, followed by a max pooling layer with a pool size of 2 × 2.
• In the next layer, the 2D matrix data is converted into a flattened vector, which allows the output to be processed through fully connected layers and activation functions.
• The next layer is a regularization layer that uses dropout; to reduce overfitting, it is configured to randomly exclude 30% of the neurons.
• The next layer is a fully connected layer with 64 neurons and a rectifier activation function.
• The output layer has 2 neurons for the 2 classes and a sigmoid activation function to output probability-like predictions for each class. The model is trained using binary cross-entropy as the loss function and the Adam gradient descent algorithm with different learning rates (Fig. 1); a minimal sketch of this architecture follows the list.
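A minimal sketch of a MosNet-style classifier, assuming the Keras API, is given below; the first two layers are not listed in the excerpt, so an initial 32-filter convolution and pooling block is assumed, while the remaining values follow the description above.

```python
# MosNet-style binary classifier sketch (first block assumed, rest as described).
from tensorflow.keras import layers, models, optimizers

def build_mosnet(input_shape=(150, 150, 3), learning_rate=1e-4):
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),  # assumed first block
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(32, (3, 3), activation="relu"),   # "third layer" in the text
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),   # next convolutional block
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),                               # 2D feature maps -> flat vector
        layers.Dropout(0.3),                            # regularization: drop 30% of neurons
        layers.Dense(64, activation="relu"),
        layers.Dense(2, activation="sigmoid"),          # 2 output neurons, as described
    ])
    model.compile(optimizer=optimizers.Adam(learning_rate=learning_rate),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model
```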
3.2 Algorithm
Step 1 Image collection from different parts of Ethiopia, especially parts that
are known by their vast product of wheat crop using digital camera and
smartphones, is the first step.
Step 2 The second step is increasing the number of images using ‘data augmen-
tation’ technique because there are no means and culture of image data
stored in Ethiopia since the country is still not aware of new approaches of
machine learning. In this step, 192 original images had been taken from real
cultivation field to 2113 total images used to train our model using ten (10)
augmentation features. Here below the ten features have been listed used to
augment the original images.
• Rotation, height shift, width shift, rescaling, shear, zoom, horizontal flip, fill mode,
data formant, and brightness
Step 3 The third step is to resize the images into a common standard format so that
our model can have a uniform image reading system and it makes it easy to
get images with different sizes and resizes them into 150 × 150 height and
width.
Step 4 The final step is to segment the images using the RGB color values of the
infected images and feed them to the model. Each of the three types of wheat
rust disease has its own color, and every color has its own unique RGB value
representation.
Note: R = red, G = green, B = blue.
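One simple way to realize this step is color-range masking, as in the OpenCV sketch below. The RGB ranges shown are hypothetical placeholders; the paper's actual per-disease RGB values are not listed in the text.

# Sketch of RGB-value segmentation: keep only pixels whose color falls inside a
# rust-specific RGB range and black out everything else. The ranges below are
# hypothetical placeholders, not the paper's actual values.
import cv2
import numpy as np

RUST_RGB_RANGES = {
    'yellow_rust': ((180, 150, 0), (255, 220, 80)),
    'stem_rust':   ((120, 40, 20), (200, 90, 60)),
    'leaf_rust':   ((150, 80, 20), (220, 140, 70)),
}

def segment_by_rgb(image_bgr):
    """Return the image with all non-rust pixels set to black."""
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    combined_mask = np.zeros(image_rgb.shape[:2], dtype=np.uint8)
    for lower, upper in RUST_RGB_RANGES.values():
        mask = cv2.inRange(image_rgb, np.array(lower), np.array(upper))
        combined_mask = cv2.bitwise_or(combined_mask, mask)
    # A healthy leaf contains no pixels in these ranges, so it comes out as an
    # almost solid dark image, as described in the text.
    return cv2.bitwise_and(image_rgb, image_rgb, mask=combined_mask)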
Fig. 1 MosNet architecture: input image → three blocks of Convolution2D + ReLU activation + MaxPooling2D → Flatten → Dense (64) → Dense output layer with sigmoid activation
Segmenting the images using these color values gave us the perfect classification
by creating an identified zone only on the infected areas of the crop, while the healthy
image (f) comes out as a solid dark image, so it can easily be distinguished from the
infected crops.
As shown in Fig. 2, there are four different crop images: a healthy wheat image (a),
a wheat crop infected with yellow rust (b), a wheat crop infected with stem rust (c),
and a wheat crop infected with leaf rust (d). This segmentation has made the model
classify far better than with unsegmented images.
4 Experimental Results
All the experiments are performed using the Python programming language on the
Anaconda distribution platform with the JupyterLab editor. The MosNet model has been
evaluated on three different datasets (a grayscale image dataset, an RGB image dataset,
and an RGB value segmented image dataset) using different parameters that can affect
the efficiency of any model. These parameters are the learning rate, dropout ratio, and
train–test split ratio. The parameters have been used in different combinations, because
each parameter has its own effect every time its value is changed. The parameter values
selected as the most efficient, and used to show the impact of each parameter, are given below.
Epochs Epochs are the number of training iterations the model runs over the given
dataset. Three different epoch counts have been used: 100, 200, and 300 training
iterations.
Learning Rate Adam (0.001), Adam (0.0001), and Adam (0.00001) have been used.
Test Ratio The test ratio is the portion of the data used to test the accuracy of the
model after training is finished; the test data has not been seen by the model before.
Test ratios of 25, 20, and 30% of the total dataset have been used.
Dropout Dropout is explained in Sect. 2.5, and two dropout rates have been used,
50 and 30%. Each parameter has its own impact on the model depending on the
values used in combination with the other parameters, and the effect they bring is
discussed below.
In this study, more than two hundred experiments have been conducted by varying
the combinations of the parameters explained in this section. MosNet is evaluated on
three kinds of image datasets: grayscale, RGB, and RGB value segmented image
datasets.
Grayscale images have only one color channel and therefore cannot hold enough
information to be extracted from the image data. Evaluating the same model on the
same dataset in Table 3 while changing the parameters that affect its efficiency, the
effect of the learning rate is discussed for results 1 and 2 in the table. The only
parameter changed between the two results is the learning rate, with 0.001 used for
the first result and 0.00001 for the second, which gives accuracies of 85.89% and
81.63%, respectively. In other words, decreasing the learning rate from 0.001 in
result 1 to 0.00001 in result 2 lowers the accuracy.
Table 4 shows the improved accuracy results of the proposed MosNet model.
This suggests that as the learning rate decreases, the accuracy of the model decreases
proportionally; decreasing the learning rate slows down the learning ability of the
model. With the same number of epochs, dropout rate, and test ratio, the result shows
the model takes too much time to learn features from the data. As understood from
the table, the model starts to degrade at the 300th epoch. For grayscale images there
is nothing more to extract and learn, because with only one channel the model loses
more features from the data, which degrades its efficiency after the 200th epoch.
This result motivates the use of RGB images, which have three channels and can
contain more than 16 million colors. This helps the model to extract and learn from
the color nature of the images and prevents losing information that can be extracted
from the colors, since in this study wheat rusts are identified by their color against
healthy wheat images. The graphical results of each evaluation can be seen in Figs. 3
and 4.
As shown in Figs. 5 and 6, there is a big difference between testing our model on the
training data and on the test data. The training data has already been learned by the
model during training, so it does not take the model long to generalize all of its
samples. As understood from the confusion matrix, loss graph, and accuracy graph,
this is not a good enough detection model to work properly in real-time conditions,
as it produces many errors even though it detected more than 80% of the testing data;
hence the need to improve the quality of the data and extract as much information as
possible. Therefore, the next step is to train our model using RGB images without
segmentation. This gives much higher accuracy than the model trained with the
grayscale image dataset, and the results of the two datasets can be compared with the
same parameters.
Comparing the grayscale dataset result in row 5 of Table 3 with the RGB dataset
result in row 4 of Table 3, there is a big difference between them.
The two diagrams above are the results of evaluating our model on two different
datasets: the grayscale dataset shown in Fig. 7 and the RGB dataset shown in Fig. 8.
As the accuracy graphs of both evaluations show, the one with the grayscale dataset
has the lowest value on the validation (test) data, scoring only 79.08% accuracy with
a 20.92% error, which makes this model unsuitable for real-time applications. It also
shows that as the number of epochs increases toward the maximum with the grayscale
image dataset, the accuracy of the model starts to degrade, because when the model
extracts grayscale images repeatedly it generalizes the samples into false classes and
starts to overfit. The next graph (Fig. 8) is the evaluation result of the RGB image
dataset with the same parameters used in the previous one; the result is far better than
with the grayscale image dataset, scoring 99.27% accuracy with a 0.73% error, which
is a very good result, and beyond this comparison the model still reaches an accuracy
of 99.51%. However, a further extraction technique was still needed, because the
model classified some wheat images with rain droplets, soil, or fire-burnt leaves as
infected crops, which is an error.
Fig. 9 Classification report of the RGB value segmented dataset with the highest accuracy
Therefore, another way was needed to fix this problem: segmenting the images using
their unique RGB values. This brought a big improvement, with an accuracy of
99.76% (Table 3, row 1 of the RGB segmented dataset part), and also fixed the
problems encountered in the previous evaluations on the grayscale and RGB image
datasets. This result also gave precision and recall values of 1, which indicates a
perfect evaluation of the model (Fig. 9).
As shown in Fig. 10, the result achieved by segmenting the images using their RGB
values gives an excellent validation accuracy of 99.76% and the lowest error rate of
0.24%, which is a great achievement in this study. The loss of the model starts from
around 0.6 and tends gradually toward almost zero, while the accuracy starts from
around 88% and ends at 99.76% at the 300th epoch.
This study discussed different CNN configurations by applying different important
factors that can affect the model designed in the study. Three dataset types are used
in the study to conduct the experiments for the MosNet model.
Fig. 10 Loss and accuracy graph of RGB segmented dataset with the best result
The dataset types are: a grayscale image dataset, which contains only one-channel
images; an RGB dataset, with 3-channel images; and an RGB color segmented
dataset, which is RGB segmented with the disease color code. The MosNet model
has achieved an accuracy of 86.62% with 200 epochs, 0.001 learning rate, and a
50% dropout rate. This result is improved when the model is trained on the RGB
image dataset, which climbed to an accuracy of 99.51%. Finally, after segmenting the
images using the color of the infected images, the model extracted better information
than the previous model and achieved an accuracy of 99.76% with 300 training
epochs, the learning rate of 0.001, and the dropout rate of 30%.
This study delivered a CNN model that can effectively monitor wheat crop health,
which is quite helpful for protecting a wheat farm early, before the disease spreads
out and causes total damage to the crop. This is valuable for early prevention of total
crop loss, but it is not enough for statistical purposes: the detection should also
identify which type of disease occurred and to what extent it occurred on the farm.
Collecting a sufficient and well-defined dataset from different agricultural lands would
help this study progress toward different varieties of crops and their disease types and
apply CNN models to the real world in Ethiopia within a short period of time, since
the lack of enough data currently limits the study from progressing beyond the results
found so far.
References
1. FAO (2016) Ethiopia battles wheat rust disease outbreak in critical wheat-growing regions.
https://www.fao.org/emergencies/fao-in-action/stories/stories-detail/en/c/451063/. Accessed
20 Mar 2019
2. Seyoum Taffesse A, Dorosh P, Gemessa SA (2013) Crop production in Ethiopia: regional
patterns and trends. In: Food and agriculture in Ethiopia: progress and policy challenges, vol.
9780812208, pp 53–83
3. USDA. Ethiopia Wheat Production. https://ipad.fas.usda.gov/rssiws/al/crop_production_
maps/eafrica/ET_wheat.gif. Accessed 19 Jan 2019
1 Introduction
the predictive model of the proposed DST. To test the utility of the proposed DST,
feedback was obtained from five RPA SMEs of a global IT consulting services
company, who applied the tool to three different projects of a multinational financial
services company's RPA transformation. Of the three projects considered, one was a
failure, and the proposed DST predicted all three outcomes correctly.
The rest of the paper is organized as follows. Section 2 presents the literature
review. Section 3 describes the research methodology. The proposed DST for
analyzing business processes and identifying candidate business processes for RPA
is presented in Sect. 4. Section 5 presents the survey results and the empirical study
conducted to validate the utility of the proposed DST, and Sect. 6 concludes the paper.
2 Literature Review
RPA is a novel technology in BPA [1], which has the potential to automate business
processes quickly and efficiently without altering existing infrastructure and systems.
RPA software robots (a.k.a. virtual bots or bots) work 24 × 7 and are programmed
to follow a specific set of rules to complete tasks with higher accuracy and speed
than is achievable by a typical human worker [12]. A bot can integrate with almost
any software without accessing third-party tools or APIs [1] and behaves like a proxy
human worker to operate a business application [7]. It relieves workers from mundane
tasks and enables them to focus on work requiring more human intervention,
cognitive skills, and creativity [11]. RPA can replace human work that meets criteria
such as high-volume transactions, the need to access multiple systems, a stable
environment, low cognitive requirements, easy decomposition into unambiguous
rules, proneness to human error, limited need for exception handling, and a clear
understanding of the current manual costs [12]. Artificial intelligence (AI) is applied
to non-repetitive tasks, whereas RPA is typically applied to repetitive tasks. Together,
RPA and AI help to generate compelling applications and decision making in
business processes [5].
As an emerging technology, RPA has its own set of challenges. Based on the
experience of implementing RPA in 20 companies, Lamberton [9] identified that
30–50% of the initial projects failed. It has been identified that applying traditional
software development methodologies and transforming the wrong business process
are critical contributors to such failures [8–10]. While both factors require
considerable attention from academics and practitioners, this work focuses only on the
problem of selecting the right business process for RPA transformation. Failing to
select a suitable process early in the business analysis leads to user and customer
frustration, in addition to direct losses such as time and money. In [13], Lacity and
Willcocks presented a widely used set of indicators, such as rule-driven, repetitive in
nature, data-intensive, high compliance, and validations, to select suitable candidate
business processes for RPA. Further, in [14], a set of recommendations to assess
and analyze each business process against human factors, complexity, and stability
is presented. In [7, 15, 16], it has been identified that selecting candidate business
processes with a high volume of transactions, process maturity, and business rules
leads to better success. Moreover, in [16], it is identified that business processes with
a high workload and low complexity are better candidates for RPA bot implementation.
While these related works give useful insights into a few factors to consider in
determining the RPA suitability of a candidate business process, no systematic
mechanism/process exists to do so. Therefore, it is imperative to identify a suitable
process for determining candidacy early in the business analysis.
3 Research Methodology
Table 1 (continued)

Factor type | Factor name | Description | Measurement criteria
– | Stability of environment (SOE) | Availability of UAT or any other near-production environment of the target application for automation | Yes, No
Financial impact factor | Operational cost (OC) | Cost of day-to-day maintenance and administration | High, medium, or low
– | Impact of failure (IOF) | Financial impact to the client if the bot goes wrong | High, medium, or low
– | Human error (HE) | Task is error-prone due to human worker negligence rather than inability to code as a computer program | High, medium, or low
– | End of life (EOL) | How soon the bot will become obsolete, e.g., a new system is on the road map of the client | ≤2 years, ≤4 years, ≥5 years
– | Service-level agreement (SLA) | Time taken to complete end-to-end functionalities of a task | Seconds, minutes, hours, days
business process are finalized . Five factors were combined with other factors, as
they were determined to be alternative terminologies/definitions of the same set of
factors. Four factors were removed because they were hard to measure, e.g., the
client’s expected outcome of the process delivery. A requirement of irregular labor
was identified as a new factor from the interview findings that were not in the initial
list of 25.
Based on the interviews, it was further realized that not all 16 factors are of
equal significance. However, attempting to rank the relative importance of the
factors with feedback from the RPA SMEs was not straightforward. Hence, it was
decided to conduct an industry survey to measure the 16 factors of RPA projects/bots
developed by the surveyed SMEs and their outcomes, and a questionnaire was
consequently developed based on the 16 factors. It was first shared with ten RPA SMEs
of the global IT consulting services company as a pilot survey. The profiles of the SMEs
included project managers, architects, developers, and testers. In the questionnaire, the
survey participants were also asked to prioritize the factors. However, while analyzing
the pilot survey results, it became clear that the prioritization of factors depended on the
role the survey participants played within the RPA project. Hence, the factor prioritization
option was removed and the questionnaire was updated based on the feedback received.
The online questionnaire was then shared with industry experts across the world.
These experts were identified using LinkedIn, public forums, webinars, etc., and
56 responses were collected. Three of them were discarded after verifying their
validity, because those respondents commented that they could not assess the process
complexity. Even after attempting to collect data for more than three months, it was
difficult to gather many responses, as it was hard to find professionals who had
completed at least one RPA project. Moreover, some declined to participate, citing
confidentiality agreements with their clients or conflicts of interest. Among the 53
responses, only 22 survey participants were involved in a bot implementation that
was successful, and another eight were involved in projects with failed bots. The other
23 participants had ongoing projects at different phases of the project lifecycle.
As the dataset collected from the online survey contained both categorical and
ordinal data, the Spearman correlation was calculated for the 30 (22 + 8) completed
responses. Factors such as workload variance (WV), number of systems to access
(NOSA), and service-level agreement (SLA) had a positive correlation with the
RPA outcome of the business process (represented as the Status of the Candidate
Business Process (SCBP) in Table 1), whereas the volume of transactions (VOT),
regulatory compliance (RC), and cognitive features (CF) had a negative correlation
with SCBP. However, because of the limited number of responses, and because the
SME feedback during the interviews indicated that the other ten factors are still useful
in capturing the characteristics of a business process, it was decided to develop the
prediction model using all 16 factors. This was also motivated by the fact that the
dataset was small. To derive the RPA suitability decision, a two-class decision forest
classification model is used. The overall accuracy was verified using fourfold
cross-validation, and the prediction model was evaluated to have an overall accuracy
of 90%. The choice of Spearman correlation, the two-class decision forest
classification model, and the resulting data analysis are presented in Sect. 4.
Finally, the predictive model was further validated with the help of five RPA
SMEs who applied it to three different projects of a global IT consulting services
company that develops RPA bots. The three projects were chosen from three business
processes of a consumer division of a multinational financial services company. At
the time of evaluation, the three projects had already been implemented at the customer
site and had been in operation long enough to determine their success or failure in
achieving the business outcomes. The proposed DST determined two of the projects
as suitable and the other as unsuitable for RPA transformation. Indeed, the project
that was predicted to be unsuitable for RPA transformation had failed due to the
wrong selection of business process.
4 Predictive Model
While the correlation analysis indicated that all 16 factors are relevant in determining
the RPA suitability of a candidate business process, it is difficult to determine the
56 responses were collected from the online survey shared across the industry. Three
responses were discarded during data cleaning to correct data inconsistencies and
remove noise. Two of those respondents had commented that they could not assess
the process complexity of the selected business process.
Spearman's correlation is used to identify the significant factors from the survey
responses. Pearson's correlation is appropriate when the data contains intervals or
ratios, whereas some of the factors here are binary or categorical. Because the dataset
is categorical and ordinal, Spearman's correlation was used to measure the strength
of the monotonic relationship between variables. Table 2 lists the resulting correlation
values. SCBP is the dependent variable and the 16 factors are independent variables.
The low Spearman correlations among the factors indicate that there is no strong
inter-relationship among them; therefore, the chosen 16 factors capture different
properties of the candidate business process.
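A minimal sketch of this correlation step is given below, assuming the survey answers have been exported to a CSV file with one column per factor plus SCBP; the file name, column names and ordinal encoding map are illustrative assumptions.

# Sketch of the correlation step: encode the ordinal survey answers as numbers
# and compute Spearman's correlation of every factor with the outcome (SCBP).
# The file name, column names and encoding map are illustrative assumptions.
import pandas as pd

ORDINAL_MAP = {'Low': 0, 'Medium': 1, 'High': 2, 'No': 0, 'Yes': 1}

responses = pd.read_csv('rpa_survey_responses.csv')            # hypothetical file
encoded = responses.replace(ORDINAL_MAP).apply(pd.to_numeric, errors='coerce')

# Spearman works on ranks, so it suits ordinal/categorical data;
# Pearson would assume interval or ratio scales.
corr = encoded.corr(method='spearman')
print(corr['SCBP'].sort_values())    # correlation of each factor with SCBP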
Table 2 Spearman’s correlation among 16 factors and the status of the business process
SCBP VOT BPC ROC RBBP WLV RC CF NOSA MMI DD OC IOF HE EOL SOE SLA
SCBP 1.00 −0.62 −0.16 0.00 −0.01 0.41 −0.32 −0.37 −0.31 −0.19 −0.06 −0.18 −0.02 −0.15 −0.02 0.05 0.40
VOT −0.62 1.00 0.16 0.06 0.14 −0.53 0.06 0.05 −0.19 0.00 0.00 0.05 0.10 0.19 0.14 0.00 −0.29
BPC −0.16 0.16 1.00 0.19 0.16 −0.07 −0.25 −0.04 0.13 −0.23 −0.53 0.35 0.32 0.12 −0.37 0.16 0.14
ROC 0.00 0.06 0.19 1.00 −0.19 0.22 −0.27 0.13 0.03 −0.27 −0.34 0.09 0.30 0.00 0.25 0.24 0.25
RBBP −0.01 0.14 0.16 −0.19 1.00 −0.08 −0.12 −0.15 0.06 0.15 −0.07 0.30 0.19 0.00 0.01 0.47 −0.11
WV 0.41 −0.53 −0.71 0.22 −0.77 1.00 −0.23 −0.30 0.11 0.07 −0.14 −0.29 0.22 0.00 −0.07 −0.16 0.38
RC −0.32 0.06 −0.25 −0.27 −0.12 −0.23 1.00 0.24 −0.15 0.11 0.33 −0.43 −0.19 −0.16 0.25 −0.25 0.05
CF −0.39 0.05 −0.04 −0.13 −0.15 −0.30 0.24 1.00 −0.07 0.61 0.21 −0.29 −0.17 0.00 −0.19 0.11 −0.42
NOSA −0.31 −0.19 0.13 0.03 0.06 0.11 −0.15 −0.07 1.00 −0.22 −0.35 0.24 0.07 0.26 −0.20 0.12 −0.18
MMI −0.19 0.00 −0.23 −0.27 0.15 0.07 0.11 0.61 −0.22 1.00 0.28 −0.19 0.02 0.00 −0.19 0.11 −0.24
DD −0.06 0.00 −0.53 −0.34 −0.07 −0.14 0.33 0.21 −0.35 0.28 1.00 −0.38 −0.52 0.00 0.17 −0.15 −0.08
OC −0.18 0.05 0.35 0.09 0.30 −0.29 −0.43 −0.29 0.24 −0.19 −0.39 1.00 0.51 0.21 0.01 0.14 −0.10
IOF −0.02 0.10 0.32 0.30 0.19 0.22 −0.19 −0.17 0.07 0.02 −0.52 0.51 1.00 0.00 −0.02 0.01 −0.04
HE −0.14 0.19 0.12 0.00 0.00 0.00 −0.16 0.00 0.26 0.00 0.00 0.21 0.00 1.00 0.37 −0.19 −0.02
EOL −0.02 0.14 −0.37 0.25 0.01 −0.07 0.25 −0.19 −0.20 −0.19 0.17 0.00 −0.02 0.37 1.00 0.02 −0.02
SOE 0.05 0.00 0.16 0.24 0.47 −0.16 −0.25 0.11 0.12 0.11 −0.15 0.14 0.01 −0.19 0.02 1.00 −0.24
SLA 0.40 −0.29 0.14 0.25 −0.11 0.38 0.05 −0.42 −0.18 −0.24 −0.08 −0.09 −0.04 −0.02 −0.03 −0.24 1.00
Next, the accuracy of the trained two-class classification model, which is based on
the decision forests algorithm, is evaluated. Accuracy, precision, recall, and F score
are some of the commonly used metrics for evaluating a classification model.
Accuracy measures the quality of the classification model as the proportion of true
results to all the cases within the model. Precision describes the proportion of true
results over all positive results. Recall is the fraction of all correct results returned
by the model. The F score is calculated as the weighted average of precision and
recall, between 0 and 1 (the higher the better). Table 3 presents the results based on
fourfold cross-validation. It can be seen that the overall accuracy of the model is 90%
and that the model has good precision and recall. A high F1 score further indicates
good accuracy of the test.
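The paper trains the two-class decision forest in Azure Machine Learning; the sketch below uses scikit-learn's RandomForestClassifier as a stand-in, evaluated with the same fourfold cross-validation and metrics. X is assumed to hold the 16 encoded factors and y the SCBP outcome.

# Illustrative stand-in for the two-class decision forest evaluation, using
# scikit-learn's RandomForestClassifier with fourfold cross-validation.
# X is a table of the 16 encoded factors, y the SCBP outcome (1 = success).
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

def evaluate(X, y):
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    scores = cross_validate(clf, X, y, cv=4,
                            scoring=['accuracy', 'precision', 'recall', 'f1'])
    return {metric: scores['test_' + metric].mean()
            for metric in ('accuracy', 'precision', 'recall', 'f1')}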
The proposed DST primarily comprises the trained two-class classification model
and the values of the 16 factors fed into it. The prediction model is published as a Web
service on the Microsoft Azure platform so that the SMEs could access it to validate the
model (see Fig. 3). Finally, the proposed DST is verified by applying it to evaluate the
RPA transformation of three business processes of a multinational financial services
company.
The proposed DST was validated with the help of five RPA SMEs who applied it to
three different projects of the global IT consulting services company. Figure 4 shows
the output of the DST for the failed RPA transformation project, which presents the
result as 0 for No (1 for Yes) along with the status of the list of factors; the second
sentence explains the output. The DST predicted the other two projects as suitable
for RPA transformation, and they were indeed successful in actual operation at the
client site.
<- 'DSS in RPA 3 [Predictive Exp.]' test returned ["0", "Medium", "High", "X<=2", "With few exceptions", "No", "Yes", "Yes", "X<=2", "No", "Yes", "Medium", "High", "Medium", "2<X<5=5", "Yes", "Days", "-…
As per the interview and data analysis, it is identified that RPA projects fail when
the selected business processes have the following characteristics:
1. The complexity of the business process is high.
2. Workload tends to experience high variation.
3. Regulatory compliance is needed.
4. The business process needs to access more than three to five different systems.
5. Multi-model inputs need to be handled.
6. The operational cost is high.
7. The system environment is not stable.
All the failed RPA projects had these seven characteristics. Therefore, to be suit-
able for RPA transformation, a business process should have no more than two or
three of these characteristics.
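The rule of thumb above can be restated as a small helper, shown below purely as an illustration of the heuristic; it is not the trained DST model, and the characteristic names are illustrative.

# Count how many of the seven risk characteristics a candidate process shows.
# This only restates the heuristic above; the actual DST uses the trained
# two-class decision forest over all 16 factors.
RISK_CHARACTERISTICS = (
    'high_process_complexity',
    'high_workload_variation',
    'regulatory_compliance_needed',
    'accesses_more_than_three_systems',
    'multi_model_inputs',
    'high_operational_cost',
    'unstable_system_environment',
)

def rpa_risk(process_flags):
    """process_flags: dict mapping each characteristic name to True/False."""
    hits = sum(bool(process_flags.get(c)) for c in RISK_CHARACTERISTICS)
    # Failed projects showed all seven; suitable candidates showed at most
    # two or three of them.
    verdict = 'likely unsuitable' if hits > 3 else 'worth assessing with the DST'
    return hits, verdict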
The demographic analysis of the survey participants is further presented to show the
industry status and how the responses may have affected our findings and conclusions.
As seen in Fig. 5, the survey participants had different types of roles within an RPA
project, such as business analysts (30%), developers (26%), project managers (21%),
testers (14%), head of transformation (3%), technical leads (3%), and architects (3%),
and did not have any agile-specific roles such as scrum master. This could be due to
relatively less use of agile principles and values within an RPA project, which has
been identified as another reason for the failure of RPA projects.
As seen in Fig. 6, 96% of respondents had less than five years of experience
and only 4% of respondents had over 5 years' experience in RPA projects. This is
understandable given that RPA technology is new to the industry.
As per Fig. 7, it can be seen that 11% of the respondents are planning to develop
bots in the future, and 32% of bot developments are still in progress. Only 15%
of the bots failed, where it was determined that the client/business was not satisfied
with the resulting bots. The industry is moving toward bot development, and 42% of
the respondents confirmed that the business is satisfied with the bot development. These
successful projects had only one to three of the seven characteristics mentioned in
the above section.
Fig. 6 RPA experience of the survey respondents: less than 1 year (11%), 1+ years (32%), 3+ years (53%), 5+ years (4%)
Fig. 7 Status of bot development: planning to develop bots in the future (11%), bot development still in progress (32%), bots developed but business needs not satisfied (15%), bots developed and business needs satisfied (42%)
6 Summary
References
1. Asatiani A, Penttinen E (2016) Turning robotic process automation into commercial success—
Case OpusCapita. J Inf Technol Teach Cases 6:67–74
2. Auro Inc. Use cases—RPA in telecom industry. https://www.aurorpa.com/rpa-telecom-industry
3. Cosourcing Partner (2016) Exploring robotic process automation as part of process improve-
ment (2016)
4. Accenture (2016) Getting robots right—How to avoid the six most damaging mistakes in
scaling up robotic process automation. Accenture Technology Vision
5. Cline B, Henry M, Justice C (2016) Rise of the robots. KPMG
6. DeMent B, Robinson T, Harb J (2016) Robotic process automation: innovative transformation
tool for shared services. ScottMadden, Inc
7. Institute for Robotic Process Automation (2015) Introduction to robotic process automation—A
primer
8. Shared Services and Outsourcing Network (2017) The global intelligent automation market
report
9. Lamberton C (2016) Get ready for robots: why planning makes the difference between success
and disappointment. Ernst & Young
10. Sigurðardóttir GL (2018) Robotic process automation: dynamic roadmap for successful
implementation. Msc. thesis
11. Casey K (2019) Why robotic process automation (RPA) projects fail: 4 factors. https://enterp
risersproject.com/article/2019/6/rpa-robotic-process-automation-why-projects-fail/
12. Lacity M, Willcocks L (2015) Robotic process automation: the next transformation lever for
shared services. London School of Economics Outsourcing Unit Working Papers, vol 7, pp 1–35
13. NICE (2016) Selecting the right process candidates for robotic automation. NICE Systems Ltd.
14. Haliva F (2015) 3 criteria to choosing the right process to automate. https://blog.kryonsystems.
com/rpa/3-criteria-to-choosing-the-right-process-to-automate
15. Kroll C, Bujak A, Darius V, Enders W, Esser M (2016) Robotic process automation—Robots
conquer business processes in back offices. Capgemini Consulting
16. Schatsky D, Muraskin C, Iyengar K (2016) Robotic process automation: a path to the cognitive
enterprise. University Press, Deloitte
17. Bernes J (2015) Azure machine learning. Microsoft azure essentials. Microsoft Press
Feature-Wise Opinion Summarization
of Consumer Reviews Using Domain
Ontology
1 Introduction
customers examined for a product online every day. Most of the consumers believe
online reviews: positive reviews increased trust in a product for 73% of consumers,
and 49% of consumers need a higher rating before deciding on a product [1].
When making decisions from a vast collection of reviews, it is difficult to read
all the reviews from Web sites, blogs, forums and social media at once. Otherwise,
the decision-makers may acquire a biased view of the product by reading only a few
reviews, and there is a chance of missing the reviews with positive or negative
opinions. Because of that, the reviews generated by consumers can be classified into
positive, negative or neutral using the available review mining techniques applied in
the field of information extraction from text processing. With these classical
classification methods, a negatively classified review may not mean that the customer
did not like any feature of that product, and a positively classified review may not
imply that the reviewer liked every feature of that product. They do not provide an
idea of which aspects of those products people like or dislike most. Sometimes, a
feature which is not important to one person may be important to another. Also, there
may be some reviews which contain opinions without considering any feature of the
product itself. A preferable choice is to evaluate products based on their feature
ratings without considering the overall rating. So, it is necessary to provide a
feature-wise opinion summarization.
On the other hand, when considering the semantic orientation of opinion words
associated with each feature, a variety of sentiment analysis techniques use
domain-oriented semantic lexicons, which may be inaccurate for determining the
polarity of ambiguous opinions. So, this paper suggests an approach to feature-wise
opinion classification that selects the important opinion targets using a domain
ontology and determines the semantic orientation of unambiguous and ambiguous
opinion words using a combination of a semantic lexicon method and the PMI
algorithm. The final feature-wise summarization of reviews is generated by
aggregating all the feature-wise sentiments.
The structure of the paper is as follows: The second section focuses on related work
in the areas of sentiment analysis, feature-level sentiment analysis and feature
extraction methods. The third section presents the proposed methodology. The fourth
section describes the experiment details and evaluation results. The final section
concludes the paper.
2 Related Work
feedbacks and analyzing them manually is a very expensive and error-prone task. So,
computational sentiment analysis helps to recognize problems by reading rather than
by asking, thereby giving a more accurate picture of reality. By analyzing such
consumer opinions, companies can generate customer insight, improve marketing
effectiveness, increase customer satisfaction and protect brand reputation for better
market research effectiveness.
Analysis of consumer reviews can be performed in three different ways, namely
sentence-level, document-level and feature-level analysis [3]. In document-level
analysis, the whole document is considered to contain opinions on a single object,
and finally the whole document is classified as positive or negative [4].
The sentence-level sentiment analysis method calculates the polarity of each
sentence, taking the sentence as a single unit [4]. The feature-based sentiment
analysis method identifies the features of a certain object and classifies the
sentence/document based on the opinion words of those features.
Feature-level sentiment analysis first discovers the targeted objects, their compo-
nents, attributes and features of the opinionated sentence and then decides whether the
opinions are negative, positive or neutral [5]. This kind of analysis is required for
making product enhancements and for expressing which features of the product are
most liked or disliked [5]. A document that is positively or negatively opinionated as
a whole does not mean that the opinion holder has positive or negative opinions on
all the features of that particular object. Such details cannot be identified in
document-level or sentence-level
sentiment analysis. When performing feature-level sentiment analysis, it is required
to identify some information related to the reviews. Those are the synonyms and
feature indicators of a feature, target object, sentiment orientation of the opinion on
the feature, reviewer and the time of the review [5].
The sentiment analysis can be achieved by using either machine learning, lexicon-
based or hybrid approach. The supervised machine learning approach uses the Naïve
Bayes, support vector machine, K-nearest neighbors and maximum entropy algo-
rithms for classification [4]. The sentiment orientation (SO) of opinion words is
determined in the unsupervised approach [4]. Also, lexicon-based method [6] and
unsupervised dictionary-based technique [4] have been used to perform sentiment
classification.
Esuli and Sebastiani [7] proposed the LSA method for feature extraction and the
approach of MapReduce and Hadoop together with SVM to improve the accuracy
and efficiency of sentiment classification of movie reviews. Liu et al. [8] proposed
a sentiment mining approach using a Naïve Bayes classifier combined with the
Hadoop framework. A mixture of lexicon-based and machine learning methods is
used by Zhang et al. [9] in their research. A lexicon-based method was applied to
Twitter data to perform sentiment analysis. Then, a chi-square test was applied on the
output, and additional tweets with opinions were recognized. Then, a binary sentiment
classifier was trained to assign sentiment polarities to the newly recognized
opinionated tweets. The classifier used the training data provided by the previous
lexicon-based method. Nandhini et al. [10] discussed a feature-based sentiment
analysis method that performs feature extraction using a feature dictionary and
opinion extraction using an opinion dictionary, and combines their effects.
Many types of research use ensemble classifiers to increase robustness, accuracy
and overall generalization [11]. Apart from using lexicon-based or machine learning
techniques, there was a hybrid approach combining a rule-based classifier, a
lexicon-based classifier and SVM as the machine learning classifier for Twitter
messages [12], and a scalable and useful lexicon-based approach for mining
sentiments using emoticons and hashtags with Hadoop [13]. Govindarajan [11]
proposed an ensemble classification with Naïve Bayes and a genetic algorithm for
movie review data. Shinde-Pawar [14] proposed a sentiment analysis technique
applying an artificial neural network (ANN) and fuzzy logic.
When expressing an opinion on products or services, customers mainly focus on
different features of the particular object. Any object can have different features
which the customers may like or dislike, which in turn may help the manufacturers to
improve the quality of the product by focusing on the features that need further
attention. Also, customers who wish to buy a particular object can analyze the
feature-wise sentiment and get to know the most significant features. Feature-wise
sentiment analysis considers the overall sentiment of the review as well as the
sentiment on the particular features of the products [15].
The basic steps of feature-level sentiment analysis are data preprocessing, extraction
of features and opinions, determining the polarity of opinion words, identifying the
polarity of opinionated sentences and generating the summary [4]. The data
preprocessing step uses techniques such as part-of-speech (POS) tagging,
lemmatization, stemming and removal of stop words, which are useful for noise
removal and enable feature extraction in a dataset. Feature mining identifies the
product features being opinionated by customers. Opinion word extraction identifies
the text which contains sentiment or opinion. Opinion word polarity identification
decides the sentiment polarity of the opinion word as positive, negative or neutral;
finally, the opinion sentence polarity identification step aggregates the polarity
sentence-wise, and summary generation aggregates the results obtained from each
sentence.
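A minimal sketch of the preprocessing step is given below using NLTK; the paper does not name a specific toolkit, so the library choice is an assumption.

# Sketch of the preprocessing step: tokenize, POS-tag, remove stop words and
# lemmatize. NLTK is an illustrative choice; the paper does not name a toolkit.
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

for resource in ('punkt', 'stopwords', 'averaged_perceptron_tagger', 'wordnet'):
    nltk.download(resource, quiet=True)

STOP_WORDS = set(stopwords.words('english'))
lemmatizer = WordNetLemmatizer()

def preprocess(sentence):
    tokens = nltk.word_tokenize(sentence.lower())
    tagged = nltk.pos_tag(tokens)            # POS tags are reused by feature extraction
    return [(lemmatizer.lemmatize(word), tag) for word, tag in tagged
            if word.isalpha() and word not in STOP_WORDS]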
The features describing products can be categorized into morphological-type
features, frequent features, and explicit or implicit features [2]. Morphological
features can be classified as semantic, syntactic and lexical structural [2]. Semantic
features are types of contextual information and semantic orientation [2]. Syntactic
features use chunk labels, POS tagging, dependency depth features and N-gram
words [2]. Lexical structural features consist of special symbol occurrences, word
distributions and word-level lexical features, which are infrequently used in opinion
mining [2]. Common (frequent) features are the features in which customers have the
most interest; Apriori association rule mining or frequent pattern mining is broadly
used to identify frequent features. Most features appear explicitly within reviews
[16], e.g., in 'picture quality is very good' the picture is an explicit feature which can
be taken from the review. But some features cannot be directly derived from the
review itself, e.g., 'it fits in the pocket nicely' derives the feature size implicitly.
The feature extraction techniques used in sentiment analysis play an important role
in identifying relevant attributes and opinions and may lead to increased
classification accuracy.
Much research has focused on how to extract features from reviews effectively and
efficiently. Feature extraction methods can be categorized as NLP- or heuristic-based,
frequency-based, statistical, syntax-based, clustering, supervised, unsupervised
machine learning methods and hybrid approaches [2, 17]. The frequency-based
feature detection method only considers nouns or noun phrases as possible features.
The syntax-related methods discover features by considering syntactic relations.
NLP-based methods usually identify the nouns, adjectives, noun phrases and adverbs
which express product features; they achieve high accuracy but low recall by using
POS tagging [2]. The key weakness of clustering-based feature extraction is that it
can only extract main features and has difficulty extracting less important features [2].
Statistical techniques are computationally efficient, but they ignore feature
interactions. There are not many supervised learning methods for feature extraction,
and the power of supervised approaches depends on the features, which are often
constructed using other methods [17].
The OPINE system used an unsupervised information extraction approach to
identify product features [18]. Also, an unsupervised feature extraction method using
a taxonomy was developed with user-defined knowledge [19]. A domain ontology
model to extract features from tabular data on the Web was introduced by Holzinger
et al. [20]. The structural correspondence learning (SCL) algorithm [21] and a
pattern-based feature extraction which adapted the log-likelihood ratio test (LRT)
[22] were also proposed. In addition, [16] proposed a technique for product feature
extraction using association rule mining, based on the assumption that people
frequently use similar kinds of words. A semi-supervised feature grouping technique
was used where the features are grouped based on synonyms, words or phrases [23].
The paper of [24] used different feature extraction or selection techniques, namely
single word, document-level, multiword, Tf-Idf single word, phrase-level and Tf-Idf
multiword
sentiment analysis. Also, some approaches like dependency parsing [25] and joint
sentiment topic model using LDA [26] have been used to extract features of opinion.
Hybrid techniques use combinational approaches [2]: POS tagging with the WordNet
dictionary, the combination of lexical and syntactic features with a maximum entropy
model, and a combination of association rules and point-wise mutual information. A
hybrid approach was also used to identify product features using a bootstrapping
iterative learning strategy with additional linguistic rules for mining less frequently
occurring features and opinion words.
On the other hand, much of the research has used domain ontologies [27–33].
The ontology-based approach does not require a training dataset for feature extraction.
In this approach, an ontology is used to represent knowledge about a domain and
allows the details of the product being rated in an opinion to be shown.
The polarity of an opinion word within a sentence may change according to the
domain context, and there are no effective methods that can exactly define the written
pattern of sentences [29]. To determine which words are semantically associated,
authors have measured the co-occurrence of new words with words from a known
seed set of semantically oriented words [30]. However, only word pairs with equal
polarity can be determined through their high co-occurrence rates [30].
However, in some of the literature, the associated opinion polarity is determined
using publicly available opinion lexicons such as SentiWordNet, General Inquirer
and SenticNet. There was also a method for defining the polarity of sentiment words
only from the textual context, by converting the textual context of sentences into
semantic pattern vectors [29]. Those semantic pattern vectors were used to compare
the similarity of two sentences while exploring the polarity of the sentiment words
included within the sentences.
The sentiment dictionary named ‘SentiMI’ extracted sentiment terms with POS
information from SentiWordNet. Then, the mutual information for both positive and
negative terms was calculated. The final class label is determined by the related
positive and negative scores [32].
Some opinion words have the same orientation in any context, e.g., 'amazing,'
'excellent,' 'bad,' etc. [33]. But some words are domain-dependent, and it is very
difficult to find the actual polarity of ambiguous words such as 'unpredictable,'
'high,' 'good' and 'long' [33]. For example, in 'The phone has long battery life' and
'This program takes a long time to run,' the opinion word 'long' has positive and
negative polarity in the first and second sentences, respectively [33]. These kinds of
opinion words change their polarity according to the context. Therefore, it is
necessary to acquire the polarity of such words by comparing them with the
contextual information. Considering only opinion words is not sufficient, and it is
wise to also consider the associated feature. Because of that, some literature discusses
methods using point-wise mutual information (PMI) for determining the polarity
of opinion words [34]. This method only considers opinion words for classification.
There is a statistical method named PMI-IR for studying the polarity of words by
calculating the statistical dependence between two words that are often used together
in text, according to the following equation [31] (Eq. 1).
PMI(x, y) = log [ p(x, y) / (p(x) p(y)) ]   (1)
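Equation (1) can be estimated directly from co-occurrence counts, as in the sketch below, where a corpus of tokenized sentences stands in for the documents or search hits used by PMI-IR; the seed-word orientation function follows the PMI-IR idea of comparing association with positive and negative seed words.

# Sketch of Eq. (1): PMI estimated from co-occurrence counts in a corpus of
# tokenized sentences. PMI-IR uses search-engine hit counts in the same way.
import math

def pmi(word_x, word_y, sentences):
    n = len(sentences)
    hits_x = sum(1 for s in sentences if word_x in s)
    hits_y = sum(1 for s in sentences if word_y in s)
    hits_xy = sum(1 for s in sentences if word_x in s and word_y in s)
    if n == 0 or hits_x == 0 or hits_y == 0 or hits_xy == 0:
        return 0.0                      # no evidence of association
    # p(x, y) / (p(x) * p(y)) with probabilities estimated by relative counts
    return math.log2((hits_xy / n) / ((hits_x / n) * (hits_y / n)))

def semantic_orientation(word, positive_seeds, negative_seeds, sentences):
    """PMI-IR style orientation: association with positive seeds minus
    association with negative seeds."""
    return (sum(pmi(word, p, sentences) for p in positive_seeds)
            - sum(pmi(word, q, sentences) for q in negative_seeds))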
3 Proposed Methodology
The proposed framework consists of five main processes. The first process is to
collect the reviews, preprocess them and generate dependencies. The second process
is to create the ontology and update it based on the new features, with their sentiment
expectation computed using the PMI algorithm, and the opinions coming from the
third process. The third process is to determine the important features and extract the
opinion words associated with those features. The fourth process is to determine the
sentiment of the opinions using semantic lexicons. The fifth process is to visualize a
feature-based summary of the reviews. The overall architecture is shown in Fig. 1.
improves the way of generating the feature-based summary. Knowledge of the
relevant domain is required to construct the ontology of a specific domain.
The objective of the proposed framework is to provide feature-based sentiment
summarization for a specific domain. The framework is applicable to any domain by
replacing the ontology. Ontologies can be constructed by using existing ontologies
in the specific domain or by building one from scratch. Since this research focused
on the mobile phone domain, the initial ontology was constructed using ConceptNet,
a large semantic network consisting of a large number of concepts. The concepts
from ConceptNet were extracted up to level 4, as unrelated concepts were extracted
when the level increased further. But as ConceptNet lacks some related concepts,
specifications from official mobile phone Web sites were collected to expand the
ontology. Also, each node of the ontology was expanded by merging in synonym
words from the WordNet database.
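The WordNet synonym-merging step can be sketched with NLTK's WordNet interface as below; the ConceptNet extraction up to level 4 and the manual additions from phone specification sites are outside this sketch.

# Sketch of expanding an ontology node with WordNet synonyms. Only the
# synonym-merging step is shown; the ConceptNet extraction and the manual
# additions from specification sites are not covered here.
import nltk
from nltk.corpus import wordnet as wn

nltk.download('wordnet', quiet=True)

def expand_with_synonyms(feature_term):
    synonyms = {feature_term.lower()}
    for synset in wn.synsets(feature_term, pos=wn.NOUN):
        for lemma in synset.lemma_names():
            synonyms.add(lemma.replace('_', ' ').lower())
    return synonyms

# e.g. expand_with_synonyms('battery') also yields terms such as 'electric battery'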
The ontology consists of four main classes: review, feature, feature property and
sentiment. The review class contains review id and line id subclasses. The feature
class contains terms associated with the mobile phone domain; its subclasses are
application, battery, camera, display, general, price, services and speed, and the final
summary is generated according to these feature sets. These subclasses further
contain their own subclasses with more specific aspects. Each feature has an object
property of sentiment expectation. The feature property class contains the extracted
opinion words from the reviews and has the object property of sentiment. The
sentiment class contains only three instance values, 1, −1 and 0, which represent the
sentiment polarities positive, negative and neutral. The initial ontology is depicted in
Fig. 2.
The opinion about a product is expressed through a feature of the product. The
features for which the opinions are expressed are extracted during this process.
For example, considering the sentence 'This picture quality is awesome,'
'picture quality' is the feature of the product on which the sentiment is expressed.
The feature is always a noun or noun phrase. The feature and opinion extraction
process adopts a rule-based strategy. The rules are derived based on the type of
dependency relation and the POS tag pattern. The dependency relation is identified
with the Stanford Dependency Parser. Each relation contains three components: the
type of relation (R), the parent word (P) and the dependent word (D), denoted as
R(P, D). The feature-opinion pairs and new features of each sentence are extracted
using the algorithms in Figs. 4 and 5, where Ri is the ith dependency relation between
the two words Pi and Di of a single sentence. Let X =
'JJ/JJS/JJR/VB/VBG/VBD/VBP/VBN/VBZ/RB/RBR/RBS' and N = 'NN/NNS/NNP'.
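The full rule set is given in Figs. 4 and 5 and is not reproduced here; the sketch below shows only one representative pattern over dependency triples R(P, D), pairing a noun from N with a modifier from X.

# Simplified sketch of rule-based feature-opinion extraction over dependency
# triples R(P, D). Only one representative pattern is shown; the authors' full
# rule set is in Figs. 4 and 5.
X = {'JJ', 'JJS', 'JJR', 'VB', 'VBG', 'VBD', 'VBP', 'VBN', 'VBZ',
     'RB', 'RBR', 'RBS'}
N = {'NN', 'NNS', 'NNP'}

def extract_pairs(relations, pos_tags):
    """relations: list of (relation_type, parent, dependent) triples for one
    sentence; pos_tags: dict mapping each word to its POS tag."""
    pairs = []
    for rel, parent, dependent in relations:
        if rel not in ('amod', 'nsubj'):          # representative relations only
            continue
        words = (parent, dependent)
        nouns = [w for w in words if pos_tags.get(w) in N]
        opinions = [w for w in words if pos_tags.get(w) in X]
        if nouns and opinions:
            pairs.append((nouns[0], opinions[0]))  # (feature, opinion)
    return pairs

# Example: for 'This picture quality is awesome', nsubj(awesome, quality) with
# quality/NN and awesome/JJ yields the pair ('quality', 'awesome').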
The relevancy of a candidate opinion is checked with SentiWordNet, and the
ontology is updated accordingly. The extracted features are matched against the
constructed domain ontology to remove all insignificant and unrelated features. All
features are nouns/noun phrases, but not all nouns/noun phrases are features relevant
to the domain. So, the algorithm in Fig. 6 uses a probabilistic model (Eq. 1) to
identify the relevant features which are not currently included in the ontology. The
sentiment expectation of new features can be determined using the algorithm in
Fig. 7, which is then used to update the ontology.
The seed positive word list consists of strong positive words such as 'good,' 'nice,'
'positive,' 'fortunate,' 'correct,' 'excellent' and 'superior,' and the seed negative word
list consists of strong negative words such as 'bad,' 'nasty,' 'negative,' 'unfortunate,'
'wrong,' 'poor' and 'inferior'; these are used to calculate the sentiment expectation
of features (Fig. 7).
The Feature-Score uses a corpus of already collected mobile review sentences to
determine the feasibility of a candidate feature as a relevant feature. The
Feature-Score value is based on the mutual information between a candidate feature
and the list of existing features (Eq. 2). If the candidate feature is relevant, then the
category of the new feature within the ontology is decided by the PMI value (Eq. 3).
The new feature is listed under the feature category which shows the highest PMI
value among all the feature categories. Later on, such features are included under the
relevant category as an ontology update.
f (ai bi )
Feature - Score = log2 ×N (2)
i
f (a) f (bi )
where
a candidate feature
bi existing features
GoogleHitsCount(a, Ci )
PMI(a, Ci ) = log2 (3)
GoogleHitsCount(a) × GoogleHitsCount(Ci )
where a is the candidate feature and C i is the ith feature category under ontology.
After identifying the features and opinions, the subsequent stage is to determine the
sentiment of each opinion. The sentiment of unambiguous opinions can be retrieved
through the ontology using the sentiment expectation of the feature, while the
sentiment of new opinions is obtained through SentiWordNet. For opinions with
ambiguous sentiment orientations, the actual sentiment orientation (+1, −1, 0) of the
feature-opinion pair is calculated by multiplying the sentiment expectation of the
feature by the sentiment orientation of the opinion, since the sentiment orientation of
such an opinion changes according to the feature associated with it.
If any of the opinion words are associated with negation words such as 'not,' 'no,'
'nor' and 'none,' then the sentiment of the feature-opinion pair is changed: if a certain
feature-opinion pair depends on a negation word, the sentiment is inverted. A sketch
of this resolution step, together with the summarization step described next, is given
below.
All the feature-opinion sentiment details of each sentence are recorded in the ontology
as individuals. The final summarized sentiment of each feature category can then be
generated by issuing a SPARQL query to the ontology.
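A simplified sketch of the sentiment resolution and summarization steps is given below. The paper stores the feature-opinion-sentiment details as ontology individuals and aggregates them with a SPARQL query; plain Python dictionaries stand in for the ontology here, and the input structures are assumptions.

# Simplified stand-in for the last two steps: resolve the sentiment of each
# feature-opinion pair (ambiguous opinions take the sign of the feature's
# sentiment expectation, negation inverts the result) and aggregate the counts
# per feature category. The paper does this with ontology individuals and a
# SPARQL query; dictionaries are used here instead.
from collections import Counter, defaultdict

def resolve_sentiment(feature, opinion, negated, expectation, orientation, ambiguous):
    """expectation: +1/-1 sentiment expectation per feature (from the ontology);
    orientation: +1/-1/0 per opinion word (SentiWordNet / ontology);
    ambiguous: set of context-dependent opinion words such as 'long' or 'high'."""
    score = orientation.get(opinion, 0)
    if opinion in ambiguous:
        score = expectation.get(feature, 1) * score   # multiply by the expectation
    if negated:
        score = -score                                # negation inverts the pair
    return score

def summarize(pairs, expectation, orientation, ambiguous):
    """pairs: iterable of (feature_category, feature, opinion, negated)."""
    summary = defaultdict(Counter)
    for category, feature, opinion, negated in pairs:
        s = resolve_sentiment(feature, opinion, negated,
                              expectation, orientation, ambiguous)
        label = 'positive' if s > 0 else 'negative' if s < 0 else 'neutral'
        summary[category][label] += 1
    return summary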
The experimental evaluation was carried out on the dataset derived from [35], which
was originally collected from Amazon.com. The dataset contains 69 reviews and 554
sentences in the mobile phone domain. Precision and recall are selected as the
evaluation metrics, as they are commonly used in information retrieval and document
classification research. Precision is the proportion of the number of appropriately
classified items to the total number of items that were classified. Recall is the
proportion of the number of appropriately classified items to the total number of
items classified in the same category in the annotated dataset.
It can be justified that the proposed approach has good recall and precision in
predicting features with both positive and negative opinions. The high recall and
precision values show the ability to extract all the relevant feature-sentiment pairs,
and the high F-score value shows the accuracy of the approach (Table 1).
5 Conclusion
Customer reviews are a rich source of information for other consumers and sellers.
Reading all the reviews is time-consuming, and it is not easy to determine which
information can help in purchasing a product. A preferable choice is to evaluate
products based on domain-related features. Sentiment analysis is the process of
analyzing user-generated content as positive, negative or neutral. This paper presents
an approach for summarizing reviews by considering the feature and sentiment word
pairs extracted from them. The framework allows collecting consumer reviews from
various sources on a specific domain and collectively analyzing the sentiment of
products under different features. The extracted features are compared with the
existing features of the domain ontology, and the ontology is updated with new
features, feature sentiment orientations, opinion words and feature-opinion pairs. The
summary is generated for all the available reviews under each feature category.
Further research suggests generating an overall sentiment summary for each review
and considering the polarity of opinions instead of the sentiment orientation.
Table 1 Evaluation results (feature sentiment, Pos / Neg)
# of annotated feature sentiments: 310 / 148
# of features extracted: 295 / 125
# of correct feature sentiments: 270 / 118
Recall: 95.2% / 84.5%
Precision: 91.5% / 94.4%
F-Measure: 93.3% / 89.2%
References
1. Local consumer review survey. The impact of online reviews, BrightLocal. https://www.bright
local.com/learn/local-consumer-review-survey/
2. Asghar MZ, Khan A, Ahmad S, Kundi FM, Khairnar J et al (2014) A review of feature
extraction in sentiment analysis. Int J Comput Sci Inf Technol (IJCSIT) 5(3):4081–4085
3. Joshi NS, Itkat SA (2014) A survey on feature level sentiment analysis. Int J Comput Sci Inf
Technol (IJCSIT) 5(4):5422–5425
4. Kolkur S, Dantal G, Mahe R (2015) Study of different levels for sentiment analysis. Int J Curr
Eng Technol 5(2)
5. Liu B (2010) Sentiment analysis and subjectivity. In: Indurkhya N, Damerau FJ (eds) Handbook
of natural language processing, 2nd edn.
6. Taboada M, Brooke J, Tofiloski M, Voll K, Stede M (2011) Lexicon-based methods for
sentiment analysis, vol 37. Association for Computational Linguistics
7. Esuli A, Sebastiani F (2016) SENTIWORDNET: a publicly available lexical resource for
opinion mining. In: Proceedings of the 5th conference on language resources and evaluation
(LREC’06), pp 417–422
8. Liu B, Blasch E, Chen Y, Shen D, Chen G (2013) Scalable sentiment classification for big data
analysis using Naive Bayes classifier. In: IEEE international conference on big data
9. Zhang L, Ghosh R, Dekhil M, Hsu M, Liu B (2015) Combining Lexicon-based and learning-
based methods for twitter sentiment analysis. In: National conference on advanced technologies
in computing and networking, pp 89–91
10. Nandhini A, Vaitheeswaran G, Arockiam L (2015) A hybrid approach for aspect based
sentiment analysis on big data. Int Res J Eng Technol (IRJET) 2:815–819
11. Govindarajan M (2013) Sentiment analysis of movie reviews using hybrid method of Naive
Bayes and genetic algorithm. In: Int J Adv Comput Res 3(4)
12. Pedro P, Filho B, Pardo TAS (2013) A hybrid system for sentiment analysis in twitter messages.
In: Second joint conference on lexical and computational semantics, vol 2: Seventh international
workshop on semantic evaluation (SemEval 2013), pp 568–572
13. Kaushik C, Mishra A (2014) A scalable, lexicon based technique for sentiment analysis. Int J
Found Comput Sci Technol (IJFCST) 4(5)
14. Shinde-Pawar M (2014) Formation of smart sentiment analysis technique for big data. Int J
Innov Res Comput Commun Eng 2:7481–7488
15. Rotovei D (2016) Multi-agent aspect level sentiment analysis in CRM systems. In: 18th
International symposium on symbolic and numeric algorithms for scientific computing
(SYNASC)
16. Hu M, Liu B (2004) Mining and summarizing customer reviews. In: Proceeding of 10th ACM
SIGKDD international conference on knowledge discovery and data mining. ACM, Seattle,
WA, USA, pp 168–177 (2004)
17. Schouten K, Frasincar F (2016) Survey on aspect-level sentiment analysis. IEEE Trans Knowl
Data Eng 28(3):813–830
18. Popescu AM, Nguyenand B, Etzion O (2005) Extracting product features and opinions
from reviews. In: Proceedings of the conference on human language technology and empir-
ical methods in natural language processing. Association for Computational Linguistics,
Vancouver, British Columbia, Canada, pp 339–346
19. Carenini, G, Ng RT, Zwart E (2005) Extracting knowledge from evaluative text. In: Proceedings
of the 3rd international conference on knowledge capture. ACM Banff, Alberta, Canada, pp
11–18
20. Holzinger W, Krupl B, Herzog M (2006) Using ontologies for extracting product features from
web pages. In: Proceedings of the 5th international semantic web conference. Athens, Georgia,
USA, pp 286–299
21. Ben-David S, Blitzer J, Crammer K, Pereira F (2007) Analysis of representations for domain
adaptation. Adv Neural Inform Process Syst 19
Feature-Wise Opinion Summarization of Consumer Reviews Using Domain Ontology 599
22. Ferreira L, Jakob N, Gurevych I (2008) A comparative study of feature extraction algorithms in
customer reviews. In: Proceedings of the IEEE international conference on semantic computing.
Santa Clara, CA, pp 144–151
23. Zhai Z, Liu B, Xu H, Jia P (2011) Clustering product features for opinion mining. In: Proceed-
ings of the fourth ACM international conference on web search and data mining. Hong Kong,
China, pp 347–35
24. The importance of sentiment analysis in social media—Results 2Day, Results 2Day. https://
results2day.com.au/social-media-sentiment-analysis-2
25. Mosha C (2010) Combining dependency parsing with shallow semantic analysis for chinese
opinion-element relation identification. IEEE, pp 299–305
26. Lin C, He Y (2009) Joint sentiment/topic model for sentiment analysis. In: Proceedings of the
18th ACM conference on information and knowledge management (CIKM)
27. Alkadri AM, ElKorany AM (2016) Semantic feature based arabic opinion mining using
ontology. Int J Adv Comput Sci Appl vol. 7(5):577–583
28. Lin S, Han J, Kumar K, Wang J (2018) Generating domain ontology from Chinese customer
reviews to analysis fine-gained product quality risk. In: Proceedings of the 2018 international
conference on computing and data engineering, pp 73–78
29. Wang J, Ren, H. Feature-based customer review mining. https://www.researchgate.net/public
ation/242227984_Feature-based_Customer_Review_Mining
30. Mukherjee S, Joshi S (2013) Sentiment Aggregation using concept net ontology. In:
Proceedings of the sixth international joint conference on natural language processing, pp
570–578
31. Sureka A, Goyal V, Correa D, Mondal A (2010) Generating Domain specific ontology from
common-sense semantic network for target specific sentiment analysis
32. Wang BB, McKay RIB, Abbass HA, Barlow M (2003) A Comparative study for domain
ontology guided feature extraction. In: Proceedings of the 26th Australasian computer science
conference, vol 16, pp 69–78
33. Vicient, C, Sánchez, D, Moreno A (2011) Ontology-based feature extraction. In: Proceedings
of the 2011 IEEE/WIC/ACM international joint conference on web intelligence and intelligent
agent technology—Workshops, WI-IAT
34. Turney PD (2001) Mining the web for synonyms: PMI-IR versus LSA on TOEFL. In:
Proceedings of the twelfth European conference on machine learning. Springer, Berlin, pp
491–502
35. Hu M, Liu B (2004) Mining and summarizing customer reviews. In: Proceedings of the ACM
SIGKDD international conference on knowledge discovery and data mining (KDD-2004)
Machine Learning-Based Approach
for Opinion Mining and Sentiment
Polarity Estimation
1 Introduction
With the evolution of the Internet and people's enthusiasm to perform tasks with a few clicks, the way vendors and customers carry out their buying and selling has also changed. Physical stores and businesses moved to the web, extending their services and capabilities worldwide by occupying a part of cyberspace rather than being limited to a single physical location, thereby creating e-markets and e-commerce. These systems can promote businesses and organizations online and
enable customers to perform their tasks with minimal human intervention [1]. This has allowed vendors to expand their businesses beyond geographical limitations and acquire a huge customer base worldwide more efficiently.
As many physical stores began to move to the web, merely being part of the web did not fulfil the expectations of businesses, and huge competition emerged between online businesses. They therefore strive to achieve competitive advantage, attract more customers and increase their sales. By increasing customer satisfaction, businesses can gain more customer attention, which shows that e-retailers should provide e-satisfaction to build a good customer base [2].
The number one goal of many leading businesses worldwide is to make sure their customers are satisfied [3]. To satisfy customers, it is vital to provide the service they expect, offer value for their money and facilitate their tasks. It should be easy for customers to find what they want, with good quality, in less time and through a less exhausting process. For that, most online businesses use recommender systems to find the right product for the right customer at the right time [4]. These systems can increase the quality of the decisions that customers make while searching for and selecting a product by reducing the information overload and complexity involved [5].
Product recommendation systems can be seen as filtering tools that use data, pattern recognition and filtering to suggest the most relevant items to a particular user [6]. Such systems work on specific parameters like ratings or product properties, and they face the challenge of giving high-quality, relevant recommendations without errors and in reduced time, which in turn increases traffic to the website [7]. Recommendation systems use various data mining and machine learning algorithms [7]. As there is huge competition between online businesses, the quality and accuracy of the product recommendation system often decides the winner in online sales nowadays.
It has been found that people read reviews given by other customers before purchasing. A survey by Myles Anderson shows that 88% of consumers trust online reviews as much as personal recommendations [8].
This paper proposes a new model for product recommendation that rates products according to the reviews given by users. In this method, the textual reviews given by previous customers of a particular product are considered, the products are ranked and sorted according to the cumulative score given to each product, and recommendations are made as the customer searches by keywords. Because reviews come from many users and provide practical opinions with a wide view, this method can suggest good-quality products and provides multi-user recommendation in line with the current trend in product recommendation. The architecture of the proposed model, the method followed to conduct the research and the results are discussed in this paper.
2 Related Work
3 Methodology
Data preprocessing is first done to bring the data into an applicable condition. Then sentiment values for the review text are calculated and the sentiment polarity is estimated. Next, some insights into the dataset and the behavior of some variables in relation to others are measured. Supervised learning methods are applied to the dataset and the learning algorithm with the highest accuracy is identified. After this, a product ranking method that considers the sentiment value of user reviews is applied. The next step is model validation, since high accuracy is a vital characteristic of a recommender system. Python is used as the programming language for the research, and Microsoft Azure Notebooks (Jupyter) is used as the development environment.
Fig. 1 Methodology
To validate that the outcome of this research provides value to society, and to gather information and perspectives from a wider audience, a survey was conducted in the form of an online questionnaire. The design of the questions is shown in Table 2.
Here, the customer product reviews given by users are extracted in their native format. The extracted dataset contains more than 400,000 reviews from Amazon.com about different brands of unlocked mobile phones. The dataset carries basic product information, price, ratings and review vote ratings, and was extracted using the web scraper PromptCloud.
The Python programming language is used in analyzing and implementing the model. The dataset contains 400 thousand user reviews on 4410 products from 385 brands.
Data preprocessing was a vital part of this study, as this is where the words most relevant to sentiment analysis are extracted from the review text. It was done in six major steps, and this preprocessing mechanism increased the prediction accuracy by nearly 40%, as shown in Fig. 2. These steps were carried out using the Python programming language; a minimal sketch is given after the step descriptions below.
HTML Element Removal As the data was scraped directly from the web, there
could be some HTML tags included in the reviews; they will make no sense when it
comes to sentiment analysis.
Special Characters Removal Here, stars, dots, commas and all other special
characters are removed from review text.
Fig. 2 Performance results without data pre-processing and after data pre-processing
Non-word Entry Removal Here, the numerical entries in reviews are removed,
such as model numbers, price information.
Stop Word Removal Stop words like prepositions are removed from the reviews,
e.g., a, the, on etc.
Null Value Removal This is done in the later part of the preprocessing activities, as the earlier steps can leave some reviews empty after removing several words from the review text.
Lemmatization Here, the words used in review texts are reduced to their common root word for analysis purposes, e.g., running and ran are reduced to the root word run.
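A minimal sketch of these six steps is shown below; it assumes NLTK's stop word list and WordNet lemmatizer are available, and the column name Reviews is hypothetical rather than taken from the paper.

```python
# Sketch of the six preprocessing steps; assumes NLTK resources are downloaded
# (nltk.download('stopwords'), nltk.download('wordnet')) and a pandas DataFrame
# with a hypothetical 'Reviews' column.
import re
import pandas as pd
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def clean_review(text: str) -> str:
    text = re.sub(r"<[^>]+>", " ", text)           # 1. HTML element removal
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)    # 2. special character removal
    text = re.sub(r"\b\d+\b", " ", text)           # 3. non-word (numeric) entry removal
    tokens = [t for t in text.lower().split() if t not in stop_words]  # 4. stop word removal
    tokens = [lemmatizer.lemmatize(t, pos="v") for t in tokens]        # 6. lemmatization
    return " ".join(tokens)

df = pd.DataFrame({"Reviews": ["<p>Running great, 5 stars!!!</p>", ""]})
df["clean"] = df["Reviews"].map(clean_review)
df = df[df["clean"].str.strip() != ""]             # 5. null/empty review removal
```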
Review texts are analyzed and their positivity or negativity is identified so that supervised learning methods can be applied to the dataset.
Here, the most accurate algorithm for predicting product scores based on sentiment polarity has to be proposed. In this research, several classification algorithms are used and their results collected; by comparing the accuracy of each algorithm, the most appropriate one is proposed. In this study, the random forest classifier, decision tree classifier, K-neighbors classifier, AdaBoost classifier and an ensemble algorithm are used.
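A hedged sketch of how such a comparison could be set up with scikit-learn is shown below; the TF-IDF representation, the 80/20 split and the load_clean_reviews helper are assumptions for illustration, not the setup described in the paper.

```python
# Sketch: compare several classifiers on labelled review text.
# The TF-IDF vectorization and the 80/20 split are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

texts, labels = load_clean_reviews()   # hypothetical helper: cleaned texts and 0/1 polarity labels
X = TfidfVectorizer(max_features=5000).fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.2, random_state=42)

models = {
    "Random Forest": RandomForestClassifier(),
    "Decision Tree": DecisionTreeClassifier(),
    "K-Neighbours": KNeighborsClassifier(),
    "AdaBoost": AdaBoostClassifier(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name)
    print(classification_report(y_test, model.predict(X_test)))
    print("Cross-validation accuracy:", cross_val_score(model, X, labels, cv=5).mean())
```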
The proposed algorithm is trained, and the sentiment polarity of each review is predicted with it.
An average value is assigned to each product by considering the sentiment polarity of each review of that particular product.
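This per-product averaging can be expressed, for example, as a simple aggregation over predicted polarities; the sketch below assumes hypothetical product_id and predicted_polarity columns.

```python
# Sketch: rank products by the mean predicted sentiment polarity of their reviews.
# The column names 'product_id' and 'predicted_polarity' are hypothetical.
import pandas as pd

reviews = pd.DataFrame({
    "product_id": ["P1", "P1", "P2", "P2", "P2"],
    "predicted_polarity": [1, 1, 0, 1, 0],   # 1 = positive review, 0 = negative review
})
product_scores = (reviews.groupby("product_id")["predicted_polarity"]
                  .mean()
                  .sort_values(ascending=False))
print(product_scores)   # highest-scoring products are recommended first
```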
The questionnaire was completed by 164 participants, all of whom are students of foreign and local universities following computer science-related courses. The results show that 69.5% of respondents had heard of product recommendation systems, while 93.9% had done online shopping, and 89.9% had received poorer products than expected while shopping online. The answers to the sixth question show that 68.9% do not trust the star ratings given for products, and 98.8% stated that they refer to real user comments before buying online even when the product has high star ratings. The majority (75.6%) stated that they read 5 to 10 reviews prior to purchase, 78% stated that reading and analyzing reviews by themselves is hard and could be misleading, and 92.1% accept that it would be helpful if a ranking system analyzed user reviews and recommended products. 85.9% stated that they would trust products ranked on the basis of real user comments.
In this study, as described earlier, the random forest classifier, decision tree classifier, K-neighbors classifier and AdaBoost classifier are used, and their results were recorded as shown in Table 3. According to the results, the K-neighbors classifier is proposed as the most appropriate algorithm, since it has the highest precision, recall, cross-validation precision and F1-measure, as well as the lowest mean error. The comparison of results is shown graphically in Fig. 3.
Fig. 3 Comparison of cross-validation accuracy, F1-score, mean absolute error, precision, recall and root mean square error for the random forest, decision tree, K-neighbours and AdaBoost classifiers
An ensemble approach, which tends to increase accuracy, is also used to predict the sentiment values. The comparison of its results with those of the K-neighbors classifier, shown in Fig. 4, confirms that K-neighbors still has the highest accuracy, and that algorithm was therefore proposed as the best algorithm for predicting the sentiment polarity of reviews.
For the evaluation of the model, evaluation mechanisms that consider different types of measurements, such as accuracy and coverage, are used. Accuracy measures the number of correct recommendations divided by all possible recommendations. Coverage is the number of objects considered divided by the number of objects in the search space. Accuracy measurements are categorized into statistical and decision-support accuracy metrics. Statistical accuracy metrics evaluate the model by comparing predicted recommendations with actual user ratings. Correlation, mean absolute error (MAE) and root mean square error (RMSE) are the statistical accuracy metrics used here.
$$\text{MAE} = \frac{1}{N}\sum_{u,i}\left|P_{u,i} - R_{u,i}\right| \quad (1)$$
Here, $P_{u,i}$ is the predicted rating for user $u$ on item $i$, $R_{u,i}$ is the actual rating and $N$ is the total number of ratings on the item set. The lower the MAE the better, and the same holds for the RMSE.
$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{u,i}\left(p_{u,i} - r_{u,i}\right)^{2}} \quad (2)$$
$$F\text{-measure} = \frac{2PR}{P + R} \quad (5)$$
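A minimal sketch of these accuracy metrics, assuming predicted and actual ratings are available as NumPy arrays, is shown below; the arrays and the precision/recall values are placeholders.

```python
# Sketch of MAE, RMSE and F-measure as defined in (1), (2) and (5);
# the example arrays and P/R values are illustrative placeholders.
import numpy as np

predicted = np.array([4.0, 3.5, 5.0, 2.0])
actual    = np.array([4.5, 3.0, 5.0, 1.0])

mae  = np.mean(np.abs(predicted - actual))            # equation (1)
rmse = np.sqrt(np.mean((predicted - actual) ** 2))    # equation (2)

precision, recall = 0.90, 0.85                        # placeholder P and R values
f_measure = 2 * precision * recall / (precision + recall)   # equation (5)
print(mae, rmse, f_measure)
```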
Figure 5 shows the final results of the system, which ranks products for recommendation by analyzing all textual reviews given by customers for a particular product with the KNN classifier, identifying the positivity or negativity of each review, and finally calculating the product score from all reviews.
Acknowledgements The authors of this study thank all the lecturers of the Faculty of Applied Sciences who helped in making this work successful, and are much thankful to the Sabaragamuwa University of Sri Lanka for encouraging the research.
References
1. Mahmood SMF (2016) E-commerce, online shopping and customer satisfaction: an empirical
study on e-commerce system in Dhaka
2. Szymanski D, Hise R (2000) E-satisfaction: an initial examination. J Retail 76(3):309–322
3. Fieldboom.com (n.d.) Customer satisfaction survey. https://www.fieldboom.com/customer-sat
isfaction-surveys
4. Wang J, Zhang Y (2013) Opportunity model For E-commerce recommendation. In: Proceedings
of the 36th international ACM SIGIR conference on research and development in information
retrieval—SIGIR’13, pp 303–312
5. Spiekermann S (2001) Online information search with electronicagents: drivers, impediments,
and privacy issues. Unpublished doctoral dissertation. Humboldt University, Berlin
6. MacKenzie I, Meyer C, Noble S (2013) How retailers can keep up with consumers. McKinsey
& Company. https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-
up-with-consumers
7. Vaidya N, Khachane AR (2017) Recommender systems-the need of the ecommerce ERA. In:
2017 International conference on computing methodologies and communication (ICCMC),
Erode, 2017, pp 100–104. https://doi.org/10.1109/ICCMC.2017.8282616
8. Anderson M (2014) 88% of Consumers trust online reviews as much as personal recommenda-
tions. Search Engine Land. https://searchengineland.com/88-consumers-trust-online-reviews-
much-personal-recommendations-195803
9. Rao KY, Murthy GSN, Adinarayana S (2017) Product recommendation system from users
reviews using sentiment analysis
10. Abdul Hassan A, Abdulwahhab A (2017) Reviews sentiment analysis for collaborative recommender system. Kurdistan J Appl Res 2(3):87–91
11. Elmurngi E, Gherbi A (2017) An empirical study on detecting fake reviews using machine
learning techniques. 107–114. https://doi.org/10.1109/INTECH.2017.8102442.
12. Ramadhan WP, Novianty STMTA, Setianingsih STMTC (2017) Sentiment analysis using multi-
nomial logistic regression. In: 2017 International conference on control, electronics, renewable
energy and communications (ICCREC), Yogyakarta, 2017, pp 46–49. https://doi.org/10.1109/
ICCEREC.2017.8226700
13. Das B, Chakraborty S (2018) An improved text sentiment classification model using TFIDF
and next word negation. Cornell University Library, Computation and Language
14. Bhavitha BK, Rodrigues AP, Chiplunkar NN (2017) Comparative study of machine learning
techniques in sentimental analysis. In: 2017 International conference on inventive communi-
cation and computational technologies (ICICCT). IEEE, pp 216–221
15. Athanasiou V, Maragoudakis M (2017) A novel, gradient boosting framework for sentiment
analysis in languages where NLP resources are not plentiful: a case study for Modern Greek.
Algorithms 10(1):34
Early Detection of Diabetes by Iris Image
Analysis
Abstract Diabetes has become a global problem due to changing lifestyles, daily
eating habits, level of stress encountered by people, etc. According to statistics of
the World Health Organization (WHO) in 2016, 8.5% of the adult population of the
world is suffering from diabetes. Therefore, early detection of diabetes has become a
global challenge. The iris of the human eye depicts a picture of the health condition
of the bearer. Iridology is a method conceived decades ago that focuses on the study
of iris patterns such as texture, structure and color for diagnosis of various diseases.
By analyzing the images of human iris, a medical imaging method was explored with
computer vision for the identification of diabetes. Iris analysis of the human eye is
conducted based on the pancreas, kidney and the spleen of the human body where
the local datasets were collected using a Digital Single Lens Reflex (DSLR) camera.
A low-cost diabetes detection system was created, focusing on localization, segmentation and normalization, and the system predicts the severity of diabetes with 85% accuracy.
1 Introduction
Due to the changes in the present lifestyle, eating habits, stress level and lack of
exercises, the entire humankind is facing various health issues. Specially, diabetes
is spreading at an alarming rate among people according to statistics of the World
Health Organization (WHO). The global report on diabetes and 2018 statistics of
WHO [1] reported that 8.5% of the adult population is suffering from diabetes and the
percentages are on the rise. Sri Lanka is not an exemption from the global picture, and
according to reports of International Diabetes Federation (IDF) 2015, the prevalence
of diabetes among adults in Sri Lanka is also 8.5% and 1 in 12 adults suffer from
diabetes. According to WHO 2016 report, 7% of total deaths in all age categories
in Sri Lanka are due to diabetes and related complications. In addition, long-term
diabetes leads to damages in blood vessels in the heart, brain, legs, eyes, kidneys,
nerves system, etc.
Diabetes can be identified using various medical tests such as blood pressure tests, fasting blood sugar tests, random blood sugar tests and the oral glucose tolerance test [2]. However, with all these methods, diabetes is detected accurately only when it reaches a certain level of maturity, at which point it is hard to cure. In addition, diabetes is the leading cause of blindness in humans in various ways, such as cataract, glaucoma and damage to the blood vessels inside the eye. At that stage, a simple method of curing is not possible; hence, a lifelong tedious, costly and life-risking medical process has to be followed. Therefore, early detection of diabetes is one of the most essential health requirements at present. This research introduces an alternative method for the pre-identification of diabetes by analyzing the iris of the human eye.
The iris is a part of the expansion of the nervous system and brain and consists
of thousands of nerve endings, blood vessels, connective tissues and nerve impulses
[3]. The nerve fibers in the iris of the eye receive impulses from the rest of the body
via the spinal cord, optic nerve and optic thalami. It is identified that iris of the human
eye represents the health conditions of various body organs; hence, by analyzing the
human eye iris, the health condition of various body organs can be detected easily.
Iridology [4] is an ancient medicine technique, which provides an alternative to
conventional medical diagnosis methods. Iridology makes use of color, the structure
of tissues, shape, patterns, pigments and several features of human eye iris to predict
abnormalities in the body organs of an individual. It can indicate a "pre-disease state" through pigment abnormalities in the iris before conventional medical diagnosis. The location of the abnormality in the iris is related to the medical condition of the relevant body organ; hence, by analyzing these pigmentation abnormalities, various health conditions can be identified.
Iridology encourages healthy behavior and caution for the prevention of diseases throughout all stages of life, and paves the way for a non-invasive, automated, accurate and preventive healthcare model. In this research, computer vision and artificial intelligence were linked with iridology to predict diabetes at an early stage by examining the diseased iris for tissue changes, spots and pigmentations corresponding to the pancreas, kidney and spleen. An automated and non-invasive system was developed based on image acquisition of the iris, image pre-processing, localization, segmentation, normalization, region-of-interest extraction according to the iridology chart, and feature extraction.
A convolutional neural network was used for classification. Section 2 of this paper
contains the literature survey on similar researches, Sect. 3 is on methodology, Sect. 4
contains results and discussion and Sect. 5 is on conclusion and future work.
1.1 Background
This section explains the background of the related work regarding iris recognition
and diagnosing of diabetes.
Eyes have long been known as the "windows of the soul," but very few people realize how accurate this observation is. People usually know that diabetes can cause temporary blurred vision, and that it may lead to severe, permanent vision loss and increase the risk of developing glaucoma and cataracts. Scientifically, it has been identified that patients with diabetes have a variety of symptoms, including diabetic retinopathy, cataracts, diabetic macular edema (DME) and glaucoma. That diabetes can be identified at an early stage by observing the eyes is a lesser known fact.
Iridology is an alternative medical technology that examines colors, patterns
and other features of the iris of the eye that can be used to determine
the patient’s systemic health condition. Iridologists compare their observations
with iridology charts. These iridology charts contain different zones that correspond
to specific parts of the human body. Jensen [5], who drew the iridology chart given
in Fig. 1 describes that any specific changes in body tissues due to various reasons
are indicated by corresponding nerve fibers in the iris. This states that the features
of iris vary considerably according to the physical changes of the body organs.
Fig. 1 Iris chart for the right iris and left iris in eyes
The iris can be defined as an extension of the brain which consists of microscopic
blood vessels, connective tissues and thousands of nerve endings that are attached to
each tissue of the body through the nervous system and the brain.
The nerve fibers receive their impulses from the optic nerve, spinal cord and optic
thalami. Therefore, what is revealed in the iris is a reflex condition of the body by
showing up as lesions or marks and color variations in the iris. This is why the eyes
have been called “the windows of the soul.”
The research problem addressed here is how diabetes can be identified at an early stage. The aim of the research is to devise a mechanism that uses a non-invasive, automated and accurate alternative medicine technique to detect diabetes early. The objectives of the research are to learn the concepts, methods and techniques of the alternative medicine technique iridology; study the changes in the features of the iris with respect to diabetes; design and implement an algorithm for iris recognition and develop a system; evaluate the developed system against a benchmark dataset; and apply the evaluated system to a local dataset to predict diabetes.
2 Related Work
In this research, related literature is being reviewed in order to find out the work
carried out in iris analysis with iridology. This section summarizes various works
done in this direction along with tools used with the organ related.
The pancreas was used as the organ for diagnosing diabetes in Ref. [6]. The research aimed to evaluate iridology as a diagnostic method for diabetes. A database of 200 subjects was used. Localization, segmentation, rubber-sheet normalization and ROI extraction were carried out as image pre-processing, and statistical, textural and 2D-DWT features were extracted from both eyes. For classification, six classification algorithms were assessed, and a maximum accuracy of 89.66% was achieved by the random forest (RF) classifier.
The gallbladder was used as the organ to diagnose type II diabetes in Ref. [7]. The research aimed to apply iris image analysis for clinical diagnosis in an efficient manner and determine the health status of organs. After iris image acquisition, noise removal and enhancement were performed before analysis. The iris was obtained by subtracting the pupil from the sclera, and normalization was carried out to convert the circular region to a rectangular shape. The ROI was identified by visual inspection as per the iridology chart, and various features were extracted from both irises. For classification, the support vector machine (SVM) method was used.
In Kale et al. [8], various methods and techniques that have already been implemented are put together for the detection of diabetes. It can be seen from the survey that
3 Methodology
Initially, a dataset of iris images of 100 persons (50 diabetic and 50 non-diabetics) was
taken by using a Canon Digital Single Lens Reflex (DSLR) camera with a 50 mm
macrolens. Color images of size 5184 × 3456 of both irises on both eyes were
captured simultaneously. Thereafter, pre-image processing, post-image processing
and various methods and techniques of machine learning were applied to the system.
After noise removal, localization and normalization were carried out and then each
iris was converted into a 2D array. Several regions of interest (ROIs) were extracted from the left and right eyes as per the iridology chart of Dr. Bernard Jensen, corresponding to the positions of the pancreas, kidney and spleen in both irises. For classification, a convolutional neural network was used. The system was tested using the local dataset as well as the UBIRIS [10] standard dataset.
The system is divided into several stages. The process described in the previous
section is captured into the top-level diagram given in Fig. 2.
The eye images were captured with the help of a Canon EOS 700D DSLR camera with a 50 mm macrolens and stored in the database, which contained normal as well as abnormal iris images (Figs. 3 and 4).
Fig. 4 50 mm macrolens
Pre-processing refers to transforming the image of the eye in such a way that the desired features can be extracted. It can be divided into three steps: noise reduction, iris localization and iris normalization.
The dataset contained noise such as salt-and-pepper noise, Poisson noise, Gaussian noise and various reflections. Removal of noise is essential for analyzing an image with a more refined dataset than the raw one; hence, noise removal was done using a Wiener filter.
The Wiener filter in the Fourier domain is

$$G(x, y) = \frac{D^{*}(x, y)\,R_{s}(x, y)}{|D(x, y)|^{2}\,R_{s}(x, y) + R_{n}(x, y)} \quad (1)$$

The term $R_{n}/R_{s}$ is the reciprocal of the signal-to-noise ratio (Figs. 5 and 6).
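A minimal sketch of this denoising step, using SciPy's adaptive Wiener filter on a grayscale iris image, is shown below; the file names and window size are hypothetical assumptions, not values from the paper.

```python
# Sketch: adaptive Wiener filtering of a grayscale iris image.
# File names and the 5x5 local window are illustrative assumptions.
import cv2
import numpy as np
from scipy.signal import wiener

eye = cv2.imread("iris_sample.jpg", cv2.IMREAD_GRAYSCALE).astype(float)
denoised = wiener(eye, mysize=5)                           # local adaptive Wiener filter
cv2.imwrite("iris_denoised.jpg", np.clip(denoised, 0, 255).astype("uint8"))
```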
The first step in iris segmentation is finding the inner and outer boundaries of the iris. Over the past few decades, various algorithms have been proposed by researchers for segmenting the iris from the eye. Daugman presented the integro-differential operator for iris segmentation [11]; the operator searches for the circular boundary to distinguish the iris and separate it from the sclera and pupil. Wildes [11] proposed a segmentation algorithm to find the outer center and radius of the iris region using Hough transform theory.
In the method proposed by Wildes, the magnitude of the image intensity gradient is thresholded to obtain the edge map of the image.
$$G(x, y) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{(x - x_{0})^{2} + (y - y_{0})^{2}}{2\sigma^{2}}}$$

$G(x, y)$ is used as a Gaussian smoothing function with scaling parameter $\sigma$ selecting the appropriate scale of iris edge analysis. The Hough transform is used for the iris contour to maximize the voting process of the edge map. The maximum point in the Hough space corresponds to the radius $r$ and center coordinates $x_c$ and $y_c$ of the circle. This can be defined according to the equation:
The parabolic Hough transform of Wildes’ [12] was used to detect the eyelids and
eyelashes by approximating the upper and lower eyelids with parabolic arcs (Fig. 7).
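A hedged sketch of circular-boundary localization with OpenCV's Hough circle transform is given below; the blur size, accumulator and radius parameters are assumptions to be tuned for the dataset, not the values used in the paper.

```python
# Sketch: locate circular pupil/iris boundaries with the Hough circle transform.
# Parameter values (blur size, dp, minDist, thresholds, radii) are illustrative assumptions.
import cv2
import numpy as np

gray = cv2.imread("iris_denoised.jpg", cv2.IMREAD_GRAYSCALE)
blurred = cv2.medianBlur(gray, 5)
circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=100,
                           param1=100, param2=30, minRadius=20, maxRadius=120)
if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        cv2.circle(gray, (int(x), int(y)), int(r), 255, 2)   # draw detected boundary
cv2.imwrite("iris_circles.jpg", gray)
```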
Iris normalization was done to convert the iris eye image into fixed dimensions
(Fig. 8).
where x is every pixel of the image. For both the standard University of Beira Interior (UBIRIS) benchmark dataset and the local dataset, the iris detection system with features extracted from normalized iris images has much better accuracy than the one with features extracted from iris images without normalization. Figure 9 shows the comparison of the equal error rate (EER) for feature extraction with and without normalization. Database A consists of the UBIRIS free dataset and database B consists of the local dataset.
In this stage, processes such as selection of organs, ROI extraction, feature extraction and classification were carried out.
When people develop diabetic conditions, the systems of the body change, and some organs are affected more than others. Since all these
organs are directly connected to the iris, any changes in these organs are reflected in the iris. Therefore, this research explores the analysis of the iris image with respect to the kidney, pancreas and spleen for the early detection of diabetes.
Pancreas
The pancreas was selected as a diabetes-related organ because it is responsible for producing the hormone insulin, which regulates the level of glucose in the blood. People suffering from diabetes experience a build-up of glucose in their blood. If the pancreas is not functioning properly due to insulin deficiency, the level of glucose in the blood rises, which requires medication. When the pancreas is affected, the nerves carry that signal to the iris, the fibers restructure according to the changed metabolism, and the symptoms are shown via the iris.
Kidney
The kidney was selected as a diabetes-related organ because high blood glucose damages the blood vessels in the kidney. When the blood vessels are injured, they do not function well; hence, an abnormality of the kidney can be detected through analysis of the iris.
Spleen
The spleen is a fist-sized organ of the lymphatic system that behaves similarly to the pancreas; it normally operates as a filter for blood. Researchers have found that spleen enlargement can indicate diabetes; hence, the spleen was also selected as a diabetes-related organ.
ROI extraction was done to crop the particular portions of regions in the pancreas,
spleen and kidney from normalized iris eye image according to the iris chart as shown
in Fig. 10.
There are many features, such as tissue changes, pigmentations and orange-colored dots, that can be considered for diagnosing diabetes, but only some of them give more accurate results than the others. Color moments and texture features are known to give a better prediction of diabetes. The iris contains rich texture information, and the breaking of iris tissues is directly accompanied by changes in texture features.
3.4.4 Classification
For classification, a convolutional neural network (CNN) has been used. The CNN automatically learns feature extraction; hence, the time and effort needed for implementation can be minimized. The CNN algorithm works well on image data and is flexible. The CNN basically has three layers: an input layer, an output layer and hidden layers (Conv2D, MaxPool2D, AvgPool2D, batch normalization, reshape, concatenate, dense, DepthwiseConv2D and ReLU).
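A minimal sketch of such a CNN classifier in Keras is shown below; the 64 × 64 grayscale ROI input size, layer widths and binary diabetic/non-diabetic label are placeholders, since the exact architecture used in the paper is not reported here.

```python
# Sketch: small CNN for classifying iris ROI patches as diabetic / non-diabetic.
# Input size (64x64x1) and layer widths are illustrative assumptions.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPool2D((2, 2)),
    layers.BatchNormalization(),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPool2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),    # predicted probability of diabetes
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_rois, train_labels, validation_split=0.2, epochs=20)  # data loading omitted
```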
This section presents the results of this research, followed by a discussion. The discussion is based on information gathered through a questionnaire administered to human subjects. Various charts have been drawn to explain the results.
A database of iris images of healthy people and people with diabetes was created. Iris images of 100 subjects, 50 diabetic and 50 non-diabetic, have been used, and the written consent of each individual was obtained for this research. Cases of type I, type II and
gestational diabetes are considered for the study. The subjects vary from one year of
diabetes to 23 years of diabetes (Fig. 12).
The above information was collected by distributing the questionnaire among diabetic and healthy persons in several areas of Sri Lanka.
4.2 Implementation
This work has been implemented using OpenCV, Keras and TensorFlow. For data acquisition, a Canon DSLR camera with a 50 mm macrolens was used. The noise in the collected iris images was removed using a Wiener filter. After that, the Hough transform was applied for localization and segmentation to isolate the iris from the pupil and the sclera. Normalization was then done to transform the iris to a fixed size, and ROI extraction was performed to identify the organs required for detecting diabetes. For all these phases, OpenCV has been used. The classification has been done with a convolutional neural network using TensorFlow, and the user interface was created using Visual Studio 2015.
Table 1 Comparison of accuracy of the madhumeha diabetes prediction model with existing techniques

No.  Disease                                      Classifier                                   Number of samples  Accuracy (%)
1    Detecting broken tissues in pancreas [13]    Visual inspection                            34                 94.0
2    Nerve system                                 SVM                                          44                 86.4
3    Pancreas disorder (Lesmana et al. 2011)      Neighborhood-based modified back             50                 83.3
                                                  propagation using adaptive learning
                                                  parameters
4    Madhumeha                                    CNN                                          100                85
The proposed model does not work properly when the patient is wearing contact lenses, as this causes confusion when training the data. Another important observation of this study is that a person whose diabetes is controlled with medicine, proper diet and exercise has also been identified as healthy. Data training and testing were done using the UBIRIS free dataset and the local dataset, and accuracy was measured using k-folds of the dataset (Table 2).
4.4 Result
4.5 Limitations
Obtaining noise-free images was not a trivial task under the various lighting conditions with the Canon DSLR camera, and many images were captured with reflections, as shown in Fig. 14. Implementing an algorithm for iris localization and normalization was hard due to the time consumed in writing the code. In circle detection, the circle was sometimes detected including the eyelashes, and a very small amount of the sclera zone was captured, as shown in the picture below. Inaccurate results of limbus detection because of the low contrast
of the limbus and the presence of eyelids and eyelashes are illustrated in Fig. 14 (Fig. 15).
Therefore, capturing of noise-free iris images is the crucial task in any kind of
analysis or predictions.
References
9. Ragavendrasamy B, Mithun BS, Sneha R, Vinay Raj K, Hiremath B (2017) Iris diagnosis—a quantitative non-invasive tool for diabetes detection. Retrieved from https://www.researchgate.net/publication/326201202_Iris_Diagnosis-A_Quantitative_Non-Invasive_Tool_for_Diabetes_Detection
10. UBIRIS (2018). Retrieved from https://iris.di.ubi.pt/
11. Wildes R (1997) Iris recognition: an emerging biometric technology. Proc IEEE 85(9):1348–
1363. https://doi.org/10.1109/5.628669
12. Hough Circle Transform (n.d.) Retrieved from https://opencv-python-tutroals.readthedocs.io/
en/latest/py_tutorials/py_imgproc/py_houghcircles/py_houghcircles.html
13. Wibawa AD, Purnomo MH (2006) Early detection on the condition of pancreas organ as the
cause of diabetes mellitus by real time iris image processing. In: APCCAS 2006 - 2006 IEEE
Asia Pacific Conference on Circuits and Systems. https://doi.org/10.1109/apccas.2006.342258
A Novel Palmprint Cancelable Scheme
Based on Orthogonal IOM
Abstract To extract more palmprint features and achieve better recognition results,
an Orthogonal Index of Maximum and Minimum (OIOMM) revocable palmprint
recognition method is proposed in this paper. Firstly, the competitive code features
of region of interest (ROI) are extracted. Then, the statistical histogram of palmprint
competition code features is obtained by partitioning the features. The Gaussian
random projection (GRP)-based IOM mapping is used to generate a GRP matrix.
Orthogonal GRP matrix is obtained by Schmidt orthogonalization. OIOMM hash
converts real-valued biological eigenvectors into discrete index hash codes. Finally,
the palmprint image is matched with Jaccard distance. The experiment is carried out
in the palmprint database of Hong Kong Polytechnic University. When the random
projection size is 200 and the revocable palmprint feature length is 500, the equal
error rate is 0.90. This shows that the algorithm not only improves security but also
maintains the classification effect.
1 Introduction
In the information society, people have more and more demand for identity authenti-
cation [1]. Palmprint recognition has become one of the most promising methods in
the field of biometric recognition because of its stability, reliability and easy acqui-
sition [2]. Palmprint recognition plays an important role in public security, access
control, forensic identification, banking and finance [3]. With the increasing use of
biometric recognition, the necessity toward securing the biometric data is also arising.
However, unlike revocable and redistributable credit cards or passwords, each unique
biometric template cannot be revoked. Once biometric data is stolen, it cannot be
same steps. Then it is matched with revocable palmprint template by Jaccard distance
[18]. If the Gaussian projection matrices are correlated, the extracted palmprint features are also correlated, leading to a lot of redundant information and an inadequate expression of the palmprint. Therefore, Schmidt orthogonalization of the Gaussian random projection matrix is used. Furthermore, the location information of the maximum and minimum values of the randomly and orthogonally projected palmprint features is exploited, which not only extracts more sufficient palmprint information but also improves the security of palmprint recognition. The transformation of the palmprint recognition template is irreversible, ensuring the security and privacy of the template. If the revocable palmprint template is cancelled, a new template can be regenerated from the same palmprint database.
The article is organized as follows: the competitive code is discussed in Sect. 2, the palmprint competition code feature based on OIOMM in Sect. 3, the experimental analysis and discussion in Sect. 4 and the security analysis in Sect. 5, and Sect. 6 provides the conclusion.
The original palmprint image is preprocessed and the features are extracted. The extracted ROI is binarized to obtain a binary image of the palmprint. The visual cortex cells are simulated through the Gabor transform, and illumination changes and image contrast are used to improve feature recognition in palmprint analysis. The filtered image is obtained from the Gabor filter bank and is given as
$$G_{R}(u, v) = \left(4u^{2} - 2\right)\exp\left(-\left(u^{2} + v^{2}\right)\right) \quad (1)$$
where u is the abscissa and v is the ordinate; the position coordinate (u, v) is obtained as
$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \frac{1}{\alpha} & 0 \\ 0 & \frac{1}{\beta} \end{bmatrix} \times \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \times \begin{bmatrix} x - x_{0} \\ y - y_{0} \end{bmatrix} \quad (2)$$
where (x 0 , y0 ) is the center of the filter, θ is the rotation angle that can locally orientate
the filter along the palm line and α and β are to adapt the line orientation.
After that, the filtered image is normalized and the palmprint competitive code
features are obtained. The formula is as follows:
$$\text{CompCode}(x, y) = \arg\min_{j}\left\{ I(x, y) * G_{R}\left(x, y; \theta_{j}\right)\right\} \quad (3)$$
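A hedged sketch of extracting a competitive-code map with an OpenCV Gabor filter bank is given below; cv2.getGaborKernel is used here as a stand-in for the filter in (1), and the six orientations, kernel size and filter parameters are illustrative assumptions.

```python
# Sketch: competitive code as the argmin of filter responses over six orientations,
# in the spirit of (3). cv2.getGaborKernel stands in for the filter in (1);
# the kernel size, sigma, lambda and gamma values are illustrative assumptions.
import cv2
import numpy as np

roi = cv2.imread("palm_roi.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
orientations = [j * np.pi / 6 for j in range(6)]
responses = []
for theta in orientations:
    kernel = cv2.getGaborKernel((17, 17), 4.0, theta, 10.0, 0.5)   # ksize, sigma, theta, lambd, gamma
    responses.append(cv2.filter2D(roi, cv2.CV_32F, kernel))

# Competitive code: index of the minimum (most negative) response at each pixel.
comp_code = np.argmin(np.stack(responses, axis=0), axis=0)
```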
The obtained palmprint competitive code features are mapped, and each palmprint competitive code feature map is divided into p × p sub-blocks. The statistical histogram h_i of palmprint competition codes is calculated for each block. Because there are eight possible competing codes in total, the dimension of h_i is 8. The histograms h_i of all blocks are concatenated to form a large histogram h, which is used as the eigenvector. The dictionary A is defined as the collection of all eigenvectors over the whole gallery set. The formula is as follows:
$$A = \left[v_{1,1}, v_{1,2}, \ldots, v_{k,n_{k}}\right] \in \mathbb{R}^{m \times n} \quad (4)$$
where k is the number of classes in the library set; nk is the number of samples of
class k; m is the dimension of features; and n is the number of samples of each class.
Figure 1 is a process diagram for extracting feature vectors of palmprint
competition codes.
Compcode mapping is highly discriminative, but its major issue is sensitivity: even a small registration error will affect the correspondence of features between the probe and training images. However, global statistics such as histograms are robust against this sensitivity, so combining the two models provides more advantages in the feature extraction process. This paper therefore uses the block statistics of the compcode as the feature. However, this feature alone does not have high security, so GRP-based OIOMM mapping of the palmprint feature vectors is used to obtain revocable palmprint feature templates and enhance security.
$$\beta_{1} = \alpha_{1}, \qquad \beta_{2} = \alpha_{2} - \frac{\langle \alpha_{2}, \beta_{1}\rangle}{\langle \beta_{1}, \beta_{1}\rangle}\beta_{1}, \qquad \beta_{3} = \alpha_{3} - \frac{\langle \alpha_{3}, \beta_{1}\rangle}{\langle \beta_{1}, \beta_{1}\rangle}\beta_{1} - \frac{\langle \alpha_{3}, \beta_{2}\rangle}{\langle \beta_{2}, \beta_{2}\rangle}\beta_{2} \quad (5)$$
After the above transformation, the orthogonalized vector group β1, β2, β3 is obtained; α1, α2, α3 and β1, β2, β3 are equivalent. By applying Schmidt orthogonalization to the whole random projection matrix, the orthogonal matrix can be obtained.
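A minimal NumPy sketch of Schmidt (Gram–Schmidt) orthogonalization of the rows of a Gaussian random projection matrix, following (5), is given below; the matrix shape is an illustrative assumption.

```python
# Sketch: Gram-Schmidt orthogonalization of the rows of a Gaussian random
# projection matrix, following (5). The matrix shape (200 x 500) is an assumption.
import numpy as np

def gram_schmidt(rows: np.ndarray) -> np.ndarray:
    ortho = []
    for alpha in rows:
        beta = alpha.copy()
        for b in ortho:
            beta -= (alpha @ b) / (b @ b) * b    # remove the component along b
        ortho.append(beta)
    return np.array(ortho)

rng = np.random.default_rng(0)
grp = rng.standard_normal((200, 500))            # Gaussian random projection matrix
ogrp = gram_schmidt(grp)                         # rows are now mutually orthogonal
```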
After the revocable palmprint template is obtained, the palmprint image is recognized. The palmprint image to be tested is input, and the Jaccard distance is used to match and classify it against the obtained revocable palmprint template. The Jaccard measure uses the proportion of differing elements in two palmprint samples to quantify the similarity between them, and the classification result is finally obtained.
Jaccard distance is used to describe the similarity between palmprint sets. The
larger the Jaccard distance, the lower the sample similarity. Given two palmprint sets
A and B, A represents the palmprint set to be tested and B represents the revocable
palmprint template set. The ratio of the size of the intersection of A and B to the
size of the combination of A and B is defined as the Jaccard coefficient and it is
represented as
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|} \quad (6)$$
where J(A, B) is defined as 1 if the palm print sets are empty. The Jaccard distance
is obtained based on the Jaccard coefficient and relative index and it is given as
$$d_{j}(A, B) = 1 - J(A, B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|} = \frac{|A \,\triangle\, B|}{|A \cup B|} \quad (7)$$
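A small sketch of Jaccard-distance matching between two hash-code sets, following (6) and (7), is given below; the example sets are placeholders.

```python
# Sketch: Jaccard coefficient and Jaccard distance between two palmprint
# hash-code sets, following (6) and (7). The example sets are placeholders.
def jaccard_distance(a: set, b: set) -> float:
    if not a and not b:
        return 0.0                    # J(A, B) is defined as 1 when both sets are empty
    inter = len(a & b)
    union = len(a | b)
    return 1.0 - inter / union

probe    = {3, 7, 12, 25, 40}
template = {3, 7, 13, 25, 41}
print(jaccard_distance(probe, template))   # smaller distance => higher similarity
```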
The OIOMM hashing is a means of cancelable biometrics [19]. The pure discrete
exponential representation of OIOMM hash codes has the following advantages:
OIOMM hash makes biometric information more invisible, while biometric infor-
mation is often represented as position information. OIOMM hashing is essentially
a sort-based hashing method, so it is independent of feature size. This makes hash
codes robust to noise and changes that do not affect implicit sorting. The size inde-
pendence of the OIOMM hash makes the generated hash code scale invariant. In the proposed scheme, the GRP-based IOM is extended by orthogonalizing the Gaussian projection matrix to obtain an orthogonal Gaussian projection matrix. Moreover, both the maximum and the minimum location information of the palmprint features are extracted during the process. By improving the original IOM in these two ways simultaneously, more palmprint features can be extracted and the palmprint recognition rate can be improved.
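A hedged sketch of GRP-based index-of-maximum/minimum hashing with orthogonalized projections is shown below; the band size, number of bands and feature length are assumptions, and np.linalg.qr is used in place of explicit Gram-Schmidt.

```python
# Sketch: index-of-maximum / index-of-minimum hashing of a palmprint feature
# vector with orthogonalized Gaussian random projections (OIOMM-style).
# Band size q, number of bands and feature length are illustrative assumptions;
# np.linalg.qr replaces explicit Gram-Schmidt for the orthogonalization step.
import numpy as np

def oiomm_hash(feature, n_bands=250, q=4, seed=0):
    rng = np.random.default_rng(seed)
    codes = []
    for _ in range(n_bands):
        grp = rng.standard_normal((feature.size, q))
        ortho, _ = np.linalg.qr(grp)            # orthonormal projection directions
        band = ortho.T @ feature                # q orthogonal projections of the feature
        codes.append(int(np.argmax(band)))      # index of maximum
        codes.append(int(np.argmin(band)))      # index of minimum
    return set(enumerate(codes))                # keep positions so Jaccard matching is meaningful

feature = np.random.default_rng(1).standard_normal(512)   # placeholder palmprint histogram
template = oiomm_hash(feature, seed=0)                     # the seed acts as the revocable token
```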
Figure 2 depicts the process flow of the proposed algorithm. Firstly, the feature
of the palmprint competitive code is obtained by filtering with a Gabor filter.
The competitive code features of palmprint are divided into blocks and the palm-
print feature histograms are extracted, respectively. The generated Gaussian random
4 Experimental Results
In the OIOMM method proposed in this paper, the Gaussian random projection matrix is Schmidt orthogonalized. Because the resulting orthogonal Gaussian random projection matrix is highly uncorrelated and both the maximum and minimum position information of the palmprint features is extracted, more palmprint features can be extracted and redundant information can be avoided. Therefore, this method improves security while maintaining the classification performance.
The database used for the experimentation is obtained from Hong Kong Polytechnic University [20] and includes 600 grayscale images from 100 unique palms. An average interval of two months was maintained between the two sample acquisitions from each person. About 10 palmprint images of 384 × 284 pixels at 5 DPI are collected at
each session. For experimentation purposes, each image is cropped to obtain the desired ROI with a size of 128 × 128. Competitive code features are obtained by filtering the ROI using the filter bank. As shown in Fig. 3, (a) depicts the ROI of the original palmprint image and (b) the feature image of the palmprint competitive code.
Table 1 provides a comparison of equal error rate (EER) for IOM, OIOM and
OIOMM models. The first line in the table represents palmprint features of different
lengths and the second line represents different sizes of random projections.
Based on the first set of data in Table 1, the ROC curves of IOM, OIOM and OIOMM are shown in Fig. 4, and the intra-class and inter-class matching score distributions of OIOMM are shown in Fig. 5.
It can be seen from Table 1 that when the random projection size is 200 and
the revocable palmprint feature length is 500, the EER of IOM is 1.54, the EER of
OIOM is 1.37 and the EER of OIOMM is 0.90. According to the data in Table 1, the recognition rate of the proposed OIOMM scheme is higher than that of OIOM and IOM. Compared with OIOM, it extracts not only the maximum but also the minimum position information of the palmprint features; compared with IOM, Schmidt orthogonalization of the GRP matrix extracts more uncorrelated palmprint information and allows the palmprint features to be better expressed.
Based on the first set of data in Table 1, the ROC curves of IOM, OIOM and OIOMM are shown in Fig. 4. It can be seen from the figure that when the FAR is 1 × 10−1, the genuine acceptance rate (GAR) of OIOMM is 97.11%, that of OIOM is 96.02% and that of IOM is 94.16%. When the FAR is 1 × 10, the GAR of OIOMM is 99.26%, that of OIOM is 98.81% and that of IOM is 97.75%. This shows that extracting both the maximum and minimum position information of palmprint features yields better recognition results.
Fig. 4 ROC curves (genuine accept rate in % versus false accept rate in %) of IOM, OIOM and OIOMM
Fig. 5 Intra-class (genuine) and inter-class (imposter) matching score distributions of OIOMM
5 Security Analysis
The revocable palmprint recognition system achieves the standard of template protec-
tion scheme while maintaining the recognition accuracy. In this part, the irre-
versibility, non-linkability, revocability and diversity of the algorithm are analyzed.
5.1 Irreversibility
5.2 Non-linkability
It is observed from the plot that the proposed OIOMM satisfies the non-linkability condition.
5.3 Revocability
This section validates the requirement of revocability. Figure 7 depicts the genuine, pseudo-imposter and imposter distributions, and it is observed that there is a great degree of overlap between the imposter and pseudo-imposter distributions. This means that the hash codes newly generated with a given random projection matrix are different even when they are generated from the same palmprint vector source, which shows that the revocability requirement is satisfied by the proposed approach. If tokens (such as random matrices) are stolen, the recognition performance will not be significantly reduced. Therefore, the tokens in the OIOMM hash are only used for revocability and need not be kept secret from the public.
5.4 Diversity
Fig. 7 Genuine, imposter and pseudo-imposter distributions for revocability analysis of OIOMM
These palmprint templates can still be significantly different from the original palmprint templates. This means that individuals can register different templates for the same subject in different applications without cross-matching. Therefore, the experiments verify the diversity of the revocable palmprint template.
6 Conclusion
References
Abstract Radial basis function neural networks (RBF-NNs) are simple in structure and popular among other NNs. RBF-NNs are capable of fast learning, proving their applicability in developing deep learning applications. In its basic form, with fixed center states (means) and standard deviations and only weight adaptation, the network has limited variability and becomes complex to tune when embedded in a model. Dynamic systems are nonlinear, their behavior is often uncertain and unpredictable, and complete mathematical modeling or purely model-based control has limited applicability for stable and accurate control. The shape-adaptive RBF-NN presented in this paper is theoretically proved for stability control using Lyapunov analysis. Autonomous surface vessel control is selected for the numerical simulation, which consists of a mathematical model developed using marine hydrodynamics for a prototype vessel and a classical proportional-derivative (PD) controller. The results indicate that shape-adaptive RBF-NN blended control is more accurate and has fast learning ability for intelligent transportation vessel development.
1 Introduction
Radial basis function [1] neural networks (RBF-NNs) are a special type of feedfor-
ward neural network [2]. As in Fig. 1, three layers of the RBF-NN are the input
layer, the hidden layer, and the output layer. The hidden layer activation function is a
radial basis function, and ψ(i) represents ith hidden neuron activation as defined by
(1). RBF-NNs having universal approximation properties [3] and simplicity of the
Fig. 1 Topology of the RBF-NN (multi-inputs single-output) showing three layers: input, hidden,
and output
structure compared to other types of NNs make it compact and fast in learning and training [4]; therefore, programming an embedded computer becomes easier using rule extraction methods [5]. Common applications reported in the literature include classification [6], pattern recognition [7], regression and time series analysis [8] in general. Many related works are also found in the biomedical [9], chemical [10], material and mechanical engineering [11], robotics [12] and financial [13] sectors. In controller design for nonlinear dynamics, RBF-NNs have also found wide use [14], and such controllers have shown fast convergence compared to classical control, especially for model-based control [10]. When developing a controller for autonomous systems in a dynamic environment that always contains perturbations and unforeseen, uncertain variations, classical control is not robust enough to handle varying disturbances [15]. Even NNs with fixed shapes or with pre-tuned weights and algorithms show deviations in real-time control. Furthermore, the various activation-function optimization algorithms found in the literature rely on offline training with big data sets [16], need high computing power, and their practical implementation demands costly hardware.
$$\psi_{i}(z) = \exp\left[-\sum_{j=1}^{m}\frac{\left(z_{j} - c_{ij}\right)^{2}}{2\sigma_{i}^{2}}\right], \quad i = 1, 2, \ldots, N \quad (1)$$
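A minimal NumPy sketch of the hidden-layer activation in (1), together with the network output, is given below; the network sizes and parameter values are illustrative assumptions.

```python
# Sketch: RBF hidden-layer activations as in (1) and the output y = sum_i w_i * psi_i(z).
# Sizes (m inputs, N hidden neurons) and parameter values are illustrative assumptions.
import numpy as np

def rbf_activations(z, centers, sigmas):
    # z: (m,), centers: (N, m), sigmas: (N,)
    sq_dist = np.sum((z - centers) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigmas ** 2))

rng = np.random.default_rng(0)
m, N = 3, 7
centers = rng.uniform(-1, 1, size=(N, m))
sigmas = np.full(N, 0.5)
weights = rng.standard_normal(N)

z = np.array([0.2, -0.4, 0.1])
psi = rbf_activations(z, centers, sigmas)
y = weights @ psi          # single network output as in (2)-(3)
```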
In this section, the basic RBF-NN controller is first presented. The RBF function for the shape-adaptive RBF-NN (SA-RBF-NN) is then introduced, the derivation of the controller signal is described in the succeeding subsection, and the weight and SA-RBF-NN shape-tuning laws are given in the third subsection.
$$y = \sum_{i=1}^{N} y_{i} \quad (2)$$

$$y_{i} = w_{i}\,\psi_{i}(z) \quad (3)$$
where ξ (|ξ| < ξN) represents the neglected higher-order terms. From now onward, subscripts are omitted for brevity. With the NN signal y counteracting the modeled and unmodeled uncertainties, and taking the NN approximation error ε (|ε| < εN), the final control signal error can be written as
∼
y= y − y + ε (5)
∼
∼ T
y = w ψi (z) + ξ + ε (6)
By substituting (7) into (3) and taking the Taylor series expansion about the estimated parameters (each partial derivative (∂ψ/∂k)^T evaluated at k = k̂, for k = a, c, σ), the output error y_i − ŷ_i is expressed in terms of (a − â), (c − ĉ), and (σ − σ̂), which yields

y_i − ŷ_i = 3[ w̃^T ψ̂ + ŵ^T( ψ_a ã + ψ_c c̃ + ψ_σ σ̃ ) ] + ξ   (9)
where w̃ = w − ŵ, ã = a − â, c̃ = c − ĉ, and σ̃ = σ − σ̂ are estimation errors, with w (‖w‖ ≤ w_max), a (‖a‖ ≤ a_max), c (‖c‖ ≤ c_max), and σ (‖σ‖ ≤ σ_max) being the ideal values. The notation ‖·‖ denotes the Euclidean norm of a matrix. Then, by combining (5) and (9), the control signal estimation error can be written as

ỹ = 3[ w̃^T ψ̂ + ŵ^T( ψ_a ã + ψ_c c̃ + ψ_σ σ̃ ) ] + ξ + ε   (10)

where ψ_k ∈ R^{N×N}, in which N is the number of hidden neurons.
ψ_k = ⎡ dψ_1/dk_1  dψ_1/dk_2  ···  dψ_1/dk_N ⎤
      ⎢ dψ_2/dk_1  dψ_2/dk_2  ···  dψ_2/dk_N ⎥
      ⎢     ⋮          ⋮        ⋱       ⋮     ⎥
      ⎣ dψ_N/dk_1  dψ_N/dk_2  ···  dψ_N/dk_N ⎦ ;   k = a, c, σ
The entries of ψa , ψc and ψσ can be calculated as follows,
ψ_a :  dψ_i/da_j = exp[ Σ_{j=1}^{m} a_{ij} ]  for i = j,  and  0  for i ≠ j   (11)

ψ_c :  dψ_i/dc_j = exp[ −Σ_{j=1}^{m} (z_j − c_{ij})² / (2σ_i²) ]  for i = j,  and  0  for i ≠ j   (12)

and

ψ_σ :  dψ_i/dσ_j = exp[ −Σ_{j=1}^{m} (z_j − c_{ij})² / (2σ_i²) ]  for i = j,  and  0  for i ≠ j   (13)
The weights of the NN and the shapes of the RBFs are updated using the tuning laws proposed by the set of relationships in (14) to achieve guaranteed tracking performance of the autonomous controller. In Sect. 3, the convergence of the system with the SA-RBF-NN blended PD controller is proved using Lyapunov analysis for these tuning laws; in other words, the format of the tuning laws is selected such that the proposed SA-RBF-NN achieves accurate path tracking with these parameter-updating relationships. A similar kind of approach is described in [21].
ẇ = F[ (3I) r ψ̂^T − κ‖E‖ ŵ ]
ȧ = G[ r^T ŵ ψ_a − κ‖E‖ â ]
ċ = H[ r^T ŵ ψ_c − κ‖E‖ ĉ ]
σ̇ = J[ r^T ŵ ψ_σ − κ‖E‖ σ̂ ]   (14)
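To illustrate how the continuous-time laws in (14) could be realized on an embedded computer, the sketch below applies one forward-Euler step for a simplified single-output subnet with a scalar filtered error r; the step size dt, the scalar gains F, G, H, J, and the array shapes are assumptions made for the example, not values from the paper.

import numpy as np

def sa_rbf_tuning_step(w, a, c, sigma, psi, psi_a, psi_c, psi_s,
                       r, E_norm, F=1.0, G=1.0, H=1.0, J=1.0,
                       kappa=0.01, dt=0.01):
    # One Euler step of (14) for a single-output subnet.
    # w, a, c, sigma: current estimates, shape (N,); psi: activations, shape (N,)
    # psi_a, psi_c, psi_s: derivative matrices of psi w.r.t. a, c, sigma, shape (N, N)
    # r: scalar filtered tracking error; E_norm: Euclidean norm of the error vector E
    w_new = w + dt * F * (3.0 * r * psi - kappa * E_norm * w)
    a_new = a + dt * G * (r * (w @ psi_a) - kappa * E_norm * a)
    c_new = c + dt * H * (r * (w @ psi_c) - kappa * E_norm * c)
    sigma_new = sigma + dt * J * (r * (w @ psi_s) - kappa * E_norm * sigma)
    return w_new, a_new, c_new, sigma_new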
where e is the tracking error of the path given by the surge (x), the sway (y), and the orientation (φ), and ė is the corresponding vector of velocity errors. Further, r is defined as the sum of the error matrices, so that the state-space model of the error dynamics is as follows.

Ė = AE + Bỹ   (16)

r = E + Ė   (17)

A and B are constant matrices related to the PD controller gains, and the details are discussed in Sect. 3. Finally, the position and velocity errors are bounded, and the convergence of the SA-RBF-NN parameters is guaranteed within practical bounds, as proved in the next section.
Lyapunov theory has been introduced in many controller designs in the literature [19, 22], as it can ensure the global asymptotic stability of nonlinear control systems. In order to prove the stability of the controller developed above, the Lyapunov function candidate is defined as

L(E, w̃, ã, c̃, σ̃) = E^T P E + tr(w̃^T F⁻¹ w̃) + tr(ã G⁻¹ ã^T) + tr(c̃ H⁻¹ c̃^T) + tr(σ̃ J⁻¹ σ̃^T)   (18)
Taking the time derivative of (18) along the error dynamics (16) and (17) yields

L̇ = −E^T Q E + 2(C E)^T ỹ + 2tr(w̃̇^T F⁻¹ w̃) + tr(ã̇ G⁻¹ ã^T) + tr(c̃̇ H⁻¹ c̃^T) + tr(σ̃̇ J⁻¹ σ̃^T)

where P satisfies

A^T P + P A + Q = 0,   P B = C^T
Substituting for ỹ from (10) gives,

L̇ = −E^T Q E + 2(C E)^T [ 3I( w̃^T ψ̂ + ŵ^T( ψ_a ã + ψ_c c̃ + ψ_σ σ̃ ) ) + ξ + ε ]
    + 2tr(w̃̇^T F⁻¹ w̃) + tr(ã̇ G⁻¹ ã^T) + tr(c̃̇ H⁻¹ c̃^T) + tr(σ̃̇ J⁻¹ σ̃^T)

L̇ = −E^T Q E + 2r^T (3I)[ w̃^T ψ̂ + ŵ^T( ψ_a ã + ψ_c c̃ + ψ_σ σ̃ ) ] + 2(C E)^T (ξ + ε)
    + 2tr(w̃̇^T F⁻¹ w̃) + tr(ã̇ G⁻¹ ã^T) + tr(c̃̇ H⁻¹ c̃^T) + tr(σ̃̇ J⁻¹ σ̃^T)
Using the property B^T A = tr(AB^T) = tr(B^T A) for any A, B ∈ R^{N×1}, it can be written that

r^T (3I) w̃ ψ̂ = tr( ψ̂ r^T (3I) w̃ )

r^T ŵ ψ_a ã = tr( r^T ŵ ψ_a ã )

r^T ŵ ψ_c c̃ = tr( r^T ŵ ψ_c c̃ )

r^T ŵ ψ_σ σ̃ = tr( r^T ŵ ψ_σ σ̃ )
Then,

L̇ = −E^T Q E + 2(C E)^T (ξ + ε)
    + 2tr( w̃̇^T F⁻¹ w̃ ) + 2tr( ψ̂ r^T (3I) w̃ ) + tr( ã̇ G⁻¹ ã^T ) + tr( c̃̇ H⁻¹ c̃^T ) + tr( σ̃̇ J⁻¹ σ̃^T )
    + 2tr( r^T ŵ ψ_a ã ) + 2tr( r^T ŵ ψ_c c̃ ) + 2tr( r^T ŵ ψ_σ σ̃ )
Since w̃̇ = −ẇ, ã̇ = −ȧ, c̃̇ = −ċ, and σ̃̇ = −σ̇, applying the tuning laws in (14), the above L̇ reduces to

L̇ = −E^T Q E + 2(C E)^T (ξ + ε) + 2κ‖E‖ tr( ŵ^T (w − ŵ) ) + 2κ‖E‖ tr( â^T (a − â) )
    + 2κ‖E‖ tr( ĉ^T (c − ĉ) ) + 2κ‖E‖ tr( σ̂^T (σ − σ̂) )

which can be bounded as

L̇ ≤ −‖E‖[ √Q_min ‖E‖ + 2κ‖ŵ‖( ‖ŵ‖ − w_max ) + 2κ‖â‖( ‖â‖ − a_max )
        + 2κ‖ĉ‖( ‖ĉ‖ − c_max ) + 2κ‖σ̂‖( ‖σ̂‖ − σ_max ) − 2(ξ_N + ε_N) ]
The CAD model (Fig. 2) of the surface vessel was designed in Catia V6 by following the standards of naval architecture described especially for SV designs in [24–26] for autonomous applications. The SV design consists of two similar hulls (in both shape and mass) as the main floating bodies of the SV. Two electric propellers are connected to the back ends of the hulls as shown. To maintain vertical stability, a submerged aerofoil-shaped Gertler body is attached to a thin vertical strut that is fixed symmetrically to the SV structure. The physical dimensions of the SV are given in Table 1. The mathematical model of the SV was developed from first principles using standard notations rather than physical parameters, which are later substituted at the numerical simulation stage.
The frame definitions are as follows.
EF: origin coincides with the center of gravity (CG) of the vessel at the initial
position. XE axis is directed toward the North. YE axis is directed toward the East
of the Earth. ZE axis points downward.
BF: origin is fixed at the CG of the vessel. XB is directed toward the surge, and YB is directed toward the sway.
Vertical (heave), pitch, and roll motions of the SV are neglected, as the loss in accuracy under typical and moderately severe maneuvers is very small. Therefore, the SV mathematical representation is limited to a three-degrees-of-freedom (3DoF) system. The configuration vector of the SV in the BF with respect to (w.r.t.) the EF can be written as

η(t) = [x  y  φ]^T ;  t ≥ 0   (19)

where x = x(t) and y = y(t) represent the linear displacements in the surge (X_E) and sway (Y_E) directions, and φ = φ(t) is the yaw about Z_E. By defining the SV velocities u = u(t), v = v(t), and ω = ω(t) in the directions of X_B and Y_B and the rotation about Z_B, respectively, the velocity vector of the SV is given by (20).

V(t) = [u  v  ω]^T ;  t ≥ 0   (20)
In (21), the angle of trim and the angle of roll are considered negligible, having a minimal effect on the dynamics of the SV. The relationship between (19) and (20) can be further derived using (21) to describe the kinematics of the SV.
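In the standard three-degrees-of-freedom notation, this kinematic relation is commonly expressed through a yaw-only rotation matrix, as in the sketch below (a reconstruction under the standard convention, not necessarily the exact form of (21) and (22)):

η̇(t) = J(η)V(t),   J(η) = ⎡ cos φ  −sin φ  0 ⎤
                            ⎢ sin φ   cos φ  0 ⎥
                            ⎣   0       0    1 ⎦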
By applying the Newton–Euler equations to the motion of the SV (Fig. 2), the
general equations of motion along the directions of BF that describe the dynamics
of the SV are obtained as,
X = m(u̇ − vω − y_G ω̇ − x_G ω²)
Fig. 2 3D design of the prototype SV, its main components, and the coordinate frames defined for the analysis of kinematics and dynamics (<1>: two hulls, <2>: vertical strut, <3>: Gertler body, <4>: two propellers, CG: center of gravity, EF: Earth-fixed frame, BF: body-fixed frame)
Y = m(v̇ − uω + x_G ω̇ − y_G ω²)

N = I_Z ω̇ + m[x_G(v̇ + uω) − y_G(u̇ − vω)]   (23)

where X, Y, and N are the external forces and moment acting on the vehicle, and x_G and y_G are the distances to the CG of the SV from the origin of the BF. Here, by placing the origin of the BF at the CG, x_G → 0 and y_G → 0. Hence, the above set of relations (23) is further simplified and concisely expressed in compact form.
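In compact matrix form, these dynamics are commonly written as follows (a sketch assuming the standard marine-craft notation, with the matrices defined below):

M V̇(t) + C[V(t)]V(t) + D[V(t)]V(t) + g₀[η(t)] = T_R(t)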
where M is the positive definite mass-inertia matrix, C[V(t)] ∈ R^{3×3} is the total matrix of Coriolis and centripetal terms, D[V(t)] ∈ R^{3×3} is the damping force matrix, g₀[η(t)] ∈ R³ represents the vector of gravitational forces and moments, and finally, T_R(t) ∈ R³ is the input vector that represents the external forces and moments on the SV. The detailed version of the mathematical model in (24) is developed by considering the SV parameters and the marine hydrodynamics theories in [24, 26, 27]. Time differentiating (22) and substituting into (24) yields (25), where f[η, V] = −J(η)[C(V) + D(V)]V(t) − J̇(η)V(t) and τ[η, U] = J(η(t))·g(U).
The controlled terms g(U ) given by (26) are determined by the control method
described in Sects. 2 and 4 under numerical simulations. Furthermore, propeller
thrust (T ) and angle (δ) provide the actual output to move the SV. Once the entries
of g(U ) are known, (26) will then be solved for the control vector, U.
g(U) = [T cos(δ)  T sin(δ)  T sin(δ)]^T   (26)
One may refer to [18] for a detailed kinematics and dynamics analysis of the SV with all the physical properties and the marine hydrodynamic modeling.
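As a small illustration, once the first two entries of g(U) in (26) are computed by the controller, the commanded thrust T and propeller angle δ can be recovered as sketched below (illustrative Python, not the authors' code).

import math

def thrust_and_angle(g1, g2):
    # Solve T*cos(delta) = g1 and T*sin(delta) = g2 for T and delta.
    T = math.hypot(g1, g2)       # thrust magnitude
    delta = math.atan2(g2, g1)   # propeller angle in radians
    return T, delta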
Referring back to the error dynamics described by the state-space model in (16) and (17), matrix A is selected so that the first part of (16) contributes to the PD control, while the nonlinear dynamics are handled by the second part with the SA-RBF-NN control signal. Taking K_p and K_d as the proportional and derivative gains, respectively, the PD controller gain matrix is given as follows.

A = ⎡   0      1  ⎤
    ⎣ −K_p   −K_d ⎦
Further, defining B = [0 1]^T, C = [1 1]^T, and E = [e ė]^T completes the definition of the state-space model of the SV control system.
It can also be proved that the transfer function T(s) of the state-space model of the SV is stable for K_p ≥ 0 and K_d > 1 by converting the transfer function into controller canonical form [20] and using the pole-placement approach. This proof can be found elsewhere [17] for the weight tuning law, and one can follow the same procedure to obtain proofs for all the other tuning laws.
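The stated gain condition can also be checked numerically by building A for candidate gains and inspecting its eigenvalues, as in the sketch below (an illustration of the 2 × 2 error-dynamics block above, not the authors' implementation).

import numpy as np

def error_dynamics_stable(Kp, Kd):
    # A is Hurwitz when all eigenvalues lie in the open left half-plane.
    A = np.array([[0.0, 1.0],
                  [-Kp, -Kd]])
    return bool(np.all(np.linalg.eigvals(A).real < 0.0))

print(error_dynamics_stable(Kp=1.0, Kd=7.0))  # the gains later used in the simulations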
The control method presented requires full state feedback and acceleration
measurements in both surge and sway directions. Yaw rate can be measured using a
gyroscope placed near the CG, and accelerations are measured by accelerometers.
Nowadays, an inertial sensor-based low-cost hardware–software system is available
with all the above measuring capabilities [28]. Further, a Kalman filter-based algorithm is used to estimate the velocities of the SV [29]. With the availability of a global positioning system (GPS) signal, corrected to localize the dead reckoning, absolute position coordinates are obtained. Further, as the SA-RBF-NN controller developed here is compact, an embedded computer is capable of processing the data and calculating the control signal to deliver the required thrust in real time.
The completed mathematical model and the controller are converted to MATLAB [30] code and simulated for the eight-shape trajectory defined by (27) as the desired path (with positions x_d, y_d). The application proposed here is the loading and unloading of cargo from ships temporarily anchored near the harbor.
x_d = 2R sin(αt/2)

y_d = R sin(αt)   (27)
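The desired path in (27) can be sampled, for example, as follows; the values of R, α, and the time grid are illustrative assumptions, not the simulation settings of the paper.

import numpy as np

def eight_shape(R=10.0, alpha=0.05, t_end=250.0, dt=0.1):
    # x_d = 2R*sin(alpha*t/2), y_d = R*sin(alpha*t) traces a figure-of-eight.
    t = np.arange(0.0, t_end, dt)
    xd = 2.0 * R * np.sin(alpha * t / 2.0)
    yd = R * np.sin(alpha * t)
    return t, xd, yd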
Two SA-RBF-NN subnets are derived based on the controller described above to handle the surge and sway dynamics. The design parameters of the controller were tuned, starting from randomly selected values, to achieve the best performance in terms of stability and tracking accuracy. The initial simulation results shown in Figs. 3, 4, and 5 indicate that the SA-RBF-NN has the highest position accuracy compared to the conventional PD controller for the controller gains K_p = 1 and K_d = 7. Further, these results show only the online training, where the dynamics change according to the desired path and desired speeds governed by the desired trajectory and hence the dynamics of the SV.
Fig. 3 Tracking of eight-shape trajectory by PD controller only and with SA-RBF-NN controller
Fig. 4 Surge-direction tracking error of the eight-shape trajectory by the PD controller only and with the SA-RBF-NN controller
Fig. 5 Sway-direction tracking error of the eight-shape trajectory by the PD controller only and with the SA-RBF-NN controller
presented for various applications with weight updates, which is modified here by introducing a shape change that integrates the RBF center states, standard deviations, and altitudes, so that the activation function itself is updated and adapts to the situation. Starting from the initial RBF function in (1), the authors' previous controller design approach is extended by modifying the activation function given by (7). The proposed tuning laws are developed and optimized by considering the overall feedforward transfer function of the state-space model consisting of the error dynamics, such that the control signal is convergent, and the controller design parameters are selected accordingly. A short tracking path on the eight-shape curve is selected, and numerical simulations are carried out for the dynamic model of the prototype SV designed in 3D with all necessary components. The two propeller thrusts and the propeller angle (both propellers together) are controlled by the control signals delivered by two neural subnets developed to handle the longitudinal (surge) and lateral (sway) dynamics. The results of the numerical simulations show that the desired trajectory is tracked more accurately by the newly developed SA-RBF-NN controller combined with the PD controller than by the PD controller alone, both with full state feedback sensors. Therefore, the SA-RBF-NN controller can be proposed to control such nonlinear systems, especially when run on low-cost embedded hardware, as the controller is compact and needs less computing power than most other NN-based controllers today.
In an actual situation, this type of SV comes across many other challenges, such as obstacle avoidance, navigation and mapping, and various application-based problems. Some of them can easily be solved by integrating LiDAR and vision-based sensing systems; however, additional computing power and energy are required. Future
References
1. Buhmann M (2009) Radial basis functions: theory and implementations. Cambridge University
Press: Cambridge, pp 1–4
2. Graupe D (2007) Principles of artificial neural networks. World Scientific, Singapore
3. Park J, Sandberg I (1991) Universal approximation using radial-basis function networks. J.
Neural Comput 3(2):246–257
4. Moody J, Darken C (1989) Fast learning in networks of locally tuned processing units. J. Neural
Computing. 1(2):281–294
5. Wang L, Fu X (2005) A simple rule extraction method using a compact RBF neural network.
In: Advances in neural network. Springer, Heidelberg pp 682–687
6. Baughman D, Liu Y (1995) Classification: fault diagnosis and feature categorization. In: Neural
networks in bioprocessing and chemical engineering. Academic Press Ltd., California. pp
110–171
7. David VK, Rajasekaran S (2009) Pattern recognition using neural networks and functional
networks. Springer, Heidelberg
8. Wu J (2012) Prediction of rainfall time series using modular RBF neural network model coupled
with SSA and PLS. In: Asian conference on intelligent information and database systems.
Kaohsiung, Taiwan (2012)
9. Saastamoninen A, Lehtokangas M, Varri A, Saarinen J (2001) Biomedical applications of radial basis function networks. In: Radial basis function networks, vol 67. Springer, pp 215–268
10. Halali M, Azari M, Arabloo M, Mohammadi A, Bahadori A (2016) Application of a radial
basis function neural network to estimate pressure gradient in water–oil pipelines. J. Taiwan
Inst Chem Eng 58:189–202 (Elsevier2016)
11. Wang P (2017) The application of radial basis function neural network for mechanical fault
diagnosis of gear box. In: IOP conference series: materials science and engineering. Tianjin,
China
12. Liu J (2010) Adaptive RBF neural network control of robot with actuator nonlinearities. J.
Control Theory Appl 8(2):249–256
13. Chaudhuri A (2012) Forecasting financial time series using multiple regression, multi-
layer perception, radial basis function and adaptive neuro fuzzy inference system models:
a comparative analysis. J Comput Inf Sci 5:13–24
14. Sisil K, Tsu-Tian L (2006) Neuroadaptive combined lateral and longitudinal control of highway
vehicles using RBF networks. IEEE Trans Intell Transp Syst 17(4):500–512
15. Marino R (1997) Adaptive control of nonlinear systems: basic results and application. J Annu
Rev Control 21:55–66
16. Howlet R, Jain L (2010) Radial basis function networks 1: recent developments in theory and
applications. Springer, Heidelberg
17. Kumara KJC, Sisil K Intelligent control of vehicles for “ITS for the sea” applications. In: IEEE
third international conference on information and automation for sustainability. IEEE Press,
Melbourne, pp 141–145
18. Kumara KJC (2007) Modelling and controlling of a surface vessel for “ITS for the Sea”
applications. Master thesis. University of Moratuwa
19. Fadali A, Visioli A (2013) Elements of nonlinear digital control systems. In: Digital control
engineering. Academic Press, Amsterdam, pp 439–489
20. Ogata K (2010) Modern control engineering. Prentice Hall, Boston, pp 649–651
21. Giesl P (2007) Construction of global Lyapunov functions using radial basis functions. Springer, Heidelberg, pp 109–110
22. Zhang J, Xu S, Rachid A (2001) Automatic path tracking control of vehicle based on Lyapunov
approach. In: IEEE international conference on intelligent transportation systems. IEEE Press,
Oakland
23. Vidyasagar M (1993) Nonlinear systems analysis. Prentice-Hall, Englewood Cliffs
24. Bishop B (2004) Design and control of platoons of cooperating autonomous surface vessels. In:
7th Annual maritime transportation system research and technology coordination conference
25. Caccia M (2006) Autonomous surface craft: prototypes and basic research issue. In: 14th
Mediterranean conference on control and automation
26. Vanzweieten T (2003) Dynamic simulation and control of an autonomous vessel. Master thesis.
Florida Atlantic University, Florida
27. Newman J (1977) Marine hydrodynamics. MIT Press, London
28. Sukkarieh S (2000) Low cost, high integrity, aided inertial navigation system. Ph.D. thesis.
University of Sydney, Sydney
29. An intro to Kalman filters for autonomous vehicles. https://towardsdatascience.com/an-intro-to-kalman-filters-for-autonomous-vehicles
30. MATLAB. https://www.mathworks.com/products/matlab.html
31. NVIDIA Jetson Nano developer kit. https://developer.nvidia.com/embedded/jetson-nano-developer-kit
32. ROS: robot operating system. www.ros.org
Electricity Load Forecasting Using
Optimized Artificial Neural Network
Abstract Electric load forecasting has become one of the most critical factors for the economic operation of power systems due to the rapid increase in daily energy demand around the world. In Sri Lanka, the usage of electricity is higher than that of the other energy sources, according to the Generation Expansion Plan—2016 of the Ceylon Electricity Board, Sri Lanka. Moreover, forecasting is a hard challenge due to the complex nature of consumption. In this research, long-term electric load forecasting based on optimized artificial neural networks (OANNs) is implemented using particle swarm optimization (PSO), and the results are compared with a regression model. The results are validated using data collected from Central Bank annual reports for thirteen years, from 2004 to 2016. The choice of inputs for the ANN, OANN, and regression models depends on the values obtained through the correlation matrix. The training data sets used in the proposed work are scaled between 0 and 1, which is achieved by dividing the entire data set by its largest value. The experimental results show that the OANN has better forecasting accuracy than the ANN and the regression model. The forecasting accuracy of each model is evaluated using the mean absolute percentage error (MAPE).
1 Introduction
Predicting future electricity consumption is vital for utility planning. The process is
difficult due to the complex load patterns. Electricity forecasting is the basic planning
process which is followed in the electric power systems industry for a particular
area over different time horizons [1, 2]. Accurate electricity forecasting helps to reduce operation and maintenance costs. It also increases the reliability of the power supply and delivery system, which supports sound decisions for future development. At present, utilities have a growing interest in smart grid implementation. Therefore,
electricity forecasting has a greater impact on storage maintenance, demand-side
management, scheduling, and renewable energy integration. Forecasting helps the
user to obtain the relationship between the consumption and its price variations in
detail [3].
Forecasting is broadly classified into long-term forecasting, medium-term fore-
casting, and short-term forecasting. Of these, long-term forecasting is used to predict plant capacity and its planning, medium-term forecasting is used to plan the plant maintenance schedule, and short-term forecasting is used to predict daily operations. The proposed research work focuses on long-term forecasting. Long-
term forecasting is common in the planning and operation of electric utilities [4].
Electricity consumption varies according to the economic and social circumstances
of a society. The major advantage of long-term electricity load forecasting is that it indicates the economic growth of the system. Moreover, it provides relevant data in terms of transmission, distribution, and generation. However, electricity forecasting accuracy depends on the nature of the situation. For example, one can forecast the daily load
demand in a particular region within 1–3% accuracy, whereas accurate prediction
for an annual load is a complex process due to the unavailability of long-term load
factors information [2, 3, 5].
Annual electricity forecasting in Iran is reported in the research work of Ghanbari et al. [6], where artificial neural network, linear, and log-linear regression models are used in the experiments. Real GDP and population are the two economic parameters considered as the experimental lags in that approach. Abd-El-Monem [7] provides an in-depth analysis of forecasting in the Egyptian region, where ANN and other forecasting parameters are used to test the given conditions. The study provides a detailed and accurate analysis of load demand, sales, population, GDP, and average price, including econometric variables.
In addition, meta-heuristic models are used to optimize the forecasting model, which gives reliable and robust results. Their probabilistic and heuristic nature is exploited through global search procedures [8–11].
Since the back propagation (BP) algorithm can stop at local minima, many researchers are interested in training the ANN model using PSO. The ability of PSO to solve complex, nonlinear functions is a major motivation for using it in electricity demand forecasting [12–15]. PSO is a fast-converging, swarm-based process in which the particles are adjusted to obtain the desired performance output.
The neural network model is developed based on the structure of the human brain, where nerves are modeled as neurons that process the given input. Figure 2 depicts an illustrative representation of a neural network and its neurons. Generally, it has three layers: the input layer, the hidden layer, and the output layer. These layers are interconnected through weights. The hidden layer connects the input and output layers, and the weights are adjusted to reduce the error between the layers. Based on the learning rules, the weights are modified in the neural network architecture. Typically, the initial weights are chosen randomly and then adjusted by comparing the output error.
The mathematical representation of the input of the neuron is X (where X = [x₁, x₂, …, xₙ]) and the output is y. W = [w₁, w₂, …, wₙ] represents the synaptic weights and b is the bias. The neuron output is given by (1).

y = Σ_{j=1}^{n} w_j x_j + b   (1)
The estimation of these parameters under a minimum error criterion is called the training of the network. Back propagation [16] is a widely used model in neural networks, and various research works and applications have evolved using this back propagation algorithm [17–19]. It updates the weights and biases until it obtains zero training error or reaches a predetermined number of epochs. The weights are changed repeatedly based on the error function obtained between the actual output and the desired output values. The
weight correction at each iteration k of the algorithm is given as

w_{ik+1} = w_{ik} − α_{ik} g_{ik}   (2)
Fig. 2 Illustrative representation of a neural network and neurons
where w_{ik} is the current set of weights and biases, g_{ik} is the current gradient based on the given error, and α_{ik} is the learning rate.
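A minimal sketch of this gradient-descent correction for the single neuron in (1), assuming a batch of samples X with targets t and a squared-error loss (illustrative names, not the exact training routine used in the work):

import numpy as np

def bp_step(w, b, X, t, lr=0.01):
    # One update of w and b that reduces the mean squared error of y = X.w + b.
    y = X @ w + b                 # forward pass
    err = y - t                   # output error
    grad_w = X.T @ err / len(t)   # gradient with respect to the weights
    grad_b = err.mean()           # gradient with respect to the bias
    return w - lr * grad_w, b - lr * grad_b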
first algorithm was developed based on observations of fish swarms and bird flocks. A multi-dimensional search space is considered in PSO, where a swarm of particles is used to explore the search space. Based on the particle movements, PSO provides a globally optimum solution.
PSO uses local and global best information to handle optimization problems and is straightforward to implement. Also, PSO has successfully been applied to least squares estimation [22], multiple regression models [15], neural network training [14, 23–26], and support vector machines (SVM) [27]. The generic steps of PSO are mathematically interpreted as follows. Each particle is initialized in the search space with a random position and velocity. The position and the velocity of each particle at generation i are given by the vectors xᵢ = (x_{i1}, x_{i2}, …, x_{id}) and vᵢ = (v_{i1}, v_{i2}, …, v_{id}), respectively, where d is the dimension of the search space.
Based on the fitness function of the particles, two kinds of memories are maintained in the particle swarm optimization process. The fitness value is the mean squared error (MSE) between the target and actual data series. After calculating the fitness value of all the particles, PSO updates the personal best (pbest) and global best (gbest), where pbest is the best personal value found so far by each particle and gbest is the best global value for the entire swarm. Using pbest and gbest, the velocities and positions of the particles are updated according to (3) and (4).

Vₖ(i + 1) = wVₖ(i) + n₁r₁[pbestₖ − xₖ(i)] + n₂r₂[gbest − xₖ(i)]   (3)

xₖ(i + 1) = xₖ(i) + Vₖ(i + 1)   (4)

where Vₖ(i) and xₖ(i) are the velocity and position of particle k at the ith iteration, w is the inertia weight, r₁ and r₂ are random values between 0 and 1, and n₁ and n₂ are the predetermined learning factors.
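The updates in (3) and (4) translate directly into code, as in the sketch below (array shapes and default constants follow the settings described later; the names are illustrative):

import numpy as np

def pso_step(x, v, pbest, gbest, w=0.5, n1=2.0, n2=2.0, rng=None):
    # x, v, pbest: arrays of shape (num_particles, d); gbest: shape (d,)
    rng = rng or np.random.default_rng()
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v_new = w * v + n1 * r1 * (pbest - x) + n2 * r2 * (gbest - x)  # velocity update (3)
    return x + v_new, v_new                                        # position update (4)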
3 Proposed Techniques
In this proposed work, three models are employed to solve the problem of long-term
electricity demand forecasting.
1. Forecasting model using back propagation ANN.
2. Forecasting model using ANN optimized by PSO.
3. Forecasting model using linear regression.
The first model uses BP to train the weights of the ANN, while the second model
uses the PSO to optimize the weights of ANN. The third model discusses a statistical
model called linear regression to forecast long-term electricity demand. The results
are obtained using real historical monthly data from Central Bank reports, Sri Lanka.
These methods are explained in the following subsections in detail.
In this model, forecasts are prepared for each month of the year 2016. Five inputs are used: population (1000 per capita), GDP (per capita in US$), energy sales (GWh), the exchange rate (US$), and historical loads (MW). The monthly GDP
data are collected for thirteen years from 2004 to 2016. The correlation matrix is
used to obtain the input choices and results. Figure 3 depicts that the selected five
factors are highly correlated with each other. Table 1 summarizes the results from
the correlation matrix. For example, the correlation coefficient between historical
annual load and population, GDP, energy sales, and exchange rate are 0.917, 0.972,
0.999, and 0.953, respectively.
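The input selection described above can be reproduced with a simple correlation computation, as sketched below; the file and column names are hypothetical placeholders, not those of the actual data set.

import pandas as pd

df = pd.read_csv("monthly_data_2004_2016.csv")   # hypothetical file name
cols = ["load", "population", "gdp", "energy_sales", "exchange_rate"]
corr = df[cols].corr()          # pairwise correlation matrix of the five factors
print(corr["load"])             # correlation of each candidate input with the load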
All the training data in this process are scaled to be between 0 and 1. To achieve this, each data set is divided by its largest value. The proposed model is described by the following relationship between the inputs and the output of the ANN: the load of the forecasting month, F(m), is calculated using the load of the previous month, L(m − 1), the population in the previous month, Pop(m − 1), the per capita GDP in the previous month, GDP(m − 1), the energy sales in the previous month, ES(m − 1), and the exchange rate in the previous month, ER(m − 1), i.e., F(m) = f[L(m − 1), Pop(m − 1), GDP(m − 1), ES(m − 1), ER(m − 1)]. In this model, the ANNs are trained with 12 years of historical data (from 2004 to 2015) and designed to predict the total load for each month of the year 2016. In this process, the target set is loaded with data from 2004 to 2015, which has 144 values, and the input set contains 144 × 5 elements. The first column holds the historical load values; the second column holds the monthly population of Sri Lanka, obtained by dividing the annual population increase by 12 under the assumption of uniform population growth; the third column is the per capita GDP; the fourth column is the historical energy sales; and the last column is the exchange rate in Sri Lanka for the specified period.
A prior analysis and simulations were carried out for different ANN structures with different training functions by varying the number of hidden neurons. By varying the biases and weight functions in each series, various results were obtained for the same structure. The conjugate gradient BP (traincgb) training function performed better than the other training functions. Also, the minimum forecasting error was obtained from the three-layer topology with five neurons in the first and second layers and only one neuron in the last layer. Therefore, the later experiment is performed with the structure identified through this preliminary experiment. The BP training algorithm is used with 1000 epochs to find the optimum weight values of the network.
In particle swarm optimization, each particle is represented by a set of weights that defines the relationships between the neurons. The mean square error is the fitness function of every particle, measured from the network output obtained with the weights of the given series. The error function is reduced by updating the weights frequently. Once the fitness functions are calculated, the pbest and gbest values are updated in the process, which identifies the most effective weights of the particles in the entire set.
The process of ANN optimized by PSO is summarized as follows.
Step 1 Sample data are scaled to be between 0 and 1.
Step 2 All the variables are randomly initialized, and the velocity and position of each particle are updated. In the process, random values between 0 and 1 are assigned to r₁ and r₂, and the inertia weight and the learning factors are fixed at 0.5 and 2, respectively. The maximum number of iterations is 100, and the fitness value is calculated using the MSE. This step also assigns the weights and biases of each particle. The total number of weights and biases for the proposed model is 36: 30 weights and 6 biases.
Step 3 Calculate the MSE using the following equation

MSE = (1/n) Σ_{m=1}^{n} (L_m − F_m)²   (6)
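The fitness evaluation in Step 3 amounts to decoding a particle into the 36 network parameters, running the network, and scoring the output, roughly as sketched below (the tanh activation and the parameter layout are assumptions made for the example):

import numpy as np

def fitness_mse(particle, X, targets):
    # particle: 36 values = 30 weights + 6 biases of a 5-input, 5-hidden, 1-output ANN
    W1 = particle[:25].reshape(5, 5)     # input-to-hidden weights
    b1 = particle[25:30]                 # hidden biases
    W2 = particle[30:35].reshape(5, 1)   # hidden-to-output weights
    b2 = particle[35]                    # output bias
    hidden = np.tanh(X @ W1 + b1)        # X: (n_samples, 5) scaled inputs
    forecast = (hidden @ W2).ravel() + b2
    return float(np.mean((targets - forecast) ** 2))   # MSE as in (6)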
The linear regression model is considered to forecast the monthly load of the year 2016. The regression model is a statistical technique, and many researchers use this model because of the ease of its implementation. It uses the same factors as the ANN model. The forecasted load is considered the dependent variable, and the other factors are considered the independent variables. The mathematical representation of this model can be summarized as follows.
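A minimal sketch of such a multiple linear regression, assuming the same five regressors used for the ANN model (the coefficient symbols are illustrative):

F(m) = β₀ + β₁L(m − 1) + β₂Pop(m − 1) + β₃GDP(m − 1) + β₄ES(m − 1) + β₅ER(m − 1) + e(m)

where β₀, …, β₅ are the regression coefficients and e(m) is the residual error.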
4 Forecasting Performance
The absolute percentage error and the mean absolute percentage error are used to calculate the accuracy of the forecasting models and are given as

APE = Σ_{m=1}^{n} |(L_m − F_m)/L_m| × 100   (8)

MAPE = APE/n   (9)
where n is the total number of months, L m represents the actual load demand at
month m, and Fm represents the forecasted load demand at month m.
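The measures in (8) and (9) can be computed as follows (a short sketch; the array names are illustrative):

import numpy as np

def mape(actual, forecast):
    # APE as defined in (8): a summed percentage error over all months.
    ape = np.sum(np.abs((actual - forecast) / actual)) * 100.0
    return ape / len(actual)   # MAPE as in (9)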
In this section, the MAPE results for ANN, OANN, and regression model simulations
are summarized. According to the MAPEs in Table 2, it is observed that the OANN attains the best performance in annual electricity and load demand forecasting compared to the ANN with back propagation and the linear regression model.
All the forecasting models have five input variables, together with the historical load demand, to forecast the monthly load of the year 2016. The optimized neural network, whose weights are optimized using particle swarm optimization, performs better than the other two models: its average monthly forecasting error is 1.836%. The second lowest average forecasting error is given by the neural network, with an average monthly forecasting error of 2.502%, while the regression model shows an average forecasting error of 2.949%.
All three models have their highest forecasting error in April (3.611, 4.665, and
4.976, respectively), whereas minimum forecasted errors are given in December
(0.556, 1.452, and 1.753, respectively). Moreover, the optimized neural network shows a forecasting error of over 2 percent only in February (2.613), May (2.304),
August (2.240), and November (2.142), whereas the ANN model and regression
model have more than 2 percent forecasting error for all the months except in
December.
Figure 4 shows that the best forecasting results are given by the OANN model. Moreover, a paired t-test is carried out to check the model accuracy. Tables 3 and 4 show the correlation between the actual load demand and the load forecasted by the OANN, ANN, and regression models. They are highly correlated, and the pairs are significant with a probability value of 0.
6 Conclusion
A technique based on PSO and ANN was proposed in this research to forecast the monthly load demand of the Sri Lankan network. The results of the numerical simulations show that the OANN model, together with the five input factors, reduces the forecasting error significantly. The correlation matrix is used to select the inputs needed to obtain the desired results, and the selected factors have high correlations with each other. In the data preparation process, all the training data are uniformly scaled to be between 0 and 1. The weights and biases of the ANN model are optimized using the PSO and BP training algorithms, and the regression model is considered to check the model adequacy. Although the ANN and regression models provide relatively good results, they are still not as accurate as the OANN model. The OANN performs well as it has a unique ability to deal with the nonlinearity of the model. As such, it overcomes the drawbacks of many time series models, as per the case presented and tested in this work. It can also be concluded that all the techniques are quite promising and relevant for long-term forecasting according to the paired t-test results.
References
4. Chow JH, Wu FF, Momoh JA (2005) Applied mathematics for restructured electric power
systems. In: Applied mathematics for restructured electric power systems, Springer, pp 1–9
5. Starke M, Alkadi N, Ma O (2013) Assessment of industrial load for demand response across
US regions of the western interconnect. Oak Ridge National Lab. (ORNL), Oak Ridge, TN,
US
6. Ghanbari A et al (2009) Artificial neural networks and regression approaches comparison for
forecasting Iran’s annual electricity load. In: International conference on power engineering,
energy and electrical drives, 2009, POWERENG’09. IEEE
7. Abd-El-Monem H (2008) Artificial intelligence applications for load forecasting
8. Zhang F, Cao J, Xu Z (2013) An improved particle swarm optimization particle filtering algo-
rithm. In: 2013 International conference on communications, circuits and systems (ICCCAS).
IEEE
9. Jiang Y et al (2007) An improved particle swarm optimization algorithm. Appl Math Comput
193(1):231–239
10. Samuel GG, Rajan CCA (2015) Hybrid: particle swarm optimization genetic algorithm
and particle swarm optimization shuffled frog leaping algorithm for long-term generator
maintenance scheduling. Int J Electr Power Energy Syst 65:432–442
11. Chunxia F, Youhong W (2008) An adaptive simple particle swarm optimization algorithm. In:
Control and decision conference, 2008. CCDC 2008. Chinese. IEEE
12. Subbaraj P, Rajasekaran V (2008) Evolutionary techniques based combined artificial neural
networks for peak load forecasting. World Acad Sci Eng Technol 45:680–686
13. Daş GLS (2017) Forecasting the energy demand of Turkey with a NN based on an improved
particle swarm optimization. Neural Comput Appl 28(1): 539–549
14. Jeenanunta C, Abeyrathn KD (2017) Combine particle swarm optimization with artificial neural
networks for short-term load forecasting. ISJET 8:25
15. Hafez AA, Elsherbiny MK (2016) Particle swarm optimization for long-term demand fore-
casting. In: Power systems conference (MEPCON), 2016 eighteenth international middle east.
IEEE
16. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating
errors. Nature 323(6088):533
17. Mazzoni P, Andersen RA, Jordan MI (1991) A more biologically plausible learning rule for
neural networks. Proc Natl Acad Sci 88(10):4433–4437
18. Dilhani MS, Jeenanunta C (2017) Effect of neural network structure for daily electricity load
forecasting. In: Engineering research conference (MERCon), 2017 Moratuwa. IEEE
19. Samarasinghe S (2016) Neural networks for applied sciences and engineering: from funda-
mentals to complex pattern recognition. Auerbach Publications
20. Kennedy J, Eberhart RC (1997) A discrete binary version of the particle swarm algorithm.
In: 1997 IEEE international conference on systems, man, and cybernetics. Computational
cybernetics and simulation. IEEE
21. Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: Proceedings
of the sixth international symposium on micro machine and human science, 1995. MHS’95.
IEEE
22. AlRashidi M, El-Naggar K (2010) Long term electric load forecasting based on particle swarm
optimization. Appl Energy 87(1):320–326
23. Meissner M, Schmuker M, Schneider G (2006) Optimized particle swarm optimization (OPSO)
and its application to artificial neural network training. BMC Bioinf 7(1):125
24. Freitag S, Muhanna RL, Graf W (2012) A particle swarm optimization approach for training
artificial neural networks with uncertain data. In: Proceedings of the 5th international
conference on reliable engineering computing, Litera, Brno
25. Tripathy AK et al (2011) Weather forecasting using ANN and PSO. Int J. Sci Eng Res 2:1–5
26. Shayeghi H, Shayanfar H, Azimi G (2009) STLF based on optimized neural network using
PSO. Int J Electr Comput Eng 4(10):1190–1199
27. Sarhani M, El Afia A (2015) Electric load forecasting using hybrid machine learning approach
incorporating feature selection. In: BDCA
Object Detection in Surveillance Using
Deep Learning Methods: A Comparative
Analysis
Abstract Unmanned aerial vehicle (UAV) technology has revolutionized the surveillance field globally in today's scenario. UAV technologies enable activities to be efficiently monitored, identified, and analyzed. The principal constraints of the present surveillance systems built around closed-circuit television (CCTV) cameras are the limited surveillance coverage area and the high latency in object detection. Deep learning embedded with UAVs has been found to be effective in the tracking and monitoring of objects, thus overcoming the constraints mentioned above. Dynamic surveillance systems in the current scenario demand high-speed streaming, and object detection in real-time visual data within a reasonable time delay has become a challenge. The paper draws a comprehensive analysis of deep learning architectures for object detection by classifying the research based on architecture, techniques, applications, and datasets. It has been found that RetinaNet is highly accurate while YOLOv3 is fast.
1 Introduction
For the detection and monitoring of objects in real time, high-level, large-scale data is retrieved, localized, and classified. Object detection [2], hence, supplies important facts about the acquired visual data for logical understanding.
The detection and tracking of objects are ubiquitous and find their place in many prevalent applications, including surveillance through human behavior analysis [3], driverless cars, medical diagnosis [4], pose estimation, handwriting estimation, visual object detection, large-scale object detection, and traffic surveillance [5–7]. The field of object detection holds huge potential for research into the most efficient learning algorithms. The learning models are trained and tested on labeled datasets. The applied algorithm must perform proficiently and efficiently in real-time situations, particularly in safety-critical fields.
The issues experienced in the identification of objects, such as diverse lighting conditions, occlusion [8], and varying viewpoints and poses, have opened a fresh research window for building systems that can effectively perform object identification and localization tasks. Therefore, the task for state-of-the-art research is not just restricted to detection and tracking but also to meeting the abovementioned challenges.
The full paper is organized in the following manner. Section 2 in this paper
discusses the related research work on object detection. Section 3 elaborates on
the existing architectures of object detection and draws a performance comparison.
Section 4 discusses challenges to object detection. Section 5 concludes the work
along with the future scope of the research.
One of the most representative deep learning models is the CNN [9], also termed ConvNet. It is a non-recurrent, feed-forward type of artificial neural network employed to recognize visual data. Some of the traditional CNN architectures include LeNet-5, ZFNet [10], VGG16 [11], and AlexNet [12], while modern approaches include Inception, ResNet, and DenseNet. Convolution is a mathematical operation referring to a sum of products, and successive convolutional layers, interleaved with average pooling or maximum pooling layers, operate on the outputs of the previous layers, as depicted in Fig. 1. Each 3D matrix in the network is called a feature map. Filtering and pooling transformations are applied to these feature maps to extract robust features. LeNet-5 [13] was developed for postal services to recognize the handwritten digits of zip codes. AlexNet [12] is another architecture that uses large kernel-sized filters, 11 × 11 in the first convolutional layer and 5 × 5 in the second layer. The architecture is trained on the ImageNet dataset.
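For illustration, a LeNet-5-style network of the kind described above can be assembled in a few lines; the sketch below assumes TensorFlow/Keras and 32 × 32 grayscale digit images, and is not the original LeNet-5 implementation.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(32, 32, 1)),
    layers.Conv2D(6, kernel_size=5, activation="tanh"),   # feature maps
    layers.AveragePooling2D(pool_size=2),                 # subsampling
    layers.Conv2D(16, kernel_size=5, activation="tanh"),
    layers.AveragePooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(120, activation="tanh"),
    layers.Dense(84, activation="tanh"),
    layers.Dense(10, activation="softmax"),               # 10 digit classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])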
These large, non-uniform filters are replaced in the VGG16 architecture, which uses 3 × 3 uniform filters through 16 convolutional layers [11]. The VGG16 architecture has also been trained on ImageNet, and the model performance is very high, with an accuracy value of 92.7%. A modern CNN-based technique is the Inception architecture developed by Google, also referred to as GoogLeNet, which contains Inception cells that perform convolutions in parallel at different scales. It uses global pooling at the end of the network.
Deep neural networks face the concern of vanishing gradients due to the stacking of several layers and backpropagation through the previous layers. This concern is addressed and resolved in the Residual Network (ResNet) by skipping one or more layers, thereby creating shortcut connections in the convolutional network, while DenseNet presented another approach by connecting each layer to all preceding layers through shortcut connections. Table 1 draws a comparison among the discussed architectures.
3 Object Detection
Object detection follows a series of procedures starting from region selection, where the object is located by creating bounding boxes around the detected objects. This is also referred to as region-based object detection. Following that, certain visual features of the selected objects, recognized by SIFT [15] or HOG [16] algorithms, are extracted and classified to make the data hierarchical, followed by prediction to extract logical information using SVM [17] or K-means classifiers. The object detection architectures have been segregated based on localization techniques and classification techniques. However, some deep learning techniques are based on simultaneous detection and classification, referred to as regression techniques. Therefore, object detection is broadly categorized into region-based and regression-based object detection, as shown in Fig. 2.
RCNN uses selective search to generate exactly 2000 region proposals, followed by classification via CNN. These region proposals are refined using regression techniques. However, the assortment of 2000 regions makes the computation slow. The architecture is shown in Fig. 3.
Fast R-CNN [19] is open-source and implemented in Python and C++ [20]. It is more accurate, with a higher mean average precision (mAP), and nine times faster than RCNN, since it uses single-stage training rather than the three-stage training used in RCNN. The training in Fast R-CNN updates all network layers concurrently. Faster R-CNN employs a combination of a region proposal network (RPN) and a Fast R-CNN detector [21]. The RPN is a fully convolutional network that predicts bounding regions and uses these proposals to detect objects and predict scores. The RPN is based on the 'attention' mechanism and shares features with the Fast R-CNN detector to locate the objects. The Fast R-CNN detector uses the proposed regions for classification. Both the accuracy and the quality of the region proposals are improved in this method. Figure 4 depicts the Faster R-CNN architecture. The comparison of region-based object detection methods is given in Table 2.
Table 2 Comparison of region-based object detection methods

Parameters              RCNN    Fast RCNN    Faster RCNN
Test time/image (s)     50      2            0.2
Speedup                 1x      25x          250x
mAP (VOC 2007)          66.0    66.9         66.9
3.2.1 YOLO
You Only Look Once (YOLO), shown in Fig. 5, is a faster object detection method. This regression approach predicts the region proposals and class probabilities in a single evaluation by a single neural network. The basic YOLO processes images in real time at 45 frames per second (fps) and streams video with a latency of less than 25 ms. The mean average precision obtained by this method is more than twice that of other real-time detectors [22]. It performs better than prior detection methods but lacks in the detection of small objects.
3.2.3 YOLOv2
3.2.4 RetinaNet
Fig. 7 RetinaNet architecture with pyramid stacked feature map layers [25]
3.2.5 YOLOv3
YOLOv3 [26] is built on the DarkNet-53 CNN model. This method uses logistic regression for each bounding box to predict an objectness score. The value is "1" if the bounding box overlaps the real object boundary more than any other bounding box does. A prediction is ignored if its bounding box overlaps the object boundary by more than the threshold value but is not the best match. YOLOv3 uses independent logistic classifiers for multi-label classification. This method is three times faster than SSD and equally accurate. Unlike previous versions of YOLO, this method is also capable of detecting small objects. The YOLOv3 architecture is illustrated in Fig. 8.
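The overlap test described above is based on the intersection-over-union (IoU) of two boxes, which can be computed as in the sketch below (illustrative Python; boxes are given as corner coordinates):

def iou(box_a, box_b):
    # Boxes as (x1, y1, x2, y2); returns intersection-over-union in [0, 1].
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A prediction can then be kept or ignored by comparing its IoU with a threshold, e.g. 0.5.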
Table 3 summarizes the models capable of region proposal and classification of detected objects. It also highlights their constraints in terms of the high computation time involved. From the table, a clear observation is drawn that the methods performing the selection of object regions and classification simultaneously are faster.
Table 4 is a detailed analysis of object detection methods describing the backbone CNN architectures trained on the MS COCO, PASCAL VOC 2007, and PASCAL VOC 2012 datasets. The mean average precision (mAP) values are also compared and reach a maximum of about 57.9% for YOLOv3 when trained on the MS COCO dataset; YOLOv3 is also faster, though less accurate, than RetinaNet [26].
Several difficulties emerge in identifying the objects that exist in an image. This section discusses the impediments experienced while dealing with the standard datasets, which limit the achievement of high performance measures.
Occlusion: The major problem in object detection is occlusion [8, 27, 28]. Occlusion is the effect of one object blocking another object from view in 3D images. The CNN framework is not inherently capable of handling occlusions. To deal with complex occlusions in pedestrian images, the deep learning framework DeepParts is proposed [29].
Viewpoint variation: Severe distortions occur due to variations in the viewing angle of the image. The classification of objects becomes difficult at varied angles, which has a direct impact on the accuracy of predicting the object [6].
Variation in poses: Variation in facial expression and poses makes it difficult for
the algorithm to detect the faces. To address occlusions and pose variations, a novel
framework based on deep learning is proposed in [28] which collects the responses
from local facial features and predicts faces.
Lighting conditions: Another big challenge in detecting objects is the lighting
conditions that may vary throughout the day. Different approaches are followed by
researchers to tackle varying lighting conditions in the daytime and nighttime traffic
conditions [7].
Table 4 Dataset and model-based object detection method review with mean precision value
Various object detection architectures are compared based on the training datasets,
and the performance measures are analyzed in this research. The comparison
focused on recognizing the most appropriate methods that could be used for surveil-
lance, requiring real-time data extraction with the least latency and maximum
accuracy. The analysis shows that object detection methods performing region proposal detection and classification simultaneously reduce computation time and are therefore faster than traditional methods. The study highlights the fact that there is
always a trade-off between speed and accuracy. SSD provides maximum preci-
sion; however, with minimal latency, YOLOv3 outperforms all other object detec-
tion techniques. Using an unmanned aerial vehicle (UAV), YOLOv3 can be used to
produce highly responsive smart systems for live streaming videos and images in a
surveillance system over the Internet.
Acknowledgements This work is supported by the grant from Department of Science and Tech-
nology, Government of India, against CFP launched under Interdisciplinary Cyber-Physical Systems
(ICPS) Programme, DST/ICPS/CPS-Individual/ 2018/181(G).
References
11. Simonyan K, Zisserman A (2014) Very deep Convolutional networks for large-scale image
recognition, pp 1–14
12. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet Classification with Deep Convolutional
Neural Networks. In: Proceedings of the 25th international Conference on neural information
processing systems, Vol 1, pp 1097--1105
13. Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document
recognition. proc IEEE
14. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. Proc IEEE
Comput Soc Conf Comput Vis Pattern Recognit 2016: 770–778
15. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
16. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: Proceed-
ings—2005 IEEE Computer society Conference on Computer Vision and Pattern Recognition,
CVPR 2005
17. Kyrkou C, Theocharides T (2009) SCoPE: Towards a systolic array for SVM object detection.
IEEE Embed Syst Lett 1(2):46–49
18. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature Hierarchies for accurate object
detection and semantic segmentation. IEEE Conf Comput Vis Pattern Recogn 2014:580–587
19. Girshick R (2015) Fast R-CNN. In 2015 IEEE International Conference on Computer Vision
(ICCV), vol 2015 Inter, pp 1440–1448
20. Jia Y et al (2014) Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the 2014 ACM conference on multimedia, pp 675–678
21. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with
region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149
22. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time
object detection. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2016:779–788
23. Liu W et al (2016) SSD: single shot multibox detector. In: Lecture notes in computer science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol 9905 LNCS, pp 21–37
24. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: 2017 IEEE conference on
Computer Vision and Pattern Recognition (CVPR), 2017, vol 2017, pp 6517–6525
25. Lin TY, Goyal P, Girshick R, He K, Dollar P (2017) Focal loss for dense object detection. Proc
IEEE Int Conf Comput Vis 2017: 2999–3007
26. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement, Apr 2018
27. Yao L, Wang B (2019) Pedestrian detection framework based on magnetic regional regression.
IET Image Process
28. Yang S, Luo P, Loy CC, Tang X (2015) From facial parts responses to face detection: a deep
learning approach. Proc IEEE Int Conf Comput Vis vol 2015 Inter, no 3, pp 3676–3684
29. Mathias M, Benenson R, Timofte R, Van Gool L (2013) Handling occlusions with franken-
classifiers. Proc IEEE Int Conf Comput Vis pp 1505–1512
MaSMT4: The AGR Organizational
Model-Based Multi-Agent System
Development Framework for Machine
Translation
1 Introduction
system and avoiding unnecessary message passing can help to deal with the above limitations.
Note that multi-agent systems are also used to handle the complexity of a software system and to provide intelligent solutions through the power of agent communication. Thus, the development of a multi-agent system is itself a somewhat complicated process. Given such complexity, selecting a suitable framework is far preferable to ad-hoc development. In general, a multi-agent framework provides the agent infrastructure, communication, and monitoring methods for agents. In addition, common standards are available for agent development, especially for communication among agents, including FIPA-ACL [6, 7] and KQML [8]. Among others, FIPA is one of the most widely used standards for agent development.
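As a small illustration of the kind of message structure these standards define, a FIPA-ACL-style message could be modeled as below; this is an illustrative Python sketch with hypothetical agent names, not the API of any of the frameworks discussed.

from dataclasses import dataclass

@dataclass
class ACLMessage:
    performative: str          # e.g. "inform", "request"
    sender: str                # identifier of the sending agent
    receiver: str              # identifier of the receiving agent
    content: str               # message payload
    language: str = "text"     # content language
    ontology: str = ""         # shared vocabulary name
    conversation_id: str = ""  # conversation/thread identifier

msg = ACLMessage(performative="request", sender="morphology_agent",
                 receiver="syntax_agent", content="analyse: 'I eat rice'")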
The development of a multi-agent system from scratch is not an easy task; it requires modeling agents, communications, and control according to some standards. Therefore, most multi-agent system developers turn to an existing framework to build their MAS solutions easily. A number of well-developed multi-agent frameworks are available for multi-agent system development, including JADE [9], MaDKit [10], PADE [11], SPADE [12], AgentBuilder [13], and ADK [14]. Further, most of the existing multi-agent development frameworks are designed to develop general-purpose, fully distributed multi-agent applications. However, these existing frameworks do not directly support the distinct requirements of agent-based English to Sinhala machine translation. The English to Sinhala machine translation system requires several natural language processing activities, including morphological processing, syntax processing, and semantic processing. A new multi-agent development framework has therefore been designed and developed incorporating the following required features: the capability to work with a large number of agents efficiently, the ability to send and receive several messages quickly, and the ability to customize agents easily for local language (Sinhala) processing requirements. The multi-agent system for machine translation (MaSMT) was released in 2016 to provide the capability for AGR-based agent development. This paper reports the latest version, MaSMT4.0, which consists of new features including email-based message passing and the ability to work with customized behaviors.
The rest of the paper is organized as follows. Section 2 presents a summary of existing multi-agent frameworks. Section 3 comments on the AGR (Agent/Group/Role) model and the infrastructure of the agents. Section 4 presents the design of the MaSMT, including the agents, communications, and monitoring features of the framework. Section 5 gives some details of the multi-agent systems developed through the MaSMT. Finally, Sect. 6 concludes the paper along with the future scope.
2 Related Works
There are several multi-agent system development frameworks available for different requirements. This section briefly describes some existing multi-agent system development frameworks and their features. The Java Agent Development Framework (JADE) [9] is a Java-based open-source software framework for MAS development. JADE provides middleware support with GUI tools for debugging and deployment. Further, JADE provides a task execution and composition model for agent modeling, and peer-to-peer agent communication is carried out with asynchronous message passing. In addition, JADE offers the following key features: a FIPA-compliant distributed agent platform, multiple directory facilitators (which can be started at run time), and messages transferred encoded as Java objects.
MaDKit [10] is a Java-based generic multi-agent platform built on the Aalaadin conceptual model [16, 17]. This organizational model consists of groups and roles for agents to manage different agent activities. MaDKit also provides a lightweight Java library for MAS design. The architecture of MaDKit is based on three design principles: micro-kernel architecture, agentification of services, and a graphic component model. MaDKit also provides asynchronous message passing. Further, it can be used for designing any multi-agent application, from distributed applications to multi-agent simulations.
PADE is a free, entirely Python-based multi-agent development framework to develop, execute, and manage multi-agent systems in distributed computing environments [11]. PADE uses libraries from the Twisted project to allow communication among the network nodes. The framework is multiplatform and supports embedded hardware running Linux. Besides, PADE provides some essential functionalities: PADE agents and their behaviors are built using object-orientation concepts, and PADE is capable of handling messages in the FIPA-ACL standard and supports cyclic and timed behaviors.
The smart Python multi-agent development environment (SPADE) is another framework for multi-agent system development [12]. This framework provides a new platform aimed at solving the drawbacks of the communication models of other platforms. SPADE includes several features: the SPADE agent platform is based on XMPP, the agent model is based on behaviors, FIPA metadata is supported using XMPP data forms, and a web-based interface is provided for agent control.
In addition to the above popular frameworks, AgentBuilder [13], the Agent Development Kit (ADK) [14], Jason [15], and the Shell for Simulated Agent Systems (SeSAm) [18] are other widely used multi-agent development frameworks.
Table 1 gives a brief description of the selected multi-agent system develop-
ment frameworks and their main features. With this theoretical and application
base, MaSMT was developed through the AGR organizational model. The next
section briefly reports the AGR model and its architecture for multi-agent system
development.
This AGR organizational model was designed initially under the Aalaadin model, which consists of agents, groups, and roles. Figure 1 shows the UML-based Aalaadin model [15] for multi-agent system development. According to the model, each agent is a member of one or more groups, and a group contains one or more roles. The agent should be capable of handling those roles according to the agent's requirements. This model is used by the MaDKit system [16], which allows agents to overlap freely among groups.
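To make the agent/group/role relationships concrete, the following minimal Python sketch models agents that may belong to several groups and hold several roles, together with the single active group/role pair that MaSMT adds (described next). It is only an illustration under assumed class and attribute names; MaSMT itself is Java-based.

```python
# Illustrative sketch of the AGR (Agent/Group/Role) organizational model.
# Not MaSMT code (MaSMT is Java-based); all names here are hypothetical.

class Group:
    def __init__(self, name, roles):
        self.name = name
        self.roles = set(roles)          # roles defined inside this group

class Agent:
    def __init__(self, name):
        self.name = name
        self.memberships = {}            # group -> set of roles held in that group
        self.active = None               # single active (group, role) pair, as in MaSMT

    def join(self, group, role):
        if role not in group.roles:
            raise ValueError(f"{role} is not defined in group {group.name}")
        self.memberships.setdefault(group, set()).add(role)

    def activate(self, group, role):
        # An agent may hold several memberships but acts under one active group/role.
        assert group in self.memberships and role in self.memberships[group]
        self.active = (group, role)

# Usage: an agent belongs to two groups but acts under one active group/role.
morph = Group("morphology", {"analyzer", "generator"})
syntax = Group("syntax", {"parser"})
a = Agent("word-agent")
a.join(morph, "analyzer")
a.join(syntax, "parser")
a.activate(morph, "analyzer")
print(a.active[0].name, a.active[1])     # -> morphology analyzer
```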
The MaSMT model is almost the same as the above model but removes the free overlapping of groups and roles at the same time. That is, an agent may belong to one or more groups and hold one or more roles; however, only one group and one role are active at a time. Thus, the agent acts according to this active group and role. Note that agents are the only active communicating entities capable of playing roles within groups. Therefore, MaDKit gives agent designers the freedom to design appropriate internal models for agents. With this idea, the MaSMT agent is designed considering a three-level architecture that consists of a root agent, controlling agents, and ordinary agents. The root represents the top level in the hierarchy and
4 MaSMT
This model comprises three types of agents, namely the ordinary agent, the controller agent (previously called the manager), and the root agent. MaSMT ordinary agents perform actions, while the other two types are used to control them. A controller agent manages several MaSMT ordinary agents. Hierarchically, the root is capable of handling a set of controller agents. Using this three-layer agent model, agents can easily be clustered and modeled to build a swarm of agents.
MaSMT agents are the active agents in the framework, which provides the agents' infrastructure for agent development. The modular architecture of the MaSMT agent consists of several built-in features, including a noticeboard reader and an environment controller. The modular architecture of the MaSMT agent is shown in Fig. 3. The noticeboard
Agents in the MaSMT system follow a life cycle (the status of the agent) comprising three stages, namely active, live, and end. When a new agent is initiated, it directly starts with the "active" section (which usually works as a one-step behavior for the agent). Then, the agent moves to its live section (the working section of the agent, which usually works as a cyclic behavior). The MaSMT agent leaves the live section when the live property of the agent becomes false. Depending on the requirements, an agent may wait until a specific time or until a new message arrives in its in-queue. Figure 4 shows the life cycle of the MaSMT agent.
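The three-stage life cycle can be sketched as a simple loop. The sketch below is a Python illustration only; the method names (on_active, live_step, on_end) and the message queue are assumptions and not the actual MaSMT (Java) API.

```python
import queue

class MaSMTStyleAgent:
    """Illustrative life cycle only: active -> live (cyclic) -> end."""

    def __init__(self):
        self.in_queue = queue.Queue()   # incoming messages
        self.live = True                # the 'live' property controls the cyclic section

    def on_active(self):
        """One-step 'active' section, executed once at start-up."""
        print("agent initialised")

    def live_step(self):
        """One pass of the cyclic 'live' section: wait for a message or time out."""
        try:
            msg = self.in_queue.get(timeout=0.1)   # wait until a message arrives
            if msg == "stop":
                self.live = False                  # leave the live section when live is False
            else:
                print("handling", msg)
        except queue.Empty:
            pass                                   # may also wait until a specific time

    def on_end(self):
        print("agent finished")

    def run(self):
        self.on_active()
        while self.live:
            self.live_step()
        self.on_end()

# Example: the agent handles one message and then terminates on a "stop" message.
agent = MaSMTStyleAgent()
agent.in_queue.put("translate: hello")
agent.in_queue.put("stop")
agent.run()
```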
The MaSMT controller agent is the middle-level controller of the MaSMT framework, capable of controlling its client agents as required. Figure 5 shows the life cycle of the MaSMT controller agent. The controller agent also provides all the features available in ordinary MaSMT agents. In addition, the MaSMT controller provides message passing, network access, noticeboard access, and environment handling capabilities.
The root agent is the top-level controller agent (the MaSMT manager), which is capable of handling other MaSMT controller agents. According to the architecture, there is only one root for the system. Further, the MaSMT root agent is also capable of communicating with other root agents through the "Net access agent".
The MaSMT framework uses messages named MaSMT Messages to provide agent communication. These MaSMT Messages have been designed using the FIPA-ACL message standards. MaSMT Messages can be used to communicate between MaSMT agents as well as with other agents that support the FIPA-ACL message standards. Table 2 gives the structure of the "MaSMT Message", including data fields and types. More information on "MaSMT Messages" is provided in the MaSMT development guide [19].
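Table 2 is not reproduced here, but a FIPA-ACL-style message typically carries a performative together with sender, receiver, and content fields. The following Python dataclass is an illustrative assumption of such a structure, not the actual MaSMT Message class or its exact field list.

```python
from dataclasses import dataclass

@dataclass
class ACLStyleMessage:
    # Field names follow common FIPA-ACL parameters; the exact MaSMT fields are given
    # in Table 2 and the development guide [19], not here.
    performative: str          # e.g. "inform", "request"
    sender: str                # sending agent identifier
    receiver: str              # receiving agent identifier
    content: str               # message body (e.g. a text fragment to analyse)
    language: str = "text"     # content language
    ontology: str = ""         # shared vocabulary, if any
    conversation_id: str = ""  # groups messages of one dialogue

msg = ACLStyleMessage("request", "syntax.agent", "morphology.agent",
                      content="analyse: home", conversation_id="job-42")
print(msg.performative, "->", msg.receiver)
```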
A multi-agent system development framework can be used as a tool for building multi-agent systems. The MaSMT framework has been designed considering the AGR organizational model introduced in the Aalaadin project. According to MaSMT's AGR model, each agent has one group and one role at a time; however, agents can change their role and group at run time. Further, MaSMT uses a three-layer agent model (root, controller, and agent) to build swarms of agents quickly. MaSMT agents can also communicate using the noticeboard method and via peer-to-peer or broadcast message passing. In particular, the MaSMT framework allows email-based message-passing capabilities. A number of multi-agent applications have been successfully developed with MaSMT, including EnSiMaS, Octopus, AgriCom, and RiceMart. The framework is freely available and can be downloaded from SourceForge. Implementing MaSMT for other languages such as Python is one of the further directions of this research.
References
1 Introduction
Robustness is of great importance in control system design because real engineering systems are exposed to external disturbances and noise. Generally, a control engineer is required to design a controller that stabilizes the plant, if it is not stable to begin with, and meets certain levels of performance in the presence of disturbance, noise, and plant parameter variations. Robust control problems of this kind are widely addressed by the H-∞ approach and the structured singular value (μ) approach [1, 2].
It is possible to achieve nominal performance and robust stability against unstructured perturbations with the H-∞ optimal approach, but the issue of robust performance requirements is neglected. In a real-time implementation, a high-order controller may not be viable because of computational and hardware limitations. Design methods based on the structured singular value (μ) can be used to achieve robust stability and robust performance (RSRP). One of the strong robust design approaches is the μ-synthesis problem [11, 12, 14]. The stabilizing controller can
The μ-synthesis controller is in state-space form and can be defined with the following model by incorporating suitable state variables and simple manipulations:

ẋ = Ax + Bu
y = Cx + Du

Here, x is the state vector, u and y are the input and output vectors, respectively, and A, B, C, and D are the state-space matrices. The structured uncertainties are taken into consideration when the μ-synthesis problem is stated. The structured uncertainties are arranged in a particular manner as follows [1]:
where

Σ_{i=1..s} r_i + Σ_{j=1..f} m_j = n        (2)

and n is the dimension of the block Δ. From Eq. (1), there are two types of uncertainty blocks:
s — repeated scalar blocks,
f — full scalar blocks.
The parameters δ_i of the repeated scalar blocks can be only real numbers. The value of μ is given as
1/μ(M) := min{ σ̄(Δ) : det(I − MΔ) = 0 }        (3)

B_Δ := { Δ : σ̄(Δ) ≤ 1, Δ ∈ Δ }        (5)
Equation (4) shows the structured singular value of the interconnected transfer function matrix. If and only if M(s) is stable and μ(M(s)) < 1 (or ‖M‖_μ < 1), the standard M–Δ configuration in Fig. 1 is robustly stable. In Fig. 1, 'w' is the input, generally including disturbances, noises, and command signals, and 'z' is the error output, normally consisting of tracking errors, regulator outputs, and filtered actuator signals. Let M(s) be partitioned appropriately as below:

M(s) = [ M11  M12
         M21  M22 ]        (6)
From Eq. (8), Fu(M, Δ) is the upper linear fractional transformation (ULFT); the upper loop of the interconnected transfer function matrix is closed by the structured uncertainty, hence the name ULFT. If the condition ‖Fu(M, Δ)‖∞ < 1 is satisfied, then Fu(M, Δ) is robustly stable with respect to Δ. Now a fictitious uncertainty block Δ_P is added in such a way that it does not affect the robust stability of the system, as shown in Fig. 2. Robust performance is obtained only if the condition Δ ∈ B_Δ is satisfied. Figure 1 describes the robust stability problem with Δ replaced by Δ̃; thus, the following is a robust stability problem with respect to Δ̃:

Δ̃ := { diag{Δ, Δ_p} : Δ ∈ B_Δ, ‖Δ_p‖∞ ≤ 1 }        (9)
If the infinity norm of M22 is less than one, then the plant achieves nominal performance; likewise, if M(s) is internally stable, then the system has nominal stability.
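As a concrete illustration of this robustness test, the short script below numerically evaluates the classical D-scaled upper bound min_D σ̄(D M D⁻¹) ≥ μ(M) for a constant matrix with two scalar uncertainty blocks. The matrix M and the block structure are assumptions chosen only for illustration, not data from the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Example constant matrix M (assumed, for illustration) with two 1x1 uncertainty blocks.
M = np.array([[0.5 + 0.2j, 1.5],
              [0.1,        -0.4 + 0.3j]])

def scaled_norm(log_d):
    d = np.exp(log_d)                       # keep the scaling positive
    D = np.diag([d, 1.0])
    Dinv = np.diag([1.0 / d, 1.0])
    return np.linalg.norm(D @ M @ Dinv, 2)  # largest singular value of D M D^-1

res = minimize_scalar(scaled_norm, bounds=(-5, 5), method="bounded")
print("sigma_bar(M)              =", np.linalg.norm(M, 2))
print("mu upper bound (D-scaled) =", res.fun)
```

Fitting such scalings over frequency is exactly the D-step of the D-K iteration described next.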
The general method used to solve the μ-synthesis problem is the D-K iteration method. This method is shown in Fig. 2, where the controller K and the feedback signals y and u are shown, and M is formed from P and K. The relation of M follows from the partitioned open-loop plant P(s):

P(s) = [ P11  P12  P13
         P21  P22  P23
         P31  P32  P33 ]        (11)
Weighting functions are used to suppress noise and disturbances that occur in the system. The controller's performance and the characteristics of the system are reflected in the transient response. The weighting functions are selected under the following assumptions: the weighting functions are specified in the frequency domain, stable and diagonal weights are chosen, and the diagonal elements are restricted to minimum-phase, real-rational functions.
The closed-loop system block diagram, taking the weighting functions into account, is shown in Fig. 3. Wu and Wp are, respectively, the weights on the control law and on tracking performance, K is the controller, G is the plant's transfer function, d is an external disturbance, and e is the error.
Table 1 shows the ISE calculated for different weighting functions, chosen by the trial-and-error method. [Table 1: trial weighting functions Wp and Wu and the corresponding ISE values.] As seen from Table 1 and Fig. 4, the weighting function in the sixth column gives better results, but this trial-and-error search is a very time-consuming process. To overcome this problem, in this work the performance weight is parameterized and tuned with PSO as
W_p = 0.95 (s² + 1.8s + x(1)) / (s² + 8s + x(2)),   W_u = 10⁻²,   μ = x(3)
where the ranges of the optimization parameters are x(1) ∈ [10, 11], x(2) ∈ [0, 1], and x(3) ∈ [0, 2]. The parameters required to optimize the fitness function in the PSO algorithm are set in Table 2. After PSO optimization, the optimized weighting function obtained is given in Eq. (16):
given in Eq. (16),
s 2 + 1.8s + 10.14
W p = 0.95 and μ = 0.35 (16)
s 2 + 8s + 0.011
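The PSO search over the three parameters can be sketched as below. The fitness function here is only a surrogate (a quadratic centred on the values reported in Eq. (16)) so the loop runs end to end; the paper's actual fitness is the closed-loop ISE obtained from simulation, which is not reproduced here. The swarm size, inertia, and acceleration coefficients are likewise assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Decision variables x = [x1, x2, x3] with the ranges stated in the paper.
lo = np.array([10.0, 0.0, 0.0])
hi = np.array([11.0, 1.0, 2.0])

def fitness(x):
    # Placeholder objective standing in for the ISE-based fitness the paper evaluates
    # by simulating Wp = 0.95 (s^2 + 1.8 s + x1) / (s^2 + 8 s + x2) with Wu = 1e-2.
    return (x[0] - 10.14) ** 2 + (x[1] - 0.011) ** 2 + (x[2] - 0.35) ** 2

n_particles, n_iter = 20, 50
w, c1, c2 = 0.7, 1.5, 1.5                       # inertia and acceleration coefficients

x = rng.uniform(lo, hi, size=(n_particles, 3))  # particle positions
v = np.zeros_like(x)                            # particle velocities
pbest, pbest_f = x.copy(), np.array([fitness(p) for p in x])
gbest = pbest[pbest_f.argmin()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((n_particles, 3)), rng.random((n_particles, 3))
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x = np.clip(x + v, lo, hi)                  # keep particles inside the bounds
    f = np.array([fitness(p) for p in x])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = x[improved], f[improved]
    gbest = pbest[pbest_f.argmin()].copy()

print("best x found:", gbest)
```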
A = [ −0.0013    0.0292    0.0058   −0.0209
      −0.0292   −6.6647    8.7775  −10.9527
      −0.0058   −6.7775   −0.3426    2.0498
       0.0209   10.9527    2.0498  −20.9505 ],

B = [ −1.2492  −14.6961  −2.8073  10.2185 ]ᵀ,

C = [ −1.2492  −14.6961   2.8073  −10.2185 ],   D = 0        (18)
Fuzzy logic control is inherently robust in the sense of tolerating imprecise parameter information and bounded variations. Hence, fuzzy controllers are used for systems where the data is complex and subject to variation. In this paper, the fuzzy controller is developed using a Takagi-Sugeno-based compensation technique. The Takagi-Sugeno fuzzy model is described by fuzzy IF-THEN rules that represent a nonlinear system's local input-output relationships. The main feature of a Takagi-Sugeno fuzzy model is that a linear system model expresses the local dynamics of each fuzzy implication (rule). The fuzzy dynamic model, or T-S fuzzy model, consists of a family of local linear dynamic models smoothly connected through fuzzy membership functions. The fuzzy rules of the fuzzy dynamic model have the form
where R^l denotes the lth fuzzy inference rule, m the number of inference rules, F_j^l (j = 1, 2, ..., ν) the fuzzy sets, x(t) ∈ ℝ^n the state vector, u(t) ∈ ℝ^g the input vector, y(t) ∈ ℝ^p the output vector, (A_l, B_l, a_l, C_l) the matrices of the lth local model, and z(t) = [z_1, z_2, ..., z_ν] the premise variables, which are some measurable variables of the system, for example, the state variables or the output variables. Fuzzy rules are designed based on the local state-space models of the dynamic system. The control gains are designed using the linear quadratic regulation technique. Sample rules are given in Eq. (19), with different equilibrium points obtained with the phase-plane method. For the fuzzy design, triangular membership functions are used, as shown in Fig. 5.
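The blending of local models by normalised firing strengths can be sketched as follows. The two local models, the gains, and the triangular membership supports below are purely hypothetical stand-ins (not the paper's identified CSTR models); the sketch only shows the T-S structure x_dot = Σ_l h_l(z)(A_l x + B_l u) with a parallel distributed compensation control law.

```python
import numpy as np

# Two hypothetical local linear models (assumed for illustration only).
A1, B1 = np.array([[0.0, 1.0], [-1.0, -1.0]]), np.array([0.0, 1.0])
A2, B2 = np.array([[0.0, 1.0], [-4.0, -2.0]]), np.array([0.0, 1.5])

def tri(z, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return max(0.0, min((z - a) / (b - a), (c - z) / (c - b)))

def memberships(z):
    h = np.array([tri(z, -1.0, 0.0, 1.0), tri(z, 0.0, 1.0, 2.0)])
    s = h.sum()
    return h / s if s > 0 else np.array([0.5, 0.5])   # normalised firing strengths

# Local state-feedback gains (e.g. from LQR on each local model); values assumed.
K1, K2 = np.array([1.0, 1.2]), np.array([2.0, 1.5])

x, dt = np.array([0.8, 0.0]), 0.01
for _ in range(2000):                  # simple Euler simulation of the blended model
    z = x[1]                           # premise variable, e.g. the second state
    h = memberships(z)
    u = -(h[0] * K1 + h[1] * K2) @ x   # parallel distributed compensation control law
    xdot = h[0] * (A1 @ x + B1 * u) + h[1] * (A2 @ x + B2 * u)
    x = x + dt * xdot
print("final state:", x)
```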
Fuzzy rules:
Rule 1: If x2(t) is low (i.e., x2(t) is about 0.8862).
THEN
where
are to be designed.
In the proposed method, μ-synthesis is combined with T-S fuzzy control: the reduced transfer function obtained by the μ-synthesis method is converted into the form of a control input and an error signal. These are used in the T-S fuzzy design and for creating the membership functions. The error is taken as the input and the control input as the output of the FIS file.
The continuous stirred-tank reactor (CSTR) plays an important role in chemical processes, where an exothermic reaction takes place and the heat of reaction needs to be removed by a coolant. The control objective for the CSTR is to maintain the temperature inside the tank at the desired value. The system with its cooling process is shown in Fig. 6, and the block diagram is shown in Fig. 7.
The controller is used to minimize the error signal. Since the controller output is digital and the CSTR system only understands analog physical quantities, the controller output is given to a DAC and then to the E/P (electro-pneumatic) converter, which converts a current input signal into a proportional pneumatic signal. A compressor drives the E/P converter, and the output of the E/P converter is given to the control valve. The control valve acts to minimize the error, and the procedure is repeated until the desired output is obtained. Using the process reaction curve method, the second-order transfer function obtained for the CSTR is shown in Eq. (20). This system model is imprecise and contains different nonlinearities. The proposed methods are applied to study the responses of the CSTR with the transfer function given in Eq. (20).
G(s) = (−0.12s + 12s) / (3s² + 4s + 1)        (20)
6 Simulation Results
Experimental results are obtained on the CSTR system explained in Sect. 5. Table 4 shows the iteration summary of the D-K iteration method, which confirms that the system performance is robust since the value of μ is less than one.
The time responses with the D-K iteration controller and the μ T-S fuzzy controller are given in Figs. 8 and 9, respectively. Figure 10 gives the comparative time response of the PID controller (with gains KP = 12.24, KI = 2.0980, KD = 14.2), the μ-controller obtained by D-K iteration, and the μ-synthesis controller with T-S fuzzy. It shows that the μ-synthesis controller with the T-S fuzzy controller has less overshoot (zero) than the other two controllers and is more robust than both.

Table 4 Summary of iterations for the D-K method

Iteration no.          1       2       3
Order of controller    4       12      16
Peak value of μ        1.461   0.950   0.946

Fig. 10 Time response of PID, μ-controller D-K iteration, μ-controller with T-S fuzzy
Table 5 gives the analysis of the comparative time responses of the PID controller, the μ-controller obtained by the D-K iteration method, and the μ-synthesis controller with T-S fuzzy. The integral square error (ISE) is 3.046 in the case of the fuzzy controller. The settling time and overshoot are improved with T-S fuzzy control as compared to the conventional methods. The analysis of the responses shows better performance for the optimized μ-synthesis controller with T-S fuzzy compared to the others.
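An ISE comparison of this kind can be scripted directly from the transfer functions. The sketch below closes the loop around the plant of Eq. (20), taken verbatim as printed (its numerator may be garbled in extraction), with the PID gains quoted above, and computes the resulting step-response ISE. It is an illustration of the procedure only and will not reproduce the values of Table 5.

```python
import numpy as np
from scipy import signal

# Plant of Eq. (20) as printed: G(s) = (-0.12 s + 12 s) / (3 s^2 + 4 s + 1).
numG, denG = [(-0.12 + 12.0), 0.0], [3.0, 4.0, 1.0]

# PID controller C(s) = (KD s^2 + KP s + KI) / s with the gains quoted in the text.
KP, KI, KD = 12.24, 2.0980, 14.2
numC, denC = [KD, KP, KI], [1.0, 0.0]

# Closed loop T(s) = C G / (1 + C G), built from polynomial products.
num_ol = np.polymul(numC, numG)
den_ol = np.polymul(denC, denG)
numT, denT = num_ol, np.polyadd(den_ol, num_ol)
# Cancel the common factor s (integrator pole against the plant zero at the origin).
numT, denT = numT[:-1], denT[:-1]

t = np.linspace(0.0, 30.0, 3000)
t, y = signal.step(signal.TransferFunction(numT, denT), T=t)

ise = np.trapz((1.0 - y) ** 2, t)       # integral square error for a unit step
print("PID closed-loop ISE over 30 s:", round(float(ise), 3))
```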
Figures 11 and 12 give the RSRP for the D-K iteration controller and the presented μ-PSO fuzzy-based controller, showing the relationship between frequency and the upper bound of μ, which is less than one. The reduced fourth-order controller (transfer function) obtained using the proposed topology is shown in Eq. (17). This demonstrates the robust stability of the proposed methods.
The result of the μ-synthesis controller using D-K iteration is shown in Fig. 13, which shows that the setpoint is correctly tracked by the controller. Initially, the temperature of the CSTR is set at 60° and the setpoint is 55°, so the controller gradually decreases the temperature from 60° to 55°, as shown in Fig. 13.
The result of the T-S fuzzy controller is shown in Fig. 14, where the initial temperature is 45° and the setpoint is 40°. In both cases the controller tracks the setpoint, but with T-S fuzzy control the required time is less compared to the D-K iteration method.
A μ-synthesis controller using the D-K iteration method and a μ-synthesis controller using T-S fuzzy are proposed in this research. Fuzzy control is used to improve the precision and stability of the real-time system. The μ value obtained is less than one, which establishes the stability of the proposed method. In the proposed method the overshoot problem is nullified and the settling time is also improved as compared to the PID controller. Similarly, the ISE value for the proposed method is less than that of the PID controller and the D-K iteration method. Simulation results clearly show that the μ-synthesis controller with T-S fuzzy is more robust than the D-K iteration method. PSO-based tuning of the weighting function added to the system gives more accurate results. Also, the hardware study clearly shows that the CSTR system works properly and gives accurate results when it uses the μ-synthesis controller with a T-S fuzzy controller.
As future scope, the system can be further improved by using optimization algorithms like GA, TLBO, and JAYA, by which more accurate results can be achieved while satisfying the RSRP criteria.
References
1. Zhou K, Doyle JC (1998) Essentials of robust control, vol 104. Prentice Hall, Upper Saddle River, NJ
2. Pannu S, Kazerooni H, Becker G, Packard A (1996) μ-synthesis control for a walking robot.
IEEE Control Syst Mag 16(1):20–25
3. Bendotti P, Beck CL (1999) On the role of LFT model reduction methods in robust controller
synthesis for a pressurized water reactor. IEEE Trans Control Syst Technol 7(2):248–257
4. Buso S (1999) Design of a robust voltage controller for a buck-boost converter using μ-synthesis. IEEE Trans Control Syst Technol 7(2):222–229
5. Stein G, Doyle JC (1991) Beyond singular values and loop shapes. J Guidance Control Dyn
14(1)
6. Tchernychev A, Sideris A (1998) μ/k_m-design with time-domain constraints. IEEE Trans Autom Control 43(11):1622–1627
7. Wallis GF, Tymerski R (2000) Generalized approach for μ synthesis of robust switching regulators. IEEE Trans Aerosp Electron Syst 36(2):422–431
8. Tsai KY, Hindi HA (2004) DQIT: μ-synthesis without D-scale fitting. IEEE Trans Autom Control 49(11):2028–2032
9. Lee TS, Tzeng KS, Chong MS (2004) Robust controller design for a single-phase UPS inverter using μ-synthesis. IEE Proc Electric Power Appl 151(3):334–340
10. Lanzon A, Tsiotras P (2005) A combined application of H∞ loop shaping and μ-synthesis to control high-speed flywheels. IEEE Trans Control Syst Technol 13(5):766–777
11. Qian X, Wang Y, Ni ML (2005) Robust position control of linear brushless DC motor drive system based on μ-synthesis. IEE Proc Electric Power Appl 152(2):341–351
12. Shahroudi KE (2006) Robust servo control of a high friction industrial turbine gas valve by indirectly using the standard μ-synthesis tools. IEEE Trans Control Syst Technol 14(6):1097–1104
13. Franken N, Engelbrecht AP (2005) Particle swarm optimization approaches to coevolve
strategies for the iterated prisoner’s dilemma. IEEE Trans Evol Comput 9(6):562–579
14. Kahrobaeian A, Mohamed YAI (2013) Direct single-loop μ-synthesis voltage control for suppression of multiple resonances in microgrids with power-factor correction capacitors. IEEE Trans Smart Grid 4(2):1151–1161
15. Bevrani H, Feizi MR, Ataee S (2016) Robust frequency control in an islanded microgrid: H∞ and μ-synthesis approaches. IEEE Trans Smart Grid 7(2):706–717
16. Cai R, Zheng R, Liu M, Li M (2018) Robust control of PMSM using geometric model reduction and μ-synthesis. IEEE Trans Ind Electron 65(1):498–509
17. Gu DW, Petkov P, Konstantinov MM (2005) Robust control design with MATLAB®. Springer
Science & Business Media
18. Cao YY, Frank PM (2000) Analysis and synthesis of nonlinear time-delay systems via fuzzy
control approach. IEEE Trans Fuzzy Syst 8(2):200–211
Ant Colony Optimization-Based Solution
for Finding Trustworthy Nodes
in a Mobile Ad Hoc Network
Abstract Mobile ad hoc networks (MANETs) are among the most popular wireless networks, having dynamic topologies due to their self-organizing nature. They are infrastructure-less networks because the nodes are mobile, and hence routing becomes an important issue in these networks. With the ubiquitous growth of mobile and Internet of Things (IoT) technologies, mobile ad hoc networks play a vital role in creating social interactions. However, they face plenty of problems and challenges, including security, power management, location management, and passing multimedia over the network, owing to the routing issue. MANETs consist of many dynamic connections between nodes, and finding a trustworthy route for communication is a challenge. Therefore, based on swarm intelligence methodologies, an ant colony optimization (ACO) algorithm for finding the most trusted path is proposed here via the use of a probabilistic transition rule and pheromone trails.
1 Introduction
With the advent of IoT and social networking concepts, the use of wireless devices and mobile technologies has increased over recent decades. Hence, wireless networks, including mobile ad hoc networks, make a major contribution to establishing interactions among network nodes in service-oriented networks. But the major issue in MANETs is the lack of security, because the network topology changes easily, the network nodes are organized in a decentralized manner, and there is no fixed infrastructure. Therefore, there is a problem with the
Artificial ants live in a discrete world and deposit pheromone on the paths they take. They have additional capabilities such as local search, look-ahead, and backtracking. By exploiting an internal memory and depositing an amount of pheromone that reflects the quality of the solution, they can use local heuristics [4].
As shown in Fig. 1, ants are given a memory of visited nodes, and they construct solutions probabilistically without updating the pheromone trails. Ants then deterministically backtrack along the forward path to update the pheromone, depositing an amount of pheromone that depends on the quality of the solution they created (Fig. 2).
At each node, the ant decides where to move next, depending on the pheromone trail. Based on the pheromone deposited on each node, the ant makes the choice according to

P_ij^k = [τ_ij]^α / Σ_{l ∈ N_i^k} [τ_il]^α        (1)

where τ_ij is the amount of pheromone trail on the edge (i, j), N_i^k is the set of probable neighbor nodes that ant k positioned on node i can shift to, and α is the relative influence of the pheromone function [5, 6].
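A minimal sketch of this transition rule is shown below; the pheromone values on the edges leaving node 0 are assumed purely for illustration.

```python
import random

# Sketch of the transition rule of Eq. (1): the probability of moving from node i
# to neighbour j is proportional to tau_ij ** alpha.
def choose_next(i, neighbours, tau, alpha=1.0):
    weights = [tau[(i, j)] ** alpha for j in neighbours]
    total = sum(weights)
    return random.choices(neighbours, weights=[w / total for w in weights], k=1)[0]

tau = {(0, 1): 0.6, (0, 2): 0.3, (0, 3): 0.1}   # assumed pheromone on edges leaving node 0
print(choose_next(0, [1, 2, 3], tau, alpha=1.0))
```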
This is a population-based strategy in which artificial ants iteratively construct candidate solutions. In each cycle, every ant builds one candidate solution using a constructive search technique. The construction of the solutions is probabilistically influenced by pheromone trail information, heuristic information, and the partial candidate solutions of every ant. Pheromone trails are adjusted during the search procedure to reflect the collective experience [7, 8].
2 Related Work
Swarm intelligence and the discipline of mobile ad hoc networks have been researched from many aspects, including shortest path search, routing protocol optimization, energy balancing, and improving the quality of service. Among the swarm intelligence mechanisms, ant colony optimization has been used extensively to improve different aspects of MANET environments.
A hybrid routing algorithm for MANETs based on ant colony optimization, called HOPNET, is proposed by Wang et al.; the algorithm has been compared under the random waypoint model and the random drunken model against the ZRP and DSR routing protocols [9].
A routing protocol called the ant colony-based routing algorithm (ARA) is proposed with the main goal of reducing the routing overhead. The presented routing protocol is highly adaptive, efficient, and scalable [10].
"AntNet" and "AntHocNet" are applications of swarm intelligence in MANETs that utilize the notion of ant colony optimization (ACO) to find near-optimal solutions to graph optimization problems [11].
AntNet and AntHocNet find near-optimal routes in a graph of interactions without global information. But the disadvantage of this approach is that it creates additional communication overhead through the regular transfer of both "forward ants" and "backward ants" [12, 13].
Schoonderwoerd et al. have addressed the above-mentioned issue by proposing a solution called ant-based control (ABC), which is very similar to AntNet but makes the communication overhead relatively smaller by using only forward ants [14].
A trust calculation for online social networks based on ant colony optimization is suggested by Sanadhya et al., where a trust cycle is created for trust pathfinding as a means of achieving the trustworthiness and satisfaction of the service provided to the service requester [8].
An improved ACO-based secure routing protocol for wireless sensor networks is proposed by Luo et al. for optimal path finding, combining the probability value with fuzzy logic for trust value calculation. The proposed secure routing protocol shows that the algorithm can ensure the discovery of the forwarding path at low cost while guaranteeing security [15].
When it comes to mobile ad hoc networks, the biggest challenge is to discover a path between the communication endpoints that satisfies the client's quality of service (QoS) requirements. Several approaches have been suggested to improve the quality of service in MANETs while finding multiple stable paths between the source and the destination [16–18].
3 Proposed Work
The proposed work can be divided into three major parts: creating the artificial intimacy pheromone for the defined mobile ad hoc network, applying the trust concept with the ant's probabilistic transition rule, and the algorithmic calculation of the trust value.
3.1 Network Intimacy Pheromone (i_s)
network.
With the aggregation of the network heuristic values and the network intimacy pheromone values, the ant's probabilistic transition rule can be modified as follows to make a trust-defining rule for ants to decide the next trustworthy node among the neighbor nodes:

P_ij^k = [i_s(i, j)]^α [η_r(i, j)]^β / Σ_{l=1..n} [i_s(i, l)]^α [η_r(i, l)]^β        (3)

The trust decision value given by the ant's probabilistic transition rule can be declared as above, where i_s(i, j) and η_r(i, j) are, respectively, the pheromone value and the heuristic value associated with nodes i and j. Further, α and β are positive real parameters whose values decide the relative significance of pheromone versus heuristic data.
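The sketch below illustrates how paths could be built with the modified rule of Eq. (3) and reinforced by backward ants. The small five-node topology, the intimacy pheromone and heuristic tables, and the evaporation rate are all assumed for illustration; this is not the paper's algorithm listing.

```python
import random

# Assumed intimacy pheromone i_s(i, j), heuristic eta_r(i, j), and adjacency lists.
i_s = {(0, 1): 0.5, (0, 2): 0.5, (1, 3): 0.7, (2, 3): 0.4, (3, 4): 1.0}
eta = {(0, 1): 0.9, (0, 2): 0.4, (1, 3): 0.8, (2, 3): 0.6, (3, 4): 0.9}
adj = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}

def next_hop(i, alpha=1.0, beta=2.0):
    nbrs = adj[i]
    w = [(i_s[(i, j)] ** alpha) * (eta[(i, j)] ** beta) for j in nbrs]
    return random.choices(nbrs, weights=w, k=1)[0]       # Eq. (3) in proportion form

def build_path(src, dst):
    path = [src]
    while path[-1] != dst:
        path.append(next_hop(path[-1]))
    return path

def reinforce(path, quality, evaporation=0.1):
    # Backward ant: evaporate a little everywhere, then deposit pheromone on the
    # traversed edges in proportion to the quality (trust) of the built path.
    for edge in i_s:
        i_s[edge] *= (1.0 - evaporation)
    for a, b in zip(path, path[1:]):
        i_s[(a, b)] += quality

for _ in range(100):                      # train the colony for a few iterations
    p = build_path(0, 4)
    trust = sum(eta[(a, b)] for a, b in zip(p, p[1:])) / (len(p) - 1)
    reinforce(p, trust)
print("preferred path 0->4:", build_path(0, 4))
```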
Based on the network parameters filtered from the network simulation, and with the help of the probabilistic and heuristic values in ACO, the following algorithm is proposed to calculate the trust in the created MANET environment.
4 Experimental Results
Figure 4 shows the created MANET environment with 9 nodes whose routing behavior follows the ad hoc on-demand distance vector (AODV) routing protocol, and Fig. 5 shows the training of ants for trusted pathfinding by pheromone updating, where 10 ants were used and the number of iterations was 100.
5 Conclusion
In this research, data transfer between the nodes of a MANET over a reliable path in the given network is achieved. The integrity of the communication between the nodes is calculated using the ant's probabilistic transition rule and the heuristic values computed by the modified probabilistic trust value calculation equation. The simulation of the network to which the trust calculation algorithm is applied shows that it finds the shortest and optimal trust path consisting of trustworthy nodes.
References
1. Papadimitratos P, Haas Z (2002) Secure routing for mobile ad hoc networks. In Communication
Networks and Distributed Systems Modeling and Simulation Conference (CNDS 2002) (No.
SCS, CONF)
2. Marti S, Giuli TJ, Lai K, Baker M (2000) Mitigating routing misbehavior in mobile ad hoc
networks. In Proceedings of the 6th annual international conference on Mobile computing and
networking (pp 255–265), ACM
3. Bonabeau E, Marco DDRDF, Dorigo M, Theraulaz G (1999) Swarm intelligence: from natural
to artificial systems (No. 1). Oxford university press
4. Hsiao YT, Chuang CL, Chien CC (2004) Ant colony optimization for best path planning. In
IEEE International Symposium on Communications and Information Technology, ISCIT 2004.
(vol 1, pp 109–113). IEEE
5. Kanan HR, Faez K (2008) An improved feature selection method based on ant colony
optimization (ACO) evaluated on face recognition system. Appl Math Comput 205(2):716–725
6. Asif M, Baig R (2009) Solving NP-complete problem using ACO algorithm. In 2009
International conference on emerging technologies (pp 13–16). IEEE
7. Dorigo M, Stützle T (2003) The ant colony optimization metaheuristic: algorithms, appli-
cations, and advances. In Handbook of metaheuristics (pp 250–285), Springer, Boston,
MA.
8. Sanadhya S, Singh S (2015) Trust calculation with ant colony optimization in online social
networks. Procedia Comput Sci 54:186–195
9. Wang J, Osagie E, Thulasiraman P, Thulasiram RK (2009) HOPNET: A hybrid ant colony
optimization routing algorithm for mobile ad hoc network. Ad Hoc Netw 7(4):690–705
10. Gunes M, Sorges U, Bouazizi I ARA-the ant-colony based routing algorithm for MANETs. In
Proceedings international conference on parallel processing workshop (pp. 79–85). IEEE
11. Dorigo M, Birattari M, Blum C, Clerc M, Stützle T, Winfield A (eds) (2008) Ant colony
optimization and swarm intelligence. In Proceedings 6th international conference, ANTS 2008,
Brussels, Belgium, 22–24 Sept 2008 (vol 5217) Springer
12. Di Caro G, Dorigo M (1998) AntNet: distributed stigmergetic control for communications
networks. J Artif Intell Res 9:317–365
13. Di Caro G, Ducatelle F, Gambardella LM (2005) AntHocNet: an adaptive nature-inspired
algorithm for routing in mobile ad hoc networks. European Trans Telecomm 16(5):443–455
14. Schoonderwoerd R, Holland OE, Bruten JL, Rothkrantz LJ (1997) Ant-based load balancing
in telecommunications networks. Adaptive Behavior 5(2):169–207
15. Luo Z, Wan R, Si X (2012) An improved ACO-based security routing protocol for wireless
sensor networks. In 2013 International Conference on Computer Sciences and Applications
(pp 90–93). IEEE
16. Roy B, Banik S, Dey P, Sanyal S, Chaki N (2012) Ant colony based routing for mobile ad-hoc
networks towards improved quality of services. J Emerg Trends Comput Inf Sci 3(1):10–14
17. Deepalakshmi P, Radhakrishnan DS (2009) Ant colony based QoS routing algorithm for mobile
ad hoc networks. Int J Rec Trends Eng 1(1)
18. Asokan R, Natarajan AM, Venkatesh C (2008) Ant based dynamic source routing protocol
to support multiple quality of service (QoS) metrics in mobile ad hoc networks. International
Journal of Computer Science and Security 2(3):48–56
19. Asghari S, Azadi K (2017) A reliable path between target users and clients in social networks
using an inverted ant colony optimization algorithm. Karbala Int J Modern Sci 3(3):143–152
Software Development for the Prototype
of the Electrical Impedance Tomography
Module in C++
A. A. Katsupeev, G. K. Aleksanyan, N. I. Gorbatenko, R. K. Litvyak,
and E. O. Kombarova
Abstract The basic principles and features of the implementation of the electrical
impedance tomography (EIT) method in the C++ language are proposed in this
research. This software will significantly reduce the hardware time required for computational operations and will expand the capabilities of the technical implementation of the EIT method in real medical imaging systems. An algorithm for the operation of the EIT module prototype software in C++ has been developed. The principles of building the software for the EIT module prototype have been developed, which provide the possibility of embedding it into other medical equipment. The software interface of the EIT module prototype has also been developed.
1 Introduction
The PyEIT framework [5] is a Python-based framework for modeling and visualizing the conduction field using the EIT method. The capabilities of PyEIT include finite element modeling, 2D and 3D visualization, solving the forward and inverse EIT problems, meshing, and image formation for external applications. The mesh module can split the region into triangles (2D) and tetrahedra (3D). PyEIT implements state-of-the-art EIT algorithms that support both static and dynamic reconstruction.
PyEIT can use the following algorithms for dynamic reconstruction: the back-projection method, GREIT [7], and NOSER [8]. PyEIT includes demo examples of the operation of the reconstruction algorithms [5]. To visualize the results of EIT (including 3D), PyEIT uses the Matplotlib charting library [5].
2.2 Eidors
The EIT reconstruction algorithm requires high performance due to two factors. The first factor is building a finite element mesh and calculating the reconstruction matrix used to compute the conductivity field of the object under study based on the measurement data. The second factor is displaying the change in the conductivity field in quasi-real time during the measurement process of the object under study. Here, the first factor is not key when implementing the EIT algorithm in medical devices, since the reconstruction matrix can be generated in advance and used repeatedly for different patients. However, the second factor is important in view of the need to display the change in the conductivity field of the object under study with sufficient speed.
Since C++ is a compiled programming language, it meets the stated performance requirements and, as a result, can be used to develop the software for the EIT module. The authors have also developed a solution for processing EIT results in the form of a Web portal [11], but that development is not intended for use in medical equipment. This is because the EIT channel is often used in clinical practice not independently but integrated into existing medical and technical devices, for example, lung ventilators, whose software is implemented in the C++ language.
Thus, the general scheme of the developed software can be represented as in Fig. 1. The software is divided into two modules: information processing and the software interface. Information processing, in turn, is divided into a measurement process and statistical processing of the measurement data.
[Fig. 1 General scheme of the developed software: a measurement process block and a statistical processing block]

The conductivity field is computed from the measurement data using the reconstruction matrix H:

Ω = H ∗ (U − U₀),
[Flowchart of the software operation: Start → definition of the work purpose (measurement process or other activity) → setting the measurement parameters MP (I, f, signal form) → selecting the patient → selecting the model view (2D or 3D) → reconstruction matrix calculation → visualization of the conduction field Ω = {σ1, …, σm} → End]
The vector Ω is a set of values of the conductivity field at the finite elements of the reconstructed object model:

Ω = {σ1, ..., σm},
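A minimal NumPy sketch of this difference-imaging step is given below. The paper's implementation is in C++; here the matrix sizes, the random data, and the symbols U and U₀ for the measured and reference voltage frames are assumptions used only to illustrate that the quasi-real-time part reduces to one matrix-vector product per frame once H has been precomputed offline.

```python
import numpy as np

rng = np.random.default_rng(1)

m_elements, n_measurements = 576, 208          # assumed mesh/measurement sizes
H = rng.standard_normal((m_elements, n_measurements))   # reconstruction matrix, precomputed offline
U0 = rng.standard_normal(n_measurements)                # reference (baseline) frame
U = U0 + 0.01 * rng.standard_normal(n_measurements)     # current measurement frame

# Quasi-real-time step: one matrix-vector product per incoming frame.
omega = H @ (U - U0)                           # conductivity change on the finite elements
print(omega.shape)                             # (m_elements,) -> {sigma_1, ..., sigma_m}
```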
(1) The main criteria are an uninterrupted measurement process and the speed of processing the information received from the measuring device;
(2) Based on the analysis of state-of-the-art graphics visualization [12], a combination of GTK and OpenGL [13, 14] is used as the graphics library for displaying the conductivity field, to ensure a high speed of information output. A screenshot of the software is shown in Fig. 4.
The main blocks of information displayed on the screen are the reconstructed conductivity field, ventilation graphs, control buttons, and the measurement parameter settings block.
For compatibility with the software used in lung ventilators, it is planned to change the GTK graphics library to MFC [15].
(3) To minimize the computation time in the measurement data processing module, it is necessary to use a compiled programming language; therefore, the developed solution is implemented in the C++ language.
[Figure: structure of the EIT module software functions across the measurement process, information processing, and user interface modules: EIT measurement process control; measurement data processing; archiving and recording of the measurement procedure; calculation of ventilation by the EIT method; calculation of perfusion by the EIT method; calculation of the ventilation-perfusion ratio; visualization of the results of calculating indicators characterizing the function and condition of the lungs; signaling; generation and maintenance of patient records; storage of reconstruction and visualization results; generation of reports; DICOM protocol generation]
4 Conclusions
Acknowledgements The study is carried out as part of the federal target program “Research and
Development in Priority Directions for the Development of the Russian Science and Technology
Complex for 2014-2020”, with financial support from the Ministry of Science and Higher Education
(agreement No. 05.607.21.0305). Unique agreement identifier RFMEFI60719X0305.
References
4. Adler A, Lionheart W (2006) Uses and abuses of EIDORS: an extensible software base for
EIT. Physiol Meas 27(5):S25–S42. CiteSeerX 10.1.1.414.8592. https://doi.org/10.1088/0967-
3334/27/5/S03. PMID 16636416
5. Liu B, Yang B, Xu C, Xia J, Dai M, Ji Z, You F, Dong X, Shi X, Fu F (2018) pyEIT: a python based framework for electrical impedance tomography. SoftwareX 7:304–308
6. Stroustrup B (1997) The C++ programming language, 3rd edn. ISBN 0-201-88954-4. OCLC 59193992
7. Adler A, Arnold J, Bayford R, Borsic A, Brown B, Dixon P, Faes T, Frerichs I, Gagnon H,
Garber Y, Grychtol B, Hahn G, Lionheart W, Malik A, Stocks J, Tizzard A, Weiler N, Wolf G
(2008) GREIT: towards a consensus EIT algorithm for lung images. Manchester Institute for
Mathematical Sciences School of Mathematics. The University of Manchester
8. Cheney MD, Isaacson D, Newell JC (2001) Electrical impedance tomography. IEEE Sign
Process Mag 18(6)
9. MATLAB Documentation (2013) MathWorks. Retrieved 14 Aug 2013
10. Lionheart WRB, Arridge SR, Schweiger M, Vauhkonen M, Kaipio JP (1999) Electrical impedance and diffuse optical tomography reconstruction software. In: Proceedings of the 1st world congress on industrial process tomography, pp 474–477, Buxton, Derbyshire
11. Aleksanyan G, Katsupeev A, Sulyz A, Pyatnitsin S, Peregorodiev D (2019) Development of
the web portal for research support in the area of electrical impedance tomography. Eastern-
European J Enterprise Technol 6(2):6–15
12. Chotisarn N, Merino L, Zheng X (2020) A systematic literature review of modern software
visualization. J Vis 23:539–558
13. The GTK Project // https://www.gtk.org/
14. OpenGL—The Industry Standard for high performance graphics // https://www.opengl.org/
15. MFC Applications for desktop // https://docs.microsoft.com/ru-ru/cpp/mfc/mfc-desktop-app
lications?view=vs-2019
Information Communication Enabled
Technology for the Welfare
of Agriculture and Farmer’s Livelihoods
Ecosystem in Keonjhar District
of Odisha as a Review
Abstract Odisha is an agrarian state, and extension education plays a vital role in agricultural growth and the promotion of farmers' livelihoods. For technology dissemination and knowledge generation, ICT plays a vital role in agrarian society to empower the farming fraternity. During 2013–2017, pilot-based research was carried out by a group of researchers with the help of OUAT, KVK, and the agriculture department, leading to this research review. An ex post facto design and multistage sampling were adopted for the study, along with a structured schedule for the collection of data. Around 170 samples were taken for a comprehensive study in Keonjhar district of Odisha. The objective of this research is to provide support for data collection.
1 Introduction
Other reasons are language barriers, illiteracy, and refusal to accept new technology.
The manner in which ICT projects are introduced, applied, assessed, and deliver content may enhance the likelihood of ICT utilization by farmers and can therefore become an essential element of a project. To satisfy the information-seeking appetite of farmers, ICT may act as a panacea for them. The problems of a farmer may be analyzed using ICT tools with relevance to their local conditions. Local content is described as content that is intended for a particular local audience, as defined by language, culture, or geographic location, or as content which is economically, politically, socially, and culturally relevant to a certain people. The optimum benefit of ICT should reach the doorstep of the farming fraternity and rural artisans.
Among the several studies carried out on subjects associated with this effort, the relevant work relates to ICT-based communication and ICT's role in rural development in areas such as agriculture, education, health and sanitation, and the economy. Mohanty and Bohra [1] highlight ICT's role across the world through the appearance of different types of equipment. ICT has played a major role that has not merely made access throughout the world easier but has also assisted the combination of ideas, process synergies in working techniques and places, a democratic approach, and participation in learning along with the
3 Specific Objectives
The specific objectives of the work are to illustrate and analyze the application of ICTs in agriculture in the Patna block, to discover the role of ICTs in agricultural development, and to assess people's awareness of ICT applications in agricultural development.
The present work was accomplished by gathering both primary and secondary data.
Secondary data collection: The secondary data were gathered from various sources such as portals, Web sites, materials, and other existing records, including the Acts and policies of the Odisha Government, the national and state government agriculture portals, and different ICT projects and schemes under the Government of Odisha.
Additional related details were collected from publications, journals, the Internet, research papers, official records, magazines, and news articles, along with other existing sources of information.
Sample design: To study the role of ICT in a location like Patna block in Keonjhar district, the sample was created according to the investigation possible within the fixed time.
Population of the study: The research population is 29,755, composed of farm laborers and farmers (including private agencies) who are exclusively connected with farming.
Sample area: In Odisha's agriculture, Patna block plays an important part in maize production. Approximately 80% of the people are exclusively associated with agriculture and agro-based industries that provide livelihoods to the population of the block. Two gram panchayats were chosen out of 24 panchayats for the gathering of data according to agricultural activities, one with the highest cultivation activity and another with the lowest. The chosen sample locations from Patna block include 16 different villages, 8 villages from each of the 2 panchayats, with a sample comprising both small and big farmers. Thus, an ICT setup is needed where there is a lack of data. The study is descriptive research.
Sample size: The sample size is 170, consisting of 4 stakeholders (private agents of seed and fertilizer companies), 4 government officials, 160 farmers, and 2 ICT experts.
Sample selection: Simple random sampling methods were used for the sample. With the help of stratified random sampling techniques, ten farmers were chosen from every village; each panchayat has eight revenue villages.
Primary data collection: Primary data were gathered via two techniques, observation and a survey. Data were collected from the farmers of the selected villages through a schedule, with ten farmers taken from every village; the selection was made from the list of stakeholders. The schedule was organized with both open-ended and closed-ended questions. While gathering primary data, a non-participatory observation technique was also used.
Tools and techniques: A schedule was utilized as the tool of the survey method.
Data analysis and interpretation: Data were examined through quantitative as well as qualitative procedures. The data collected from each panchayat were examined on average, and a comparative analysis was carried out to understand the variation. SPSS software was utilized to analyze the data.
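The kind of panchayat-wise averaging and comparative analysis described above can be scripted in many ways; the paper itself used SPSS. The pandas sketch below, with entirely hypothetical respondent records and column names, only illustrates the procedure.

```python
import pandas as pd

# Hypothetical respondent records; columns mirror the kind of variables in the survey.
df = pd.DataFrame({
    "panchayat":            ["A", "A", "A", "B", "B", "B"],
    "uses_internet":        [1, 0, 1, 0, 0, 1],
    "uses_ict_for_weather": [1, 0, 0, 0, 1, 0],
})

# Panchayat-wise averages (share of respondents), as in the comparative analysis.
summary = df.groupby("panchayat").mean(numeric_only=True) * 100
print(summary.round(2))

# Overall adoption rate of each information source.
print((df.drop(columns="panchayat").mean() * 100).round(2))
```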
Farmers exhibit varied media behavior that is not limited to a particular medium. On average, 21% consume only electronic media (radio and TV), 7% only folk media, 51% both folk and electronic media, 5.6% print and electronic media, 3.15% print, folk, and electronic media, 4.4% electronic media and the Internet, 0.65% print only, and 6.25% every category. Here, over 10% of farmers have utilized the Internet. Therefore, the impact of electronic media (radio and TV) and folk media is greater compared to other media.
Media consumption is roughly evenly split, since farmers spend their remaining time on leisure. Typically, 41.55% spend time primarily on entertainment, 3.15% only on news programs, 37.75% on both entertainment and news, 0.65% on entertainment along with other content, around 2% on agriculture- and news-related information, and 15.1% on every category of programs. ICT-based information systems are data oriented and deliberately deliver agriculture-based data through news channels and public information systems. As a consequence, 15.1 + 1.9 = 17% of farmers actively search for information according to their requirements with the help of media.
India's only fully agriculture-based channel is DD Kisan. The channel broadcasts programs mainly based on agriculture, but on average only 37.79% of individuals are aware of the channel, while 62.21% of farmers do not have any clue about its significance or existence.
The results show that approximately 55.5% of farmers obtain weather/climate-related information through observation and by asking knowledgeable ICT users or friends. 5% of farmers learn it with the help of media, i.e., weather forecasts or news on radio, in newspapers, or on TV. Basically, 1.25% obtain it with the assistance of ICT applications, 27.55% through different sources such as relatives, friends, and family members who are ICT users, together with tools and observation of media, and 11.25% access it via ICT applications together with media. Thus, 1.25 + 11.25 = 12.50% of farmers on average utilize ICT applications for weather/climate information.
Fig. 1 Information sources versus adoption and non-adoption rate. Source own data
Fig. 2 Adoption rate of information consumers with various information sources (print media, electronic media, Internet media, and local and traditional media) for rural artisans, youth, women, and farmers. Source: own data
adopted by people, and Internet media is the lowest. The trendlines reflect the adoption and non-adoption levels of information across both the information consumers and the information media.
Media consumption is very high among the farmers. Almost 99% of them use some medium, whether conventional or electronic media, folk media, or new media. According to the data, more or less all farmers engage with several media types. Around 11% of the farmers are utilizing the Internet. Usually 17% of farmers use media for agriculture-based information, while the rest use it for other purposes like news, entertainment, and various other kinds of content. Among the media users, about 38% are familiar with the DD Kisan channel. Of those familiar with the DD Kisan channel, 63.63% watch the channel frequently, and 97.5% declared that all of its information is appropriate for farming extension.
Weather information plays a critical role in farming. At Patna, an average of 12.50% of farmers utilize ICT applications to find out about weather/climate information. Several of them receive the data via media, ICTs, and their own observation.
In Patna, farmers essentially learn agricultural strategies from their ancestors and friends. ICTs assist them to some degree in understanding cultivation procedures. Information essentially flows from agriculture experts or extension offices to the farmers; in Patna this portion is 53.8%, while 5.7% access it via ICT applications.
The latest trends, such as the use of modern gadgets, improved and advanced agricultural techniques, hi-tech concepts, pesticide management, and other methods, normally diffuse from one person to a few and then from a few to many more. In Patna, 9.45% of farmers use every kind of data with the assistance of ICT applications and every source. At the starting phase, stakeholders, experts, and extension officers spread these trends via ICT equipment; at a later phase, they instruct farmers through field visits, workshops, and demonstrations.
The mobile phone functions as an ICT resource in Patna block; on average, 87.87% of farmers use a cell phone. Among them, 22.3% use a smartphone while 77.7% own a regular phone, and none uses a tablet. Cell phones are mostly used for communication with relatives or friends, with a share of 36%; 23.29% use them for entertainment and communication (listening to music, watching videos, playing games), and approximately 41% use them to collect data concerning farming. Out of the cell phone users, 65% hardly ever read the SMS received in their inbox, while 35% read the SMS seriously. On average, 38% of farmers get SMS from several portals, registered Web sites, or government offices concerning agricultural requirements. In Patna, Internet use is rising gradually, but the absence of adequate broadband connectivity and the weak strength of the mobile network create obstacles to using the Internet. In spite of this, 58.35% of farmers are utilizing the Internet, of which 22.3% are smartphone users. Among the 58% of Internet users, 62.50% of farmers browse farming-associated information on the Internet. However, just 3% of them frequently visit and know farmer-related portals and similar Web sites.
Basically, 40% of farmers know about Kisan Call Centres (KCC); the others have no idea about them. Of that 40%, 23.49% maintain communication with KCC; of these, just 65.8% have registered their mobile numbers with KCC, and among the 65.8% of farmers who are registered, 75% obtain messages from KCC regularly. Under the Government of Odisha's free mobile distribution programme for Kisan Credit cardholders, no one in Patna block availed a mobile phone.
PCs and laptops are seldom used in this area; just 1.75% of farmers, or their children, use laptops and PCs, and they inform their elders about agriculture-related queries. In agriculture extension, 8.67% of farmers are aware of the benefits of ICT applications. Through the suggestions of extension workers and data collected with the help of ICT applications, 21.3% of farmers have enhanced their production.
Mandi facilities are really poor; only paddy is bought through the neighborhood mandi, and fresh produce is delivered to the neighboring mandis of the states. Approximately 8.7% of the farmers obtain gains through the assistance of mobile phones or ICT applications. The mobile phone essentially enables them to collect information regarding selling prices; approximately 19% of farmers check selling prices via the mobile phone.
ICT applications such as cell phones assist farmers in transforming old perceptions. With the help of the mobile phone, farmers can communicate with Kisan Call Centres and market holders, raise queries with extension officers, share information with friends, and also browse the web. Above all, this helps them to alter the conventional pattern of farming.
Acknowledgement Acknowledging the help and support of OUAT, Green College and LIUTEBM University for publishing this paper successfully. Also thankful to Dr. B. P. Mohapatra and Dr. K. S. S. Rakesh for their encouragement.
CHAIN: A Naive Approach of Data
Analysis to Enhance Market Power
Abstract Data analytics is one of the most important fields in this computing world.
Every emerging paradigm finally results in the generation of data. The rapidly
growing attraction from different industries, markets, and even academics results
in the requirement of deep study of big data analytics. The proposed work deals with
the unstructured data and raw data, and it then converts that into the structured and
consistent data by applying various modern methods of the data warehouse. These
methods include logistics and supply chain management (LSCM), customer relation-
ship model (CRM), and business intelligence for market analysis to make fruitful
decisions for an organization. Thus, the data analysis is performed in the data warehouse through the ETL process, which describes the steps of gathering and transforming the data and finally placing it at the destination. The proposed CHAIN method is the core of this research work, a naive approach that assists in improving market power. Here, a market analysis of an IT hardware sector is performed, dealing with the sales of peripherals and telecommunication devices in the market. This is achieved via continuous communication between clients and retailers to generate meaningful and relevant data, which can then be analyzed to generate various required reports.
1 Introduction
Big data creates big opportunities: large data sets, analyzed properly, help organizations realize substantial benefits. Targeted data analysis solutions provide new approaches to achieve impressive results. Marketers collect lots of data daily from a variety of customers to paint a complete picture of each customer's behavior. Traditionally, CRM analytics comprises all programming that analyzes customer data in order to streamline business decisions and monetization.
Similarly, this research work introduces another analysis technique that, by examining present results, will lead to better decision making in businesses and also benefit customers in the coming years. Marketers can feed these new, real-time insights back into the organization to influence product development.
This research work provides a new methodology for data analysis using the chain
process. It begins by defining the ETL process and describing its key characteristics.
It provides an overview of how the analysis process is to be done. An examination of past and present market results shows less interaction between customers and shopkeepers; to enhance the relationship between them, the implications for the future are also discussed. Key application areas for the process are introduced, and sample applications from each area are described. Two case studies are provided to demonstrate past results and how they can yield more powerful results now. The paper concludes with a summary of the current state and of how businesses can become more powerful in the future.
The paper is composed of nine sections. After the introduction part, Sect. 2 defines
data analytics and its importance. Section 3 describes the extraction, transformation,
and load (ETL) cycle. Section 4 discusses the motivation behind our research. The
proposed technology is elaborated in Sect. 5, containing the definition of CHAIN,
Why CHAIN, How CHAIN can be achieved, and the proposed model of method-
ology. In Sect. 6, market analysis through the case study is discussed. The result
analysis is presented in Sect. 7. A comparison of the proposed approach with existing
methods is provided in Sect. 8, and the conclusion of the work is presented in Sect. 9.
According to Gandomi [1], “Big data are worthless in a vacuum. Its potential value
is unlocked only when leveraged to drive decision making.” Decision making can
be accomplished by different institutes and organizations by turning the vast and big
amount of data into precise and meaningful information. According to him, different
industries and different organizations define and express big data in different ways.
Data analysis is the examination of a data set, either primary or secondary, in order to organize and mold it into helpful information for sound decision making. It also helps in reducing complexities in managerial decisions and in enhancing effectiveness, marketing policies, and end-user serviceability to boost business
performance. From input to the generation of output, the process can be understood with the help of Fig. 1, which depicts the stages of input, concept, transformation, computation, analysis, and output.
Importance of Data Analytics
According to Kempler [2], “to capitalize on these opportunities, it is essential to
develop a data analytics framework in which defines the scope of the scientific,
technical, and methodological components that contribute to advancing science
research.”
Data analytics is also important in other respects: it helps in reducing banking risk by identifying fraudulent customers from historic data, and it assists in presenting appropriate advertisements based on historical selling and purchasing data. It is also exploited by various security agencies to enhance security policies by gathering data from the different sensors deployed. It further assists in eliminating replicated information from the data set.
Limitations of Data Analytics
The limitations of data analytics can be described as follows: in surveys, respondents do not necessarily provide accurate information; missing values and the lack of a substantial part of the data can also limit its usability; and data may vary in quality and format when it is collected from different sources.
(A) Extraction
According to Akshay S, “In the extract process, data is extracted from the source system and is made accessible for further processing. The main objective of the extract step is to extract the required data from the source systems utilizing the least possible little resources” [3]. Extraction is the act of extracting records from a range of homogeneous and heterogeneous sources, which are then transferred to the data warehouse. This process is carried out in such a way that it does not affect the performance or the response time of the source system.
(B) Transformation
According to Akshay [3], “The most complex part of the ETL process is the transformation phase. At this point, all the required data is exported from the possible sources but there is a great chance that data might still look different from the destination schema of the data warehouse.” Transformation is the act of converting the extracted data into fruitful information whose structure is not exactly the same as that of the data in the warehouse. In this process, the raw values are sorted, merged, and even derived by applying various validation rules.
(C) Loading
Once the data is extracted and transformed, it is finally ready for the last stage, the load process, in which the data collected from one or more sources is placed into the final system. However, there are several considerations, such as how the data loading process influences the storage of data in the data warehouse. The way the data is loaded may affect the server's processing speed as well as the analysis process. The other major consideration during data loading is to prevent the database from becoming overloaded. According to Talib and Ramazan, “This step makes sure data is converted into the targeted data structure of data warehouse rather than source data structures. Moreover, various schema and instance joining and aggregation functions are performed at this phase” [4].
The process of ETL can easily be explained with the help of Fig. 2.
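As a concrete illustration of these three steps, the sketch below shows a minimal ETL pass in Python with pandas. The file names, column names and cleaning rules are hypothetical placeholders for whatever sources and destination schema an organization actually uses; it is a sketch of the pattern, not the exact pipeline used in this work.

import pandas as pd
import sqlite3

# Extract: read raw sales records from heterogeneous sources (assumed file names).
pos_sales = pd.read_csv("retail_sales.csv")        # e.g. point-of-sale exports
web_sales = pd.read_json("online_orders.json")     # e.g. e-commerce dumps
raw = pd.concat([pos_sales, web_sales], ignore_index=True)

# Transform: validate, sort, merge and derive values before loading.
raw["sale_date"] = pd.to_datetime(raw["sale_date"], errors="coerce")
raw = raw.dropna(subset=["sale_date", "amount"])   # drop records failing validation
raw["amount"] = raw["amount"].astype(float)
raw["quarter"] = raw["sale_date"].dt.to_period("Q").astype(str)   # derived attribute
clean = raw.sort_values("sale_date").drop_duplicates()

# Load: place the transformed data into the destination store (here a local SQLite DB).
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("sales_fact", conn, if_exists="append", index=False)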
4 Motivation Behind
A local survey of micro and medium enterprises was recorded, and the owners of these enterprises said that they have been unable to establish proper communication with their customers since various platforms came into existence. The variety of platforms fragments customer choice, and thus the customer is not committed to purchasing goods from a single platform. Sellers said that if customers help them by giving valuable and suitable suggestions regarding the products, they will provide the best services in return. Thus, from this survey, the CHAIN
process is introduced, which helps the consumer and the seller interact and enhances the relationship between them.
5 Proposed Methodology
The analysis process can be accomplished after the data is gathered and inputted
on the basis of dynamic requirements generated by various end-users or customers.
“During early requirement analysis, the requirement engineer identifies the domain
stakeholders and models them as social actors, who depend on one another for goals
to be fulfilled, tasks to be performed, and resources to be furnished” [5]. After having
concern about the requirements, then it will be forwarded to the next process, that is,
data collection. Big data analytics in logistics and supply chain management (LSCM)
has received increasing attention because of its complexity and the prominent role
of LSCM in improving the overall business performance. Thus, while acquiring data from multiple sets and performing analyses, it was found that the CHAIN method can be a rising factor in mining techniques for the customer relationship model (CRM). “CRM
requires the firm to know and understand its markets and customers. This involves
detailed customer intelligence in order to select the most profitable customers and
identify those no longer worth targeting” [6]. “In the emerging markets of Asia,
dynamic capability played a crucial role in gaining competitive CRM performance
across all three industries” [7].
(A) CHAIN
“Another popular approach to customer preference quantification is the discrete
choice analysis (DCA) technique, which includes the probit model and logit models
(multinomial, mixed, nested, etc.) to name but a few” [8]. Data requirement is the first
and foremost part of data processing. The analysis process can be accomplished after data is gathered and inputted on the basis of dynamic requirements generated by various end-users or customers. There are many ways to collect data from various inputs, but given the situation of the market, the introduction of the term “CHAIN” can be a rising factor for local market businesses.
CHAIN stands for “Customer's Help, Advice and Information Networks.” It is a process of communication between customers and sellers in which the customers help the business persons gain knowledge by providing suitable suggestions to enhance the business and improve the services between customer and seller. In this process, the shopkeepers raise questions with their customers, in the form of sentiment surveys or any other possible way of interacting with them (Fig. 3).
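To make this interaction loop concrete, the sketch below shows one possible way a shopkeeper-side system could record customer suggestions together with a sentiment label for later analysis. The class and field names are hypothetical and are not taken from the paper.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Suggestion:
    customer_id: str
    product: str
    text: str
    sentiment: str  # "positive", "negative" or "neutral", filled in by a classifier

@dataclass
class ChainFeedbackStore:
    suggestions: List[Suggestion] = field(default_factory=list)

    def add(self, customer_id: str, product: str, text: str, sentiment: str) -> None:
        self.suggestions.append(Suggestion(customer_id, product, text, sentiment))

    def summary(self) -> Dict[str, Dict[str, int]]:
        # Count sentiment labels per product so the seller sees where to improve.
        counts: Dict[str, Dict[str, int]] = {}
        for s in self.suggestions:
            counts.setdefault(s.product, {}).setdefault(s.sentiment, 0)
            counts[s.product][s.sentiment] += 1
        return counts

store = ChainFeedbackStore()
store.add("C001", "router", "Good price compared to online stores", "positive")
store.add("C002", "router", "Setup help was missing after purchase", "negative")
print(store.summary())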
(B) Need of CHAIN
“Market segmentation is one of the most fundamental strategic planning and
marketing concepts, wherein the grouping of people is done under different categories
such as the keenness, purchasing capability, and the interest to buy” [9]. The market analysis shows that nowadays customers avoid approaching the local market and try to interact less with the local vendors. This is because of changes in lifestyle and an attraction toward platforms that are not social. “OLC establishes an organizational culture where the existing mental
models regarding the collection, retention, and utilization of customer knowledge
are gradually replaced with new ones to better exploit market opportunities which
translate customer needs into value-added offerings” [10].
According to the survey, shopkeepers have suffered a lot due to e-commerce, not only because of cheaper product prices but also because of the lack of proper interaction between customers and shopkeepers, which creates a distance between them. Through the “CHAIN” process, it is possible to challenge and target other platforms easily. Healthy communication and an information network between them will also help in enhancing business services and in providing products at an effective cost.
(C) Performance of CHAIN
“Today is the era of loyalty such as customer loyalty, employee loyalty, manage-
ment loyalty, and loyalty to the principles, ideals, and beliefs. Several studies have
shown that satisfaction is not the key to ultimate success and profitability” [11].
With the help of e-services (digital) or non-digital means, shopkeepers can easily approach the customer by providing a weekly suggestion assessment; the customers in turn help the shopkeeper gain fruitful insights, aid in enhancing the business, and help in maintaining relations with customers, thus forming a customer network.
Examples of e-services include providing sentiment assessments for customers through their contact numbers, creating an effective Web site for the respective store with a chatbot system, providing effective services for all products and a proper review system for each product, and also an ERP system that attracts the customer by offering any kind of information regarding purchased goods.
These factors will lead to the rise of the CHAIN process. “These findings prove the
findings given by Brown and Gulycz (2001) and Chen (2008), who recommended that
satisfied customers are more inclined toward retaining a relationship with existing
companies and positive repurchase intentions in the future” [12].
(D) Proposed Model of Methodology
Figure 4 depicts the relation between customer and seller, in which the seller provides a suitable interface containing reviews and sentiments for customers to communicate with them; later on, customers advise sellers about their products and share appropriate reviews with them, and finally the vendor comes up with the best outcome and shares it with them accordingly. Thus, it forms a CHAIN that enhances the business management system and customer relationship management.
The shopkeeper will have a Web site and application software that consists of
a good support system design through which they can interact with the end-users.
Each customer will have to register and create a user id and password through which they can log in and communicate with shopkeepers; there will be a choice for customers whether they want their information to be public or private. This feature will
keep customer’s data safe, confidential, and away from duplication.
If any comments or information, whether positive or negative, are posted, only the admin or the shopkeeper has the right to ignore or block false information.
According to Sgier [13], “Big data analytics in Logistics and Supply Chain Manage-
ment (LSCM) has received increasing attention because of its complexity and
the prominent role of LSCM in improving the overall business performance.”
According to Shafiei [14], “Reliable field data and well-documented field trials are
the foundation of developing statistical screening tools.”
According to Marjani [15], “Moreover, big data analytics aims to immediately
extract knowledgeable information using data mining techniques that help in making
predictions, identifying recent trends, finding hidden information, and making deci-
sions.” According to Nastic [16], “detecting patterns in large amounts of historic
data requires analytics techniques that depend on cloud storage and processing capa-
bilities.” After going through the different viewpoints of researchers and practitioners, the following study was carried out. The study can be conducted on any kind of computing device, for example, desktop PCs, laptops, smartphones, and others of the same cadre.
The investigation involves a thorough market analysis of IT products of different
companies that have been sold in Indian markets in the 2nd quarter of the year
2019 (July–December). The data was collected from the local market, covering all the exclusive IT product stores, regarding trending sales. The analysis shows a decline in local market sales and an increase in e-commerce sales of 60–70%, and, as a whole, a threefold decrease in the overall IT hardware sales in India. The data collected from the smartphone sellers depicts a huge downfall in the sales of mobile phones and their accessories in small local markets. The analysis indicates that reduced interaction between customers and shopkeepers is the major factor in the downfall of IT sales in the local market. The lack of services provided by the shopkeepers also leads to loss of their market. A survey was conducted for the shopkeepers regarding market strategies. Some of the questions are as follows:
Criteria 1: If you want to buy any digital or IT product, will you purchase that from
the local market or online market (Flipkart, Amazon, any other online platform)?
Criteria 2: Which of these two platforms do you feel is more secure and safe?
Criteria 3: If the local market gives you the facility, not only of item purchase but
also information sharing, accepting customer requests and ideas, then which platform
would you like to choose?
The result of that survey is depicted in the graphical forms.
After analyzing Fig. 5, it is observed that half of the customers are attracted to the online platform, which is the basic reason for the downfall of the local market. If shopkeepers start communicating with customers and provide them the best services, then customers will return to the local market; this may enhance the local market economy, especially for IT products, and there will be a healthy circulation of money in the market again.
After analyzing Fig. 6, it is observed that a lesser number of customers are attracted
to online platforms, as they feel insecure regarding the quality of the product.
Figure 7 shows that 60–65% of customers will come back to the local market if the best services are provided by the sellers; according to the previous results, this would increase market sales by 10–15%, which will boost the local market economy.
7 Result Analysis
“Based on the theoretical and the reality of what happened, there is still a gap of
research on the influence of customer satisfaction on customer trust and loyalty. The
key problem in this research is questioning the variable customer satisfaction and
trust influence customer loyalty and the role of customer trust as a mediating variable
in the BRI Kendari of Southeast Sulawesi province” [17]. The customers now have
a variety of options to purchase products because of rapid globalization and growing
competition. They can easily compare products or even switch their platform which
is the reason behind the downfall of retailers.
Thus, to collect information from consumers, there will be a “CHAIN” process, digital or non-digital, which will help many shopkeepers and customers maintain a long-term relationship with each other; as a result, there will be safe, secure, and good product services in local markets.
The “CHAIN” process applies not only to the IT sector; for the other remaining sectors, the CHAIN methodology is also a requirement. “Through customer collabo-
ration, organizations learn, meet customer requirements better, and improve perfor-
mance (Prahalad and Ramaswamy 2004). Customers offer a wide base of skills,
sophistication, and interests and represent an often-untapped source of knowledge”
[18].
However, this can be resolved through the CHAIN process which can be an
operational process to enhance the growth of the business and make them more
profitable.
CHAIN is quite similar to customer service and support (CSS), but CHAIN is a more advanced feature: it takes the customer's help and business-related information to obtain better business outcomes, while providing quick, convenient, and consistent services to its customers, who interact with the shopkeepers directly through e-services without any intermediary.
9 Conclusion
Data analysis and mining are among the most important and most promising aspects of information technology for the integration of businesses and vigorous decision making. In this paper, the process of data analysis is explained by employing various techniques. The outline is significantly related to data analysis and its challenges. Furthermore, the ETL process described here can be applied in both the public and private sectors for forecasting and for making crucial decisions in finance, marketing, sales, etc.
The introduction of the CHAIN process will be the most effective technique in the field of analysis and mining. It will boost the economy of the retail market in the coming years. Many people will interact with retailers digitally, and thus it will also help the growth of the digital sector, which will be safer and more secure. People will also learn to be more social.
The outcome of data analytics is one of the major requirements of the industry
nowadays. All industries, entrepreneurs, and other organizations have recognized the
significance of data analytics to improve their throughput, to enhance their profits,
and to increase their efficiencies.
One can easily understand how critical efficient data analytics is for the proper growth of any kind of industry, business, or organization. It adds speed and accuracy to business decisions and also maximizes conversion rates. Data analytics is also a great career opportunity with a good future in a thriving era.
References
1. Gandomi A, Haider M (2015) Beyond the hype: big data concepts, methods, and analytics. Int J Inf Manage 35(2):137–144
2. Kempler S, Mathews T (2017) Earth science data analytics: definitions, techniques and skills. Data Sci J 16
3. Lohiya AS et al (2017) Optimize ETL for banking DDS: data refinement using ETL process for banking detail data store (DDS). Imperial J Interdiscip Res (IJIR) 3:1839
4. Ramzan T et al (2016) A multi-agent framework for data extraction, transformation and loading in data warehouse. Int J Adv Comput Sci Appl 7(11):351–354
5. Giorgini P, Rizzi S, Garzetti M (2008) GRAnD: a goal-oriented approach to requirement analysis in data warehouses. Decis Support Syst 45(1):4–21
6. Rygielski C, Wang J-C, Yen DC (2002) Data mining techniques for customer relationship management. Technol Soc 24(4):483–502
7. Darshan D, Sahu S, Sinha PK (2007) Role of dynamic capability and information technology in customer relationship management: a study of Indian companies. Vikalpa 32(4):45–62
8. Conrad T, Kim H (2011) Predicting emerging product design trend by mining publicly available customer review data. In: DS 68-6: proceedings of the 18th international conference on engineering design (ICED 11), impacting society through engineering design, vol 6, Design Information and Knowledge, Lyngby/Copenhagen, Denmark
9. Kashwan KR, Velu CM (2013) Customer segmentation using clustering and data mining techniques. Int J Comput Theory Eng 5(6):856
10. Ali Ekber A et al (2014) Bridging organizational learning capability and firm performance through customer relationship management. Procedia Soc Behav Sci 150:531–540
11. Ali K et al (2013) Impact of brand identity on customer loyalty and word of mouth communications, considering mediating role of customer satisfaction and brand commitment (case study: customers of Mellat Bank in Kermanshah). Int J Acad Res Econ Manage Sci 2(4)
12. Ishfaq A et al (2010) A mediation of customer satisfaction relationship between service quality and repurchase intentions for the telecom sector in Pakistan: a case study of university students. African J Bus Manage 4(16):3457
13. Sgier L (2017) Discourse analysis. Qual Res Psychol 3(2):77–101
14. Ali S et al (2018) Data analytics techniques for performance prediction of steamflooding in naturally fractured carbonate reservoirs. Energies 11(2):292
15. Marjani M et al (2017) Big IoT data analytics: architecture, opportunities, and open research challenges. IEEE Access 5:5247–5261
16. Nastic S et al (2017) A serverless real-time data analytics platform for edge computing. IEEE Internet Comput 21(4):64–71
17. Madjid R (2013) Customer trust as relationship mediation between customer satisfaction and loyalty at Bank Rakyat Indonesia (BRI) Southeast Sulawesi. Int J Eng Sci 2(5):48–60
18. Vera B, Lievens A (2008) Managing innovation through customer coproduced knowledge in electronic services: an exploratory study. J Acad Market Sci 36(1):138–151
Behavioural Scoring Based on Social
Activity and Financial Analytics
1 Introduction
The proposed work analyses a person's online social media presence and financial record to provide a score that signifies the behaviour of an individual.
Furthermore, the work demonstrates how a person’s online activity on social
media sites such as Facebook and Twitter determines the nature and behaviour of the
person. Some factors that are included for the social scoring are types of posts shared,
comments added, posts posted, pages followed and liked. These data are plotted against time to obtain a social score. A financial scoring model then determines the person's financial fitness and the likelihood of engaging in criminal activities due to financial distress. Combining the social score and the financial score with specific weights provides the behavioural score. This score
will classify the subjects and help determine good citizens among the rest. This can
be used to engage and provide added incentives to good citizens in order to promote
good citizenship and behaviour.
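As a simple illustration of this weighted combination, the sketch below computes a behavioural score from a normalized social score and financial score. The 0.4/0.6 weights are assumptions for illustration only; the paper does not fix specific weight values here.

# Minimal sketch of combining normalized social and financial scores.
# The weights are illustrative assumptions, not values specified in the paper.
def behavioural_score(social_score: float, financial_score: float,
                      w_social: float = 0.4, w_financial: float = 0.6) -> float:
    """Both input scores are expected to be normalized to the range [0, 1]."""
    return w_social * social_score + w_financial * financial_score

print(behavioural_score(0.72, 0.85))  # e.g. 0.798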
The problem statement is to focus on the development of a scientific and method-
ological approach to determine a scoring mechanism to assess and score human
behaviour based on online social media activity and financial activity. Credit scoring
provides the banks with a base to determine whether the borrower or lender can
be trusted to give or grant loans. It will help and filter out trusted and non-trusted
sources. Big data sources are utilized to enhance statistical and economic models so
as to improve their performance. Using this data, it will provide more information
on the consumer which will help companies in making better decisions.
There has always been a system to punish people for their mistakes, and it does not necessarily set the right example for others. Instead, this work proposes building a system that rewards people for their good behaviour and financial expenditure. It would accomplish two things: firstly, it will inspire people to be well behaved on social media, and secondly, it helps them spend money wisely. After the availability of required data,
the system cleanses and analyses it on the basis of various attributes, and then sentiment analysis is performed to determine the sentiment of the data. Once this is done for all the data attributes, a normalized behavioural score is generated. The system then moves on to financial scoring, and the user is required to provide data like monthly
income, debt ratio, number of times defaulted, number of dependents, number of
open lines of credits, number of secured and unsecured lines of income, age and so
on.
2 Literature Survey
The financial capacity of the state and the statistical approach to assess it are considered in [1], which illustrates the development of specialized approaches to determine the extent and use of economic capacity and to identify current patterns of variation for development. This research is necessary because of the demand to validate an adequate level of national security, the ambiguity of methods to identify its modules, and the need to select the direction for the development of the state. A purely specialized method to assess the state's financial ability is incomplete, hence making it really hard to prepare steps for improving
the management of its components. This approach calls for comparing the values of
creation and use of the statistical data provided by the authorities like ‘the ratio of
the deficit/surplus of the state's budget to Gross Domestic Product (GDP)', etc. [2]. The study proposes a scientific method for an inclusive analysis of the financial potential of the state.
An individual's online behaviour can be precisely explained by personality
traits [3]. Therefore, understanding the personality of an individual helps in the
forecast of behaviours and choices. Several types of research have shown that the
user’s answers to a personality examination can help in predicting their behaviour. It
can be helpful in many different spheres of life from punctuality, job performance,
drug use, etc. [4]. The study was mainly focused on the Facebook profiles of users, as everyone uses Facebook to share their life events through Facebook profile features. For the purpose of the study, Facebook data of 32,500 US citizens who had been active on Facebook for the past 24 months or more were acquired. The data included the user's friend list, posted events, regular status updates, posts, images and groups. For analysis, the features were arranged according
to ‘number of months’ since when the user had joined Facebook. A simple linear
regression method was used for this purpose. Correlating the personality of the user
with that of Facebook profile features, ‘Spearman’s rank correlation’ was used to
calculate the correlations of the features. All of these were tested using t-distribution tests with significance at the < 0.01 level. Openness: people who are eager for new experiences and are open-minded tend to like more Facebook pages, have a higher number of status updates and belong to more chat groups. Conscientious users either join few groups or none at all, and their use of the like feature is less frequent; the research reveals that the average number of likes made by the most conscientious users is higher than 40% of the likes of the most spontaneous people. Extroversion means the individual interacts more with others on Facebook; such people like to share which events they are attending and what is happening in their life, and like to come into contact with more and more users.
Neuroticism and Facebook likes have a positive correlation, showing the
emotional users using ‘like’ features more. It has been found that 75% of normal
users like fewer than 150 pages or posts, but emotional people use the like feature
for 250 times or more [5]. The study is limited to a small or moderate population who volunteered to report their behaviours, and the research is restricted to US-based Facebook users only. Another issue is that Facebook users can be very selective in liking pages, which can differ from their personality.
The credit scoring system has been the main support system for the bankers,
lenders, NBFCs, etc., to perform statistical analysis about the creditworthiness of
the borrower [6]. The lenders thus based on this credit score decide whether to grant
credit to the borrower or not [7]. The article proposes a unique improvement in the
traditional credit scoring model by the inclusion of mobile phone data and social
network analysis results. Also, the article proves that adding call data records (CDR)
adds value to the model in terms of profit. Thus, a unique dataset is made by including
the call data record, account information related to the credit and debit made by the
borrower to make the credit scorecard [8]. Call data networks which are made from
call data records are used to differentiate the positive credit influencers from the
defaulters. Advanced social networks analytics techniques are used to achieve the
desired output. Hence, the results of the research show that the inclusion of a call data
record increases the efficiency of the traditional credit scoring model. The positive
aspect is that the traditional credit scoring models can be improved by adding call
logs. It is easier to forecast what all features are more essential for the prediction,
both in terms of profit and statistical performance. On the other side, the biggest
limitation is the dataset. It just generalizes the whole scorecard card. The lender
cannot differentiate whether the credit is for micro-loans or mortgages. The credit
scoring model used in the research does not include the behaviour feature which is
also an important parameter for analysing creditworthiness; behavioural data can be
obtained from telecom companies or social media platforms.
The subjective material well-being (S-MWB) covers a wide range of concepts
like financial satisfaction, stress due to financial insecurity, how the government is
handling the economy, etc. It can be a self-reported well-being gathered by a ques-
tionnaire [9]. It may consist of a range of economic issues like how the government is
handling the economy of the country, the rate of essentials, pay and margin benefits
from the job, and livelihood. It can also focus on particular dimensions of the phys-
ical life like contentment with the family financial situation as well as self-financial
situation, if the income is adequate or not, the standard of living is up to the expec-
tation or not, etc. The advantage here is that the study reveals several interesting facts that may have policy significance; understanding the precursors of subjective material well-being can help policy-makers design schemes to improve people's sense of material well-being as well as their quality of life. The drawback is that people compare themselves with people of their own country and not with those of other countries. The subjective well-being of a person is very much dependent
on the economy of the country.
Credit rating is a process to estimate the ability of an individual or organization to fulfil their financial commitments, based on previous dealings [10]. By
such a process, financial institutions classify borrowers for lending decisions by eval-
uating their financial and/or non-financial performances. This paper focuses on using
social media data to determine an organization’s credibility. It is done so because
many times a case may arise where the financial and non-financial assessments done
on the organization might not be accurate or cannot be trusted. For instance, there may be a case where false data is provided to the credit analysers so that a loan can be obtained easily. In these cases, using social media data can be very fruitful. It also undertakes
financial measures which have been implemented in our project as well. Therefore,
using multiple criteria for the credit rating approach can be very useful for accurate
information. A multiple criteria approach helps to identify the loan taker's credibility to a large extent, since there are various factors to distinguish. This approach not only tracks their financial and non-financial assets but also their social media data, which
can reveal important things about the company’s mind set and behaviour. The bene-
fits are the integration of social media data into credit rating. Analysts are provided
with more interpretation opportunities. The negative side is that credit ratings tend to decrease when social media data is considered. Gathering this much data might require a lot of permissions which need to be granted.
The work develops models to identify the scores of consumers with or without
social network data and how the scores change the operation [11]. A consumer might
be able to change the behaviour of social media to attain a higher score. This article
deals with the amount of change from the normal score a consumer can get by doing
accordingly. A credit score is required by consumers, so that people can apply for a
loan, to get their lender’s confidence which can extend its consumers credit based on
the score. Therefore, a high credit score is a must for all consumers. Consumers realize
that the score is based on social network data; people will try to be better citizens.
Given that consumers use credit for a variety of undertakings that affect social and
financial mobility, like purchasing a house, starting a business, or obtaining teaching,
credit scores have a substantial impact on access to opportunities and hence on social
inequality among citizens. Until recently, credit score was captivated with debt level,
credit history, and on-time payments as summarized by FICO. But now, an outsized
number of firms depend on network-based data to see consumer creditworthiness.
The positive aspect is the development of models to assess the change in credit
score and usage of network-based measures rather than consumer’s data, whereas
the negative part is developing these models which might be a tedious task, and there
is a possibility of the network provided data being fabricated.
A credit score system is presented in an article for ethical banking, micro-finance
institutions, or certain credit cooperatives [12]. It deals with the evaluation of the
social and financial aspects of the borrower. Social aspects include employment,
health, community impact and education. The financial aspects are mainly based on
banking details of the company, its account statements, loan history and repayment
of debts on time. Based on these financial aspects and social as well, this paper will
help to figure out companies who are actually trustworthy of borrowing or lending
money and hence be fruitful for the bank with which these people are associated. On the one hand, it provides a credit scoring system for ethical banking and identifies responsible leaders; on the other hand, there are security issues, since consumers' banking details are sensitive information.
The paper tries to show that compulsive buying and ill-being perception act as controlling factors on credit and debit card use and debt [13]. They act as repulsive forces against each other. A person with compulsive buying tendencies will tend to spend more and
more money which can lead to debt, whereas a person with a perception of ill-buying
may think of future financial problems and knows that unnecessary buying may lead
to debt will tend to spend money judiciously. There are several studies proving
this theory that compulsive buying encourages debt, while ill-buying perception
discourages debt [14]. But today, materialistic happiness is a dominant factor for people; hence, people tend to spend on materialistic goods. The research paper assumes multiple hypotheses: the urge to attain materialistic goals positively impacts compulsive buying; individuals with compulsive buying tendencies will overuse their credit cards; compulsive buying leads to debt; responsible use of credit cards has a negative effect on debt; ill-being perception leads to responsible use of credit cards; and ill-being perception discourages credit
card debt. The benefit is that the article is not gender biased, and it clearly shows how
a materialistic individual can lead to financial debt easily irrespective of the gender.
The drawback is that the research is unable to analyse an individual’s behaviour at
different time periods, and the credit analysis models lack robustness.
3 Proposed Methodology
The work aims to focus on the improvement of a logical and methodological way to
deal with deciding a scoring mechanism to assess and score human behaviour based
on online social media activity and financial activity. In essence, once fully implemented, every user would be given a score estimating their truthfulness, genuineness and integrity, and this score would then be a significant determinant in their lives, for example in whether they are able to get credit, lease a house, or purchase a ticket, or are given preferred access to hospitals, colleges and government services. The score takes a wide range of individual factors into account. It also resembles, yet goes further than, a range of systems intended to increase the prominence of reputation in transactions, on online platforms and in the 'sharing economy'.
Consequently, the focus here is on rating frameworks concerning individual people. The social aspects attempt to evaluate the loan's impact on the Millennium Development Goals, for instance employment, training, environment, well-being or community impact. The social credit rating model combines the bank's expertise and should likewise be coherent with its significance. Scoring based on financial aspects alone may cause the institution to
let a socially bad person get loans and other financial benefits. A socially bad person
may tend to be a defaulter or use financial benefits for unethical purposes. Therefore,
keeping this view in mind, a methodology has been proposed which will score a
person based on both social and financial aspects. The proposed system comprises
four major components, namely user, third-party companies/govt., social media data
pool and financial/bank data. The user is required to register with our system and
connect a social media account of the desired choice. Once the user’s social media
account is connected with this system, the user will be required to provide an ‘access
token' which the system utilizes to access the required data. After the required data is available, the system cleans it, analyses it based on various attributes, and then performs sentiment analysis to determine the sentiment of the data.
Once this is done for all the data attributes, a normalized behavioural score will be
generated. The system moves on to financial scoring, and the user will be required
to provide data like monthly income, debt ratio, number of times defaulted, number
of wards, amount of open lines of credits, number of secured and unsecured lines
of income, age, etc. The system will make use of machine learning models for both
behavioural and financial scoring. To perform behavioural scoring, the system will
request an external API and use a local model for financial scoring. To send this ML
model as a REST service, it adopts Flask. Furthermore, the system uses a WSGI-compliant application server along with the NGINX Web server. The trained model is deployed with Flask, and the model can be saved and loaded to make new predictions on data provided by clients/users.
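A minimal sketch of this deployment pattern is given below, assuming a model already saved to a file named score_model.pkl and a hypothetical /score endpoint; the actual routes and payload format of the authors' service may differ.

# Minimal Flask service exposing a saved scoring model (sketch; endpoint and
# file name are assumptions, not taken from the paper).
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("score_model.pkl")  # model trained and dumped beforehand

@app.route("/score", methods=["POST"])
def score():
    payload = request.get_json()
    features = [payload["features"]]              # one row of numeric attributes
    proba = model.predict_proba(features)[0][1]   # probability of the favourable class
    return jsonify({"financial_score": round(float(proba), 4)})

if __name__ == "__main__":
    # In production this would run behind a WSGI server (e.g. gunicorn) and NGINX.
    app.run(host="0.0.0.0", port=5000)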
Considering behavioural scoring, sentiment analysis is executed on the user’s
social data. Sentiment analysis classifies the data based on its sentiment and provides
a positive, negative, or a neutral score and a confidence value which is used to
generate a score. For financial analysis, the above-mentioned attributes are taken into
consideration. Also, a local model is built using a random forest classifier algorithm
to generate the score accordingly. The input dataset consists of social data obtained
from user’s Facebook and other social media accounts using various external APIs
like Facebook Graph API. It is the leading method for applications to analyse and
compare user data. Financial data would be provided by the user at the time of
generation of the score.
Once the user data is available, the retrieval of important information is carried
out to score a user, and unnecessary information can be removed. The parameters and features depend on the availability of the data and are subject to reliability checks. A MonkeyLearn classifier is utilized to perform sentiment analysis and obtain the sentiment of the user. Score aggregation and normalization involve combining the behavioural score and the financial score. In addition, techniques like weight of evidence (WOE) and information value (IV) are applied. This makes the system more reliable and efficient. Additional ranking based on time dependency is also performed. The final scores are communicated to the user with the scoring benchmark and reference. The behavioural and the financial score together determine the final score allotted to the user.
4 Empirical Analysis
The objective of this module is to classify and provide a score to a person based
on their social media activity. Behavioural scoring involves the collection of user
data from various social accounts, analysis of user data and final generation of the
score. Data is obtained from Facebook and Instagram using Graph API. This is done
by generating access tokens with required permissions. Inspection of parameters
like the post, quotes, likes and feed data is done. Behavioural scoring module uses
the user’s social data to perform sentiment analysis. The analysis is performed on
all the above parameters. The system will use an external API, MonkeyLearn and
compute the sentiment of each parameter. This provides the system with sentiment
value and confidence. The system applies weight of evidence (WOE) and allots time-weighted ranks for parameters such as posts and likes, because these are influenced by time and change dynamically.
Graph API provides various functionalities for applications to read and write
the Facebook community-based diagram. The API’s structure is made up of nodes,
edges and fields. Nodes are singular objects like user, picture and group. Edges are
the connections between the nodes. In simple words, the link between a group of
objects and a single object is an edge. Fields provide information regarding an object
like general information of a person. So the nodes are used to fetch metadata about
a particular object which are individual users in our system, use the connections
between nodes (Edges) to fetch groups of entities on an individual object, and use
fields to retrieve individual user’s general information which will be used as scoring
parameters to generate a score for the individual user. Graph API is HTTP-based, which makes it compatible with any language that has an HTTP library. This allows the Graph API to be used directly with the proposed system once the access tokens are provided by the user. Also, field parameters can be included in queries on nodes (individual users) to describe which categories or areas should be sent back with the response. For example, an immediate check of the admin node reference shows that one of the categories that can be fetched when accessing the admins entity is the name field, which is the name of the admin. Nodes are singular objects, each with a distinct ID, so to get information about a node one directly queries its ID.
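As an illustration, a node query over the Graph API's HTTP interface can be issued as below; the field list, API version and access token are placeholders, and the exact permissions needed depend on how the app is configured.

# Sketch of fetching a user's node and feed via the Facebook Graph API over HTTP.
# The fields requested and the API version are illustrative assumptions.
import requests

ACCESS_TOKEN = "<user-provided-access-token>"   # supplied by the user at registration
BASE_URL = "https://graph.facebook.com/v12.0"

def fetch_user_data(user_id: str) -> dict:
    params = {
        "fields": "id,name,posts{message,created_time},likes{name}",
        "access_token": ACCESS_TOKEN,
    }
    response = requests.get(f"{BASE_URL}/{user_id}", params=params, timeout=10)
    response.raise_for_status()
    return response.json()

# Example: query the authenticated user's own node.
# print(fetch_user_data("me"))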
In regards to MonkeyLearn API for sentiment analysis, an external API is applied
to perform sentiment analysis. This assists in categorizing and finding utilitarian
metadata from raw texts like electronic mails, online chats, and other media resources
like Web pages, online documents, tweets and more. The content can be categorized into formal groups or bins like emotion or subject, and specific data can be extracted to discern, separate, evaluate and study opinions and subjective data. An essential task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level, that is, whether the opinion expressed in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Next, the access token is utilized and the model ID
to call the MonkeyLearn API. Further, each column is iterated, and then, data is
classified as negative or positive. An attribute called ‘confidence’ is obtained. These
two attributes (‘senti-value’, ‘confidence’) with ‘ID’ are added in the CSV file which
was obtained in the second step as new attribute columns with their values. These
steps are involved to call the MonkeyLearn API that will initiate a post request. The
endpoint expects a JSON body. It should be an object with the data property and the
list of the data (texts) which need to be classified. The response consists of a list of
all the data with their response that is negative, positive, or neutral, and a confidence
value if the API call is successful.
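A sketch of such a classification request is shown below. The endpoint shape and response handling follow MonkeyLearn's public v3 REST API as commonly documented, but the model ID and token are placeholders and the exact fields should be checked against the current API reference.

# Sketch of posting texts to a MonkeyLearn sentiment classifier (v3 REST API).
# API_TOKEN and MODEL_ID are placeholders supplied by the account owner.
import requests

API_TOKEN = "<monkeylearn-api-token>"
MODEL_ID = "<sentiment-model-id>"
URL = f"https://api.monkeylearn.com/v3/classifiers/{MODEL_ID}/classify/"

def classify_texts(texts):
    response = requests.post(
        URL,
        headers={"Authorization": f"Token {API_TOKEN}"},
        json={"data": texts},  # the endpoint expects a JSON body with a "data" list
        timeout=10,
    )
    response.raise_for_status()
    results = []
    for item in response.json():
        top = item["classifications"][0]          # highest-confidence label
        results.append((top["tag_name"], top["confidence"]))
    return results

# Example: label two posts with a sentiment tag and a confidence value.
# print(classify_texts(["Great service!", "Delivery was late and support ignored me"]))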
Next is identifying the weight of evidence and assigning it to parameters. The weight of evidence (WoE) technique re-engineers the values of continuous and categorical predictor variables into individual bins and finally assigns to every bin a distinct weight of evidence value. As different users will have different parameters, it is used to allocate the score. WoE provides a weight based on the priority and usefulness of each parameter. This weight is assigned to the parameters, and the parameters with the highest weights are chosen for sentiment analysis. WoE can be used to compute
the 'strength' of a grouping in order to separate positive and negative defaults. It can also be written as the logarithm of the ratio of the spread of positives to the spread of negatives, where spread refers to the proportion of the bin's positives or negatives out of the total number of positives and negatives. Mathematically, the weight of evidence (WoE) value for a bin of observations can be computed as

WoE = ln(distribution of positives in the bin / distribution of negatives in the bin).
The WoE value will be zero if the ratio of the spread of positives to the spread of negatives equals one. If the distribution of negatives (bads) in a bin is greater than the distribution of positives (goods), the ratio will be less than one and the WoE will be a negative number; if the number of positives is greater than the number of negatives in a bin, the WoE value will be a positive (>0) number. From all the extracted features, the best features are identified that help to separate reliable borrowers from likely defaulters.
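The following sketch computes per-bin WoE values for a binary default label, following the ratio-of-distributions definition above. The column names and the toy data are hypothetical.

# Compute weight of evidence (WoE) per bin of a predictor against a binary target.
# Column names and the toy data are illustrative assumptions.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "income_bin": ["low", "low", "mid", "mid", "mid", "high", "high", "high"],
    "defaulted":  [1,     1,     1,     0,     0,     0,      0,      1],
})

grouped = df.groupby("income_bin")["defaulted"].agg(["count", "sum"])
grouped["negatives"] = grouped["sum"]                      # "bads": defaulted = 1
grouped["positives"] = grouped["count"] - grouped["sum"]   # "goods": no default
grouped["dist_pos"] = grouped["positives"] / grouped["positives"].sum()
grouped["dist_neg"] = grouped["negatives"] / grouped["negatives"].sum()
eps = 1e-6  # avoids division by zero / log of zero for empty bins
grouped["woe"] = np.log((grouped["dist_pos"] + eps) / (grouped["dist_neg"] + eps))
print(grouped[["dist_pos", "dist_neg", "woe"]])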
A random forest combines many decision trees so that the ensemble can get at least some insight and forecasting power. It also becomes really important that the decision trees, as well as the forecasts made by them, are uncorrelated or at least have a very low degree of similarity. While the algorithm itself, through feature randomness, tries to keep these correlations low, the features selected and the final parameters decided will ultimately impact the correlations as well. The two main reasons for utilizing a random forest are as follows. First, the predict_proba function of the random forest classifier can be used directly on the output to get a probability value in the range from zero to one. Second, it is extremely effortless to change the output of a random forest classifier into a simpler binary classification problem, which would further ease computation.
In regards to dataset description, the attributes which are taken into consideration
are as follows:
• Age: The age in years of the borrower.
• Debt ratio: The debt ratio is defined as the total monthly costs incurred by the borrower, such as living costs, monthly EMI payments or any other debt payments, divided by their gross monthly income.
• Monthly income: The gross monthly income of the borrower.
• Number of dependents: Total number of dependents in the family including
parents, children, wife, etc.
• The total number of unsecured lines of income: This may include personal lines
of loans, borrowed credit from friends or family, credit card balances, etc.
• The total number of secured lines of income: Secured lines of income refers to
real estate income, business or service income.
• Defaulted in first 30 days: Amount of times the debtor failed to pay in the first
thirty days.
• Defaulted between 30–59 days: Amount of times the debtor failed to pay between
30 to 59 days.
• Defaulted between 60–89 days: Amount of times the debtor failed to pay between
60 to 89 days.
• Defaulted after 90 days: Amount of times the debtor failed to pay after 90 days.
• Total number of loans: Total number of loans taken by the borrower.
For model development, the first step is detecting outliers. In statistics, outliers can be thought of as data points that differ greatly from the remaining part of the data. Outliers are abnormalities in the data, but it is imperative to understand their nature. They should be dropped only if it is clear that they were incorrectly entered or not properly measured, so that removing them does not change the
result. To detect the outliers, the interquartile range (IQR) method is applied, a set of mathematical formulae used to retrieve the outlier data. The IQR covers the middle half of the data and is a measure of statistical dispersion equal to the difference between the seventy-fifth and twenty-fifth percentiles. In other words, the IQR is the third quartile (the upper quartile) minus the first quartile (the lower quartile). These quartiles can be observed by plotting them on a box plot. The IQR is a measure of spread, like the standard deviation or variance, yet it is considerably more robust against outliers. The indexes of the detected outliers are recorded, and those entries are removed from the dataset. Dataset
cleansing is the process of removing data which is unfit for the training process.
This may include NaN values present in the dataset. A series of Python functions
are utilized for this purpose, the essential ones being qcut and get_dummies. Qcut is described
in the pandas documentation as a 'quantile-based discretization function'. This means that the
function divides the original data values into similar-sized bins. It defines the bins using
percentiles based on the distribution of the available data rather than on fixed numeric bin
edges. Values greater than six are binned together because the standard deviation of the chosen
data is extremely high. All NaN values present in the chosen dataset are replaced with the
median of the corresponding column.
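A minimal pandas sketch of the IQR rule, median imputation, and qcut binning described above is given below; the column names and thresholds are illustrative assumptions rather than the authors' exact choices.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"monthly_income": rng.lognormal(8, 1, 1000),
                   "age": rng.integers(21, 80, 1000)})

# IQR rule: flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
q1, q3 = df["monthly_income"].quantile([0.25, 0.75])
iqr = q3 - q1
mask = df["monthly_income"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df = df[mask]                                      # drop the flagged outlier rows

# Replace any remaining NaN values with the column median
df["monthly_income"] = df["monthly_income"].fillna(df["monthly_income"].median())

# Quantile-based discretization into similar-sized bins
df["income_band"] = pd.qcut(df["monthly_income"], q=5, labels=False)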
Get_dummies is a pandas function used to convert categorical variables into dummy/indicator
variables. When this function is applied to a column of categories, where there is one category
per observation, it produces a new column for each unique categorical value, and the value one
is placed in the column corresponding to the categorical value present for that observation.
As the number of samples increases, the accuracy and efficiency of the model also improve when
random forest is used in the further processing. After this, a final check is performed on the
dataset for any remaining NaN values. For model creation, the dataset is divided into testing
and training data, and the target value is separated from the training features.
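The following short sketch, with hypothetical column names, illustrates how get_dummies produces indicator columns and how the target is then separated and split into training and testing data.

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({"num_dependents": [0, 2, 1, 3],
                   "income_band": ["low", "mid", "mid", "high"],
                   "defaulted": [0, 1, 0, 1]})

# One indicator column per unique category; a 1 marks the category of each row
df = pd.get_dummies(df, columns=["income_band"])

X = df.drop(columns=["defaulted"])     # training features
y = df["defaulted"]                    # separated target value
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)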
Moving on to model fitting and accuracy, a random forest classifier is used for creating and
fitting the model, and a confusion matrix is used to evaluate its accuracy. The confusion matrix
gives an understanding of the behaviour of a classification model on test data for which the true
labels are known, and it allows the working of the algorithm to be visualized. Specific accuracy
measures, such as overall accuracy, sensitivity and specificity, can be determined from the
confusion matrix. These measures help in deciding whether to accept the model or not; taking
the cost of the errors into account is an imperative part of that decision. After this, the
accuracy can be calculated. The accuracy of the proposed model came out to be 80.78%.
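A small example of deriving such measures from a confusion matrix with scikit-learn is sketched below using toy labels; the 80.78% figure itself comes from the authors' dataset and is not reproduced here.

from sklearn.metrics import confusion_matrix, accuracy_score, recall_score, precision_score

y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("accuracy   :", accuracy_score(y_true, y_pred))
print("sensitivity:", recall_score(y_true, y_pred))      # true-positive rate
print("precision  :", precision_score(y_true, y_pred))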
The next step is the generation of scores. The model is loaded, and the data
values provided by the user are passed into the model to generate scores. But first,
the model is dumped into a package file. This is done using the Joblib library, which
provides a fast and robust persistence mechanism, especially for objects carrying large
amounts of data such as NumPy arrays. After loading the model, the predict_proba
function is used to generate the scores. This is an extremely significant function.
predict_proba gives the probabilities for the target classes in array form. The number of
probabilities returned for every row equals the number of classes, and the values are ordered as
the classes appear in the model; the related predict_log_proba returns the logarithm of these
probabilities. In other words, predict_proba(X) estimates, for each input sample, the probability
of each class, ordered by class index.
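A compact sketch of this dump-and-score workflow with Joblib, assuming a feature matrix is already available, could look as follows; the file name is hypothetical.

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

joblib.dump(model, "credit_model.joblib")      # persist the fitted model
loaded = joblib.load("credit_model.joblib")    # reload it at scoring time

# Probability of the positive class for user-supplied rows
scores = loaded.predict_proba(X[:5])[:, 1]
print(scores)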
Mathematical logic is then applied to normalize and combine the scores. There are
two aspects of the system, financial and behavioural, and equal weights are
given to both; another approach is to use a weighted average (Fig. 3).
To normalize the score accurately there are certain cases that need to be handled.
For instance, there can be cases like unavailability of financial or behavioural data,
categorization, ranking based on score and so on. In that case, there may be a necessity
to convert the scoring parameters. To address these issues, below are a set of problems
and solutions:
• Unavailability of behavioural data, hence no generation of the behavioural score.
• Similarly, in case of unavailability of financial data, the financial score cannot
be generated. So there can be a requirement to perform scoring on financial or
behavioural data alone.
• As the financial model generates the default probability, is it possible to transform
it into a financial scoring metric?
• How to categorize users based on score and to justify the lack of data, if any of
the above cases are encountered?
• To handle these cases and maintain the effectiveness of the system, the
following possibilities exist:
• As the financial model calculates the probability of a user defaulting for a certain
number of days (the default probability), the probability that a user does not default
can be calculated simply as 1 − P(user defaults), which therefore represents the
probability of good behaviour.
• There is not enough usable data for behavioural scoring, and yet it is significant
to score them.
To understand this clearly, suppose hypothetically that neither behavioural nor financial
data is available; then both the behavioural and financial scores will be zero, and the final
score will be calculated as,
• Score = (1 − 0 + 0) / 2 = 0.5
• The user will be put into a neutral category, neither good nor bad.
Suppose instead that behavioural data is unavailable, so the user has a behavioural
score of 0 and, say, a financial default probability of 0.5 (which corresponds to a bad
borrower); the score will then be calculated as,
• Score = (1 − 0.5 + 0) / 2 = 0.25
• This is a bad score at the lower end of the spectrum, considerably short of an ideal
position to be in.
When both financial and behavioural data are available, and the behavioural score and the
default probability are 0.5 each, the score will be given as,
• Score = (1 − 0.5 + 0.5) / 2 = 0.5
• This again falls into an acceptable category, since the user has a
good behavioural score but a bad financial score.
Finally, consider an extremely bad case: a behavioural score of −0.5 and a financial
default probability of 1, which means the user will certainly default. In this case, the
score will be calculated as,
• Score = (1 − 1 + (−0.5)) / 2 = −0.25
• This is extremely bad and hence extremely concerning.
On this basis, the scores are categorized as 0.75–1.0: Excellent, 0.5–0.75: Good, 0–0.5: Okay,
and −1.0 to 0: Concerning. Looking at all these cases, it can be seen that, with the data
obtained and the behavioural or financial score computed, the final score can be justified by
this logic; hence, the score stays consistent and can be justified in every case. One interesting
point is that the financial score ranges from 0 to 1 (never negative), while the behavioural
score ranges from −1 to 1. Therefore, if an equal-weighted average of both is taken, the combined
score ranges from −0.5 to 1.0, and it can easily be concluded that a negative aggregate score is
an extremely concerning one.
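The scoring logic above can be summarised in a short, illustrative Python helper; the formula and thresholds simply mirror the cases and categories stated in the text.

def combined_score(p_default, behavioural_score=0.0):
    """Equal-weight combination: ((1 - p_default) + behavioural) / 2."""
    return ((1.0 - p_default) + behavioural_score) / 2.0

def categorize(score):
    if score >= 0.75:
        return "Excellent"
    if score >= 0.5:
        return "Good"
    if score >= 0.0:
        return "Okay"
    return "Concerning"

# The worked cases from the text
print(combined_score(0.0, 0.0))    # 0.5  -> neutral
print(combined_score(0.5, 0.0))    # 0.25 -> lower end of the spectrum
print(combined_score(0.5, 0.5))    # 0.5
print(combined_score(1.0, -0.5))   # -0.25 -> Concerning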
The work demonstrates how behavioural scoring can be used to promote good
behaviour and identify good citizenship among the actors. This can be used to engage
and provide added incentives to good citizens to encourage good citizenship. The research work
portrays how a person's online activity on social media sites like Facebook and Twitter indicates
the nature and behaviour of the person. The factors included in the social scoring are the types
of posts shared, comments added, posts published, and pages followed and liked. These data are
plotted against time to obtain a social score. A financial scoring model then determines the
person's financial fitness and the likelihood of engaging in criminal activities due to financial
distress. Combining social scoring and financial scoring at specific weights provides a
behavioural score. This score classifies the subjects and helps identify good citizens among the
rest, who can then be engaged and given added incentives to enhance good citizenship. Many
compelling avenues remain open for future enhancement and exploration. Only certain specific
features have been used to predict the personality of the user, while a wide variety of features
was not explored, such as the specific type of group a user is a member of; a user can be
selective in liking any page, group, or public figure, so more sophisticated approaches can be
used to overcome this drawback. The analysis is also based only on the online behaviour of the
user, and a user can behave differently in the virtual environment and the real environment;
future work can address this limitation. Another direction for improvement is the study of
privacy-safeguard mechanisms to further enhance and secure online data.
References
12. Gutierrez-Nieto J, Begona SC, Carlos CC (2016) A credit score system for socially responsible
lending. J Bus Ethics 133:691–701
13. Bertran D, Echeverry MP (2019) The role of small debt in a large crisis: credit cards and the
Brazilian recession of 2014
14. Lee L, Qiu GM (2016) A friend like me: modeling network formation in a location-based social
network. SSRN Electron J 33(4):1008–33
An Optimized Method for Segmentation
and Classification of Apple Leaf Diseases
Based on Machine Learning
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 781
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_57
1 Introduction
The goal of this work is leaf disease detection and classification. Leaf disease identification
and classification are significant capabilities required by the agricultural industry. Cultivation
is a significant part of global wealth and provides food security. It has been noticed that plants
are widely infected by various diseases, which causes tremendous monetary losses in the
horticulture industry around the globe; the efficient identification and recognition of fruit
leaf diseases is therefore a vast and ongoing challenge in machine vision because of its
significant applications in farming.
In agribusiness, different types of fruit diseases exist which affect the production and quality
of fruits. Most of these diseases are diagnosed by an expert in the field based on their
symptoms, which can be costly because of the unavailability of experts and the higher expense
involved. In this respect, computing researchers, in a joint effort with agricultural specialists,
have proposed numerous algorithms for the automated identification of infections in plants and
fruits. The manual examination of fruit diseases is a troublesome procedure which can be reduced
by utilizing computerized strategies for the recognition of plant diseases at an earlier stage.
Hence, it is fundamental to build up an automated electronic framework for the recognition and
classification of leaf symptoms at the beginning stage. To satisfy this requirement, a
machine-based image processing method is suitable for recognizing and grouping leaf diseases.
The previous framework describes automated leaf disease defect detection from images. An
automated vision-based framework, which comprises an image acquisition system and an analysis
strategy, is utilized for identifying and classifying leaf diseases. The proposed analysis
strategy treats disease as abnormal regions, together with other effects that are related to the
classification of leaf diseases. In the defect detection process, several image processing
algorithms are utilized to extract image features and locate defect positions on leaf images.
The existing framework proposed a colour-based segmentation image processing algorithm for leaf
disease identification. The drawback of the existing system is that those strategies are not
fast and adaptive. The leaf ribs may cause undesirable mistakes in the classification of leaf
diseases, superior preprocessing and segmentation are absent, the yielded result is lower when
contrasted with our proposed system, and the accuracy of the system is low.
2 Literature Survey
The article describes automated leaf disease defect detection from images. In this paper, an
automated vision-based framework includes an image
results show that the proposed technique is a robust procedure for the detection
of plant leaf diseases.
The features differ in the kind of nonlinear post-processing applied to the local spectral
power. The features considered are Gabor energy, complex moments, and grating cell operator
features [7]. The capacity of the corresponding operators to produce distinct feature vector
clusters for different textures is studied using two methods, with the Fisher criterion used for
comparing classification results. The two methods give consistent results. The grating cell
operator gives the best discrimination and segmentation results. The texture detection
capabilities of the operators and their robustness to non-texture features are also examined.
The grating cell operator is the only one that responds exclusively to texture and suppresses
false responses to non-texture features such as object contours.
Concerning leaf disease detection, the discussion covers automated leaf disease detection, which
is an essential research subject as it would provide benefits in monitoring large fields of crops
and thereby recognizing symptoms of disease as soon as they appear on plant leaves [8]. The main
steps of disease recognition are image acquisition, image preprocessing, image segmentation,
feature extraction, and statistical analysis. The proposed work first applies filtering using the
median filter and converts the RGB image to the CIELAB colour space; the second step is
segmentation using the k-medoid method; the next stage masks the green pixels and removes the
masked green pixels; the following stage then computes texture feature statistics, and these
features are fed into the neural network. The neural network performs well and can effectively
recognize and classify the tested diseases.
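A rough sketch of such a pipeline in Python with OpenCV is given below; k-means is used here as a convenient stand-in for the k-medoid step, and the synthetic image merely stands in for a real leaf photograph.

import cv2
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(120, 120, 3), dtype=np.uint8)   # stand-in for a leaf photograph

img = cv2.medianBlur(img, 5)                       # step 1: median filtering
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)         # step 2: convert to the CIELAB colour space

# step 3: cluster the a*/b* chroma channels (k-means here as a stand-in for k-medoids)
ab = lab[:, :, 1:3].reshape(-1, 2).astype(np.float32)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(ab).reshape(lab.shape[:2])

# step 4: mask out the cluster with the greenest mean chroma (lowest a* = most green)
green = int(np.argmin([lab[labels == k, 1].mean() for k in range(3)]))
segmented = img.copy()
segmented[labels == green] = 0                     # keep only the non-green (diseased) regions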
Toward plant disease detection, serious natural disasters cause great harm to crop production.
Plant disease information of various kinds can be analysed, and a forecasting report can then be
produced. This research presents the structure of a forecasting framework and key improvements
for recognition based on a data mining plant disease framework [9]. Data mining functionality can
be divided into two types, description and prediction. Data mining is a kind of deep-level data
analysis: it can extract mass information from ordinary data, and this deeper analysis may
further improve the data asset. The limited information within the data management module's
input warehouse is made available according to the description, and the data mining then
supports the forecasting of the impact. The framework combines data management and reporting
structures to produce forecasts on a Windows system.
With regard to detection, images form noteworthy data and information in the natural sciences.
Plant diseases have become an issue as they can cause a significant decline in both the quality
and quantity of agricultural products [10]. Automated recognition of plant diseases is a
principal research topic as it would provide benefits in monitoring large fields of crops and
thereby recognizing the signs of infection as soon as they appear on plant leaves. The proposed
structure is a software solution
3 Proposed Methodology
The early identification of these symptoms is useful to improve the quality and production of
fruits. On the grounds of machine vision, it is an active research area to locate the lesion
spot, and several techniques have been presented for fruit disease identification through image
processing and AI algorithms. A great number of segmentation, feature extraction, and
classification methods have been proposed in the literature for fruit disease segmentation and
recognition, for example, a mix of feature-based symptom identification, colour-based
segmentation, correlation-based feature selection, improved weighted segmentation, texture
features, shape features, support vector machine (SVM), and so on.
The RGB model is an additive colour model in which red, green, and blue light are combined in
different proportions to reproduce a broad array of colours. The name of the model comes from
the initials of the three additive primary colours, namely R: Red, G: Green, and B: Blue. The
main purpose of the RGB colour model is the sensing, representation, and display of images in
electronic systems, for example, television screens and personal computers, though it has also
been used in conventional photography. Before the electronic age, the RGB colour model already
had a solid theory behind it, grounded in human perception of colours (Fig. 2).
Fig. 2 RGB
A + x = {α + x | α ∈ A} (1)
Note that, since the digital image is composed of pixels on an integer coordinate grid (Z²),
this places restrictions on the admissible translation vectors x. The fundamental Minkowski
operations, addition and subtraction, then become directly computable. Morphological filtering
procedures apply to grey-level images. The structuring elements are restricted to a limited
number of pixels, and the contours are simplified. However, the structuring element now has grey
values associated with each coordinate location, just as the image does. The details can be
found in Dougherty and Giardina.
The consequence is that the maximum filter and the minimum filter are grey-level dilation and
grey-level erosion for the particular structuring element given by the shape of the filter
window, with the grey value '0' inside the window. Morphological smoothing is based on the
observation that a grey-level opening smoothes a grey-valued image from above the surface of
brightness, and the grey-level closing smoothes it from below. The morphological gradient is
obtained where the gradient filter gives a vector representation; the version given here
approximates the magnitude of the gradient.
Fig. 3 Grayscale
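The grey-level morphological operators discussed here can be illustrated with OpenCV as follows; the structuring element size and the synthetic input are assumptions made for the sketch.

import cv2
import numpy as np

rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)   # stand-in for a grayscale leaf image
kernel = np.ones((5, 5), np.uint8)                             # flat structuring element

dilated = cv2.dilate(gray, kernel)                             # grey-level dilation (maximum filter)
eroded = cv2.erode(gray, kernel)                               # grey-level erosion (minimum filter)
gradient = cv2.morphologyEx(gray, cv2.MORPH_GRADIENT, kernel)  # dilation minus erosion
smoothed = cv2.morphologyEx(cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel),
                            cv2.MORPH_CLOSE, kernel)           # opening followed by closing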
A border-corrected mask is a filter mask. The masking principle is also known as spatial
filtering, and masking is sometimes simply called filtering. In this approach, the filtering
operation is performed directly on the image. A kernel, convolution matrix, or mask is a small
matrix used in image processing for blurring, sharpening, embossing, edge detection, and more;
this is accomplished by means of a convolution between the kernel and the image. The mask is
designed to locate specific features in an image, so that the issues or highlights of interest
can be discovered. The border-corrected mask is a mask in which all the features of an image
lying close to the edges are handled.
In machine vision, segmentation is the strategy of partitioning a digital image into various
segments. The objective of segmentation is to simplify or change the representation of an image
into something that is more meaningful and easier to analyse. Image segmentation is commonly
used to search for curves and blemishes in images. More precisely, image segmentation is the
procedure by which every pixel in an image is assigned a label, so that pixels with the same
label share certain attributes. The result of image segmentation is a set of segments which
together cover the entire image, or a collection of contours extracted from the image (see edge
detection). All the pixels in a region are similar, while neighbouring regions differ
significantly with respect to the same criteria. Using interpolation algorithms such as marching
cubes, the contours produced after segmentation can be used to create a three-dimensional
reconstruction when applied to a stack of images, which is the basis of medical imaging.
CCA is a well-known image processing technique which scans an image and groups pixels into
labelled segments based on pixel connectivity. An eight-connected CCA stage is performed to find
all objects produced from the former stage inside the binary image [11]. The output of this
stage is an array of N objects that gives an instance of that stage's input and output. The
proposed framework's main applications point to practical uses such as supporting early
detection, diagnosis, and appropriate treatment, and image segmentation plays a significant role
in numerous image processing applications. Finally, under low SNR conditions and other
difficulties, the open issues are managed through automation for effective and exact
segmentation.
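An illustrative eight-connected CCA step with OpenCV, assuming a binary lesion mask is already available, might look as follows; the synthetic blobs simply stand in for segmented lesions.

import cv2
import numpy as np

binary = np.zeros((100, 100), dtype=np.uint8)      # stand-in for a binary lesion mask
binary[10:30, 10:30] = 255                         # two artificial "lesion" blobs
binary[60:80, 50:90] = 255

num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(binary, connectivity=8)
for label in range(1, num_labels):                 # label 0 is the background
    print(f"object {label}: area = {stats[label, cv2.CC_STAT_AREA]} px, "
          f"centroid = {centroids[label]}")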
4 Empirical Analysis
4.1 Preprocessing
Preprocessing refers to operations on images at the lowest level of abstraction, where both the
input and the output are intensity images. Preprocessing is used to improve the data; it removes
distortions and undesirable aspects of the data and also enhances important features that are
required for further processing. Most image processing methods exploit redundancy in images:
pixels in the same neighbourhood of an image have similar luminosity values. The input data
therefore requires preprocessing techniques so that a correct analysis of the data can take
place; a corrupted pixel can often be restored from its neighbouring pixels, which can also be
applied for data analysis. This stage involves changing the size of the input data and converting
it to a grayscale picture by using different filters. Data cleaning is the process of finding,
removing, and replacing bad or missing data. Searching for local extrema and abrupt changes in
the data can help in identifying notable trends, and grouping methods are used to capture the
relationships between the various data points. The preprocessing applies certain methodologies
wherein all the input images are resized to the same dimensions; the output image is altered in
case the input data does not have the specified aspect ratio. Image filtering is a process to
enhance the input data; for example, an image can be filtered to highlight some aspects or
suppress others. Next, storing a single coloured pixel requires 24 bits, whereas a grayscale
pixel only requires 8 bits of storage. This gives a significant drop in the memory requirement
(by almost 67%), which is extremely useful. Grayscale conversion also reduces ambiguity by
mapping a 3D pixel value (RGB) to a 1D value, and most operations (e.g., edge detection,
morphological operations) gain little from the extra colour channels.
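A minimal preprocessing sketch covering resizing, grayscale conversion, and filtering is given below; the synthetic image and target size are assumptions, not the authors' settings.

import cv2
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)   # stand-in for an RGB leaf photograph

img = cv2.resize(img, (256, 256))                  # resize every input to the same dimensions
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)       # 24-bit colour pixel -> single 8-bit value
denoised = cv2.medianBlur(gray, 5)                 # simple filtering to suppress noise
print(gray.nbytes, "bytes vs", img.nbytes, "bytes")   # roughly a 67% memory reduction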
4.2 Segmentation
In the field of machine learning, after the sequence of processing applied to the input image is
recognized, feature extraction starts from the processed information and constructs derived
properties intended to be informative and non-redundant. The extraction of features is related
to dimensionality reduction. When the input to an algorithm is too extensive to process and is
considered to be redundant, it can be transformed into a reduced set of features. Determining a
subset of the initial features is called feature selection [12]. The selected features are
expected to contain the relevant information from the input data, so that the desired task can
be performed using this reduced representation. The extracted features comprise shape features,
colour features, geometrical features, and texture features.
To begin with, shape features comprise characteristics such as round objects or any other shape,
where the perimeter boundary of the object, along with its diameter, is used as a descriptor.
Next are colour features, where the colour and texture histograms and the colour structure of
the whole picture form the global descriptors, while colour, texture, and shape features
computed for sub-images, regions, and points of interest provide local descriptors; these image
descriptors are then used to match and retrieve images. Then there are geometrical features, in
which the geometric characteristics of objects consisting of a sequence of geometric elements
such as points, lines, curves, or surfaces are essential; such characteristics may be corner
features, edge features, circles, ridges, a prominent point of image texture, and so on. Finally,
the texture features of an image are a set of computed parameters that quantify the perceived
arrangement of the image texture, giving information about the spatial pattern of colour or
intensity in the image or in a specific region of the image (Fig. 5).
Here, feature extraction methods such as the gray-level co-occurrence matrix (GLCM), local
binary pattern (LBP), region segmentation, and a genetic algorithm are utilized. GLCM gives the
texture features of the input data, such as contrast, correlation, energy, etc. Then, LBP gives
the various shape features of the input image. After that, a genetic algorithm is applied to
choose the best attributes to distinguish the different diseases that may occur in the leaf. The
region-properties segmentation is utilized to obtain the mathematical features of the input
image, such as density, area, and so on. From all the above, the extracted features that are
identified serve as the best features for differentiating the various leaf diseases.
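An illustrative extraction of GLCM statistics and an LBP histogram with scikit-image is sketched below on a synthetic grayscale patch; the parameter values are assumptions, not the authors' settings.

import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)   # stand-in for a preprocessed leaf image

# GLCM texture statistics: contrast, correlation, energy, homogeneity
glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256, symmetric=True, normed=True)
features = {prop: graycoprops(glcm, prop)[0, 0]
            for prop in ("contrast", "correlation", "energy", "homogeneity")}

# LBP histogram as a compact local texture/shape descriptor
lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
print(features, hist)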
4.4 Classification
The methodology of extracting information classes from a set of images is called image
classification. The resulting raster from image classification can be used to create thematic
maps. An image classification toolbar provides a convenient environment for classification and
multivariate analysis.
In this research, an enhanced automated computer-based strategy is designed and validated for the
recognition of disease. It comprises lesion contrast stretching, lesion segmentation, and
prominent feature selection and recognition steps. The contrast of the infected spot is enhanced,
and segmentation is performed by the proposed technique. The performance of the proposed
technique is further improved by region segmentation. At that point, numerous features are
extracted and fused by utilizing a parallel strategy. A genetic algorithm is used to choose the
best features, which are later used by KNN for classification. In the future, the proposed
methodology can be combined with other, yet-to-be-developed methods such as texture analysis and
classification; this helps in determining the stages of the disease and will be of great help, as
the system is not dependent on the disease. The proposed system can also be greatly enhanced to
identify diseases that do not originate at the leaves but at different parts of the plant. Sudden
death syndrome (SDS) could also be integrated into our module, but due to the lack of a proper
dataset at present, it could not be incorporated into this work. Another advancement of the work
could be to detect and identify the different ways in which pests affect the plants, as each pest
has a different way of attacking them. Finally, one major upgrade could be to identify what kind
of nutrient deficiency the plant is facing that leads to these diseases, so that proper care can
be taken of the plant.
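As a hedged sketch of the feature-selection-plus-KNN classification stage, the fragment below uses a simple statistical selector in place of the genetic algorithm and synthetic feature vectors in place of the fused leaf features.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic feature vectors standing in for the fused GLCM/LBP/region features
X, y = make_classification(n_samples=300, n_features=40, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Statistical feature selection used here as a simple stand-in for the genetic algorithm
selector = SelectKBest(f_classif, k=10).fit(X_train, y_train)
knn = KNeighborsClassifier(n_neighbors=5).fit(selector.transform(X_train), y_train)
print("accuracy:", accuracy_score(y_test, knn.predict(selector.transform(X_test))))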
References
1. Rozario LJ, Rahman T, Uddin MS (2016) Segmentation of the region of defects in fruits and
vegetables. Int J Comput Sci Inf Secur 14(5)
2. Chuanlei Z, Shanwen Z, Jucheng Y, Yancui S, Jia C (2017) Apple leaf disease identification
using genetic algorithm and correlation based feature selection method. Int J Agric Biol Eng
10(2):74–83
3. Sharif MY, Khan MA, Iqbal Z, Azam MF, Lali MI, Javed MY (2018) Detection and classifi-
cation of citrus diseases in agriculture based on optimized weighted segmentation and feature
selection. Comput Electron Agric 150:220–234
4. Sapkal AT, Kulkarni UV (2018) Comparative study of leaf disease diagnosis system using
texture features and deep learning features. Int J Appl Eng Res 13(19):14334–14340
5. AlShahrani AM, Al-Abadi MA, Al-Malki AS, Ashour AS, Dey N (2018) Automated system
for crops recognition and classification. Comput Vis Concepts Method Tools 1208–1223
6. Gavhale KR, Gawande U (2014) An overview of the research on plant leaves disease detection
using image processing techniques. J Comput Eng 16(1):10–16
7. Camargo A, Smith JS (2009) An image-processing based algorithm to automatically identify
plant disease visual symptoms. Biosyst Eng 102(1):9–21
8. Zhang S, Wu X, You Z, Zhang L (2017) Leaf image based cucumber disease recognition using
sparse representation classification. Comput Electron Agric 134:135–141
9. Ferentinos KP (2018) Deep learning models for plant disease detection and diagnosis. Comput
Electron Agric 145:311–318
10. Shuaibu M, Lee WS, Hong YK, Kim S (2017) Detection of apple marssonina blotch disease
using particle swarm optimization. Trans ASABE 60(2):303–312
11. Kamilaris A, Prenafeta-Boldu FX (2018) Deep learning in agriculture: a survey. Comput
Electron Agric 147:70–90
12. Gu Y, Cheng S, Jin R (2018) Feature selection for high-dimensional classification using a
competitive swarm optimizer. Soft Comput 811–822
A Thorough Analysis of Machine
Learning and Deep Learning Methods
for Crime Data Analysis
Abstract The analysts belonging to the police forces are responsible for exposing the
complexities found in data, to help the operational staff in nabbing the criminals and
guiding strategies of crime prevention. But this task is made extremely complicated
due to the innumerable crimes that take place and the knowledge levels of present-day
offenders. Crime is one of the omnipresent and worrying aspects concerning
society, and preventing it is an important task. Examination of crime is a systematic
means of detection as well as examination of crime patterns and trends. The work
involving the data includes two important aspects, analysis of crime and prediction of
perpetrator identity, and analysis of crime has a significant role to play in both steps.
Analysis of the crime data can be of massive help in the prediction and resolution of
crimes from a futuristic perspective. To address this issue in the police field, the crime
rate must be predicted with the help of AI (machine learning) approaches and deep
learning techniques. The objective of this review is to examine the AI approaches
and deep learning methods for prediction of the crime rate that yield superior accuracy,
and this review article also explores the suitability of data approaches in the attempts
made toward crime prediction, with specific emphasis on the dataset. This review
evaluates the advantages and drawbacks faced in crime data analysis. The article
provides extensive guidance on evaluating how model parameters relate to performance
in terms of crime rate prediction by carrying out comparisons ranging from deep
learning to machine learning algorithms.
Index Terms Big data analytics (BDA) · Support vector machine (SVM) ·
Artificial neural networks (ANNs) · K-means algorithm · Naïve Bayes
J. Jeyaboopathiraja (B)
Research Scholar, Department of Computer Science, Sri Ramakrishna College of Arts and
Science, Coimbatore, India
e-mail: jeyaboopathi@gmail.com
G. Maria Priscilla
Professor and Head, Department of Computer Science, Sri Ramakrishna College of Arts and
Science, Coimbatore, India
e-mail: mariajerryin76@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 795
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_58
1 Introduction
Recently, big data analytics (BDA) has evolved into a prominent technique for data analysis and
extraction, with relevance in an extensive array of application fields. Big data involves the
accessibility of an extreme amount of data that tends to be hard to store, process, and mine
with the help of a classical database, fundamentally because the data available is massive,
complicated, unorganized, and quickly varying. This is the critical basis behind the idea of big
data, which was initially encouraged by online companies such as Google, eBay, Facebook,
LinkedIn, etc. The term 'big data' indicates a digital repository of information having an
enormous volume, velocity, and diversity. Analytics in big data refers to the software
development process to unravel the trends, patterns, associations, or other meaningful
perspectives in such enormous amounts of information [1]. Big data has an inevitable part to
play in different domains, like agriculture, banking, data mining, education, chemistry,
finance, cloud computing, marketing, and healthcare stocks [2].
Owing to consistent urbanization and a rising population, society has become city-centric. But
an increasing number of savage crimes and accidents have also accompanied these developments. To
deal with these problems, sociologists, analysts, and protection organizations have dedicated
many endeavours toward the mining of important patterns and factors. The development of 'big
data', necessitating new techniques toward the well-organized and precise analysis of the rising
volumes of criminal data, has remained a critical problem for every law enforcement and
intelligence-gathering foundation. Crime has increased multi-fold over the passage of time, and
criminals have begun using the latest trends in technology not just for committing offences but
also for escaping prosecution. Crime is no longer confined to the boulevards and back alleys of
neighbourhood places. Also, the Internet, which acts as a connecting bridge for the whole world,
has recently thrived as a field for the more crooked-minded criminals. In the wake of barbaric
acts like the 9/11 terrorist assaults and the exploitation of technology for hacking the most
protected defence databases, novel and efficient techniques of crime prevention have gained
rising significance [3].
Data mining is regarded as a potential tool with remarkable capability to aid criminal examiners
in highlighting the most vital information concealed in the crime 'big data'. Mining the data as
a tool for crime investigation is identified as a relatively novel and popular research domain.
In addition to the improving usage of computerized systems for crime tracking, computer data
analysts have begun assisting law enforcement officers and detectives not just to accelerate the
process of resolving crimes [4], but also for the advance prediction of crimes. The improving
accessibility of big data has also influenced the importance placed on applications involving
several data mining approaches, and their simplicity of use by people with no skills in data
analysis and no knowledge of statistics.
The task of analyzing this extremely high amount of data, along with its intrinsic drawbacks,
without computational assistance imposes manual pressure [5]. This review investigates modern
data mining techniques which are utilized for the prediction of crime and criminality. Diverse
classification algorithms like Naïve Bayes, decision tree (DT), back propagation (BP), support
vector machine (SVM), and deep learning techniques have been utilized for the prediction of the
'crime category' for differentiation. This technical work provides details on the different data
mining classifiers employed for crime data prediction and also studies the advantages and
drawbacks of different data mining approaches on crime data.
2 Literature Review
Data mining techniques are utilized for the detection as well as the prevention of crime.
Classical classification approaches concentrate on both organized and unorganized data for
pattern detection. The evolution of big crime data has made many of the available systems employ
an ensemble of data mining approaches to obtain exact and accurate predictions. Crime analysis
can range over an extensive array of criminal activities, starting from simple mistreatment of
public duties to crimes that are pre-arranged at an international level [1]. This section reviews
the classical data mining techniques and deep learning techniques in crime data analysis.
crime. This research work describes the problems that come up during the analysis,
which need to be eliminated to obtain the required result.
Pramanik et al. [7] studied a framework for big data analytics which investigates four aspects
of criminal networks: network extraction, subgroup detection, association pattern discovery, and
central member identification. Big data sources, transformations, platforms, and tools are
integrated to render four significant functions, which exhibit a potential correlation with the
two dimensions of SNA. Also, social network analysis (SNA) is a well-known and proactive
technique to unravel previously unknown structural patterns from criminal networks. SNA was
identified to be a resourceful and effective mechanism for analyzing criminal organizations and
firms. With the constant introduction of modern platforms of big data analytics, tools, and
approaches, the years to come will witness the broad deployment and usage of big data across
defense and law enforcement organizations.
Jha et al. [8] presented how data analytical techniques based on big data can help prevent
crimes. Various approaches to data collection have been studied, comprising volunteered
geographic information (VGI) along with geographic information system (GIS) and Web 2.0. The
final stage involves forecasting that depends on the gathered data and the investigation. Big
data is regarded as a suitable framework for crime data analysis since it renders better
throughput and fault resilience, helps in the analysis of massive datasets, runs on commodity
hardware, and produces trustworthy outcomes, while the Naïve Bayes machine learning algorithm
can predict better utilizing the available dataset.
Nadathur et al. [9] provided a comprehensive overview of crime incidences and their relevance in
the literature through a combination of techniques. This review work, as the first step, analyzes
and detects the features of crime occurrences, introducing a schema for combinatorial incident
description. The newly introduced schema tries to find a method for the systematic merging of
various elements or crime features, and the system provides a database with much better
throughput and lower maintenance expenditure by applying Hadoop tools with HDFS and map-reduce
programs. Besides, an elaborate list of crime-associated violations is presented. This
facilitates a clear interpretation of the repetitive and underlying criminal actions. This
review work tries to help experts and law enforcement officers in finding patterns and trends,
rendering forecasts, discovering associations, and identifying probable explanations.
ToppiReddy et al. [10] studied different visualization approaches and AI algorithms which are
used for predicting the distribution of crime over a region. First, the raw datasets are
processed and then visualized as per the requirement. KNN is a technique employed for
classification purposes: the object is classified by a majority vote of its neighbours, and the
presumed object belongs to the class that is most common among its k-nearest neighbours. Naïve
Bayes depends on the Bayes theorem, which defines the likelihood of an occurrence from prior
knowledge of conditions relevant to the event. Then, AI was utilized for information extraction
from these massive datasets, finding the concealed associations amid the data, which in turn is
further utilized for reporting as well as finding patterns in the crime that are helpful for
crime analysts in their analysis
for which the Naïve Bayes method is used. It is found that Naïve Bayes yields much better
accuracy in comparison with KNN for crime prediction.
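A small, illustrative comparison of Naïve Bayes and KNN on synthetic data (standing in for an encoded crime dataset) is sketched below; it demonstrates the experimental setup only, not the reported result.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for an encoded crime dataset (time, location, category features)
X, y = make_classification(n_samples=2000, n_features=12, n_classes=3,
                           n_informative=6, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=7)

for name, model in [("Naive Bayes", GaussianNB()),
                    ("KNN", KNeighborsClassifier(n_neighbors=5))]:
    model.fit(X_train, y_train)
    print(name, "accuracy:", accuracy_score(y_test, model.predict(X_test)))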
Deepika and SmithaVinod [15] designed a technique for crime detection in India that employs data
mining approaches. The mechanism comprises steps such as data preprocessing, clustering,
classification, and visualization. The field of criminology studies different crime features.
Clustering through K-means helps in the detection of crime, and the groups are created depending
on the resemblance found in the crime characteristics. The random forest algorithm and neural
networks are used for data classification. Visualization is performed employing Google marker
clustering, and the crime hotspots are plotted on the map of India. The WEKA tool helps in
validating the accuracy.
Dhaktode et al. [16] presented a data mining approach that is employed for analyzing, examining,
and verifying the patterns in crimes. A clustering technique is enforced for the analysis of
crime data, and the stored data is clustered employing the K-means algorithm. Once the
classification and clustering are performed, a crime can be predicted from its past data. This
newly introduced system can specify areas having a greater probability of crime and different
regions having a greater crime rate.
Jain et al. [17] designed a systematic approach for crime analysis and prevention to spot and
investigate the patterns and trends found in crime. This system can help to predict the areas
having a greater probability of crime incidences and can help to visualize the crime-vulnerable
hotspots. The growing usage of computerized systems is of much aid to crime data analysts in
helping law enforcement officials to solve crimes faster. The K-means algorithm is performed by
dividing the data into groups according to their means; a modification of this algorithm, known
as the expectation–maximization algorithm, partitions the data based on their parameters. This
data mining framework is easy to implement, operates jointly with the geospatial plotting of
crime, and increases the efficiency of detectives and other law enforcement officials.
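An illustrative K-means hotspot sketch on synthetic incident coordinates, in the spirit of the approaches above, is given below; the coordinates and cluster count are assumptions.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
# Synthetic latitude/longitude pairs standing in for geocoded crime incidents
coords = np.vstack([rng.normal(loc, 0.05, size=(200, 2))
                    for loc in ((28.61, 77.21), (19.07, 72.88), (13.08, 80.27))])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=3).fit(coords)
print("hotspot centres:\n", kmeans.cluster_centers_)      # candidate crime hotspots
print("incidents per cluster:", np.bincount(kmeans.labels_))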
Sukanya et al. [18] worked on the analysis of criminals' data, where grouping and classification
approaches are utilized. These data are accumulated in the criminals' repository. Spatial
clustering algorithms and structured crime classification are utilized for the classification of
the crimes. These algorithms are useful in identifying the spots of crime occurrences. The
identification of the criminals is done based on the eyewitness or clue present at the location
of the crime. Identifying the hotspots of criminal occurrences will be valuable to the police
forces to improve the security of the specific region, and this will reduce crimes to a much
better extent in the future. After the application of this concept to every area, criminal
activities can be reduced to the maximum extent possible, although crimes cannot be controlled
entirely.
Ladeira et al. [19] presented data preprocessing, transformation, and mining approaches to find
the crime details hidden in the dataset by associating similar records. Subsequently, the
criminal records are categorized into three groups considering the complexity of the criminal
action: A (low sophistication), B (medium sophistication), or C (high sophistication). To know
the effect of applying and not applying preprocessing approaches, and to find the data mining
approaches that attain the best outcomes, two experiments were carried out and their mean
accuracies were compared. The application of preprocessing and the random forest algorithm
produced superior results and also showed the potential of handling high-dimensional and dynamic
data. As a result, an ensemble of these approaches can yield better information for the police
department. The inference of data mining methods for crime data is shown in Table 1.
Keyvanpour et al. [20] designed data mining approaches supported by a multi-use framework for
investigating crimes intelligently. The framework uses a systematic technique employing a
self-organizing map (SOM) and multilayer perceptron (MLP) neural networks for the grouping and
classification of crime data. Design aspects and problems of employing hierarchical/partitional
clustering approaches for clustering the crime data are also considered.
Lin et al. [21] studied the idea of a crime situation in framework-based crime forecast modelling
and characterized a set of spatio-temporal features that rely upon 84 types of demographic data,
applying the Google Places API to theft data for Taoyuan City, Taiwan. Deep neural networks
formed the best model, performing better than the well-known random decision forest, support
vector machine, and k-nearest neighbour algorithms. Experiments show the significance of the
geographic feature design in increasing performance and descriptive capability. Also, testing
for crime displacement reveals that this design outperforms the baseline format.
Feng et al. [22] applied data analysis to criminal data in San Francisco, Chicago, and
Philadelphia. First, the time series of the data is explored, and crime trends for the coming
years are forecast. After this, with the crime category predicted and the time and location
given, compound classes are combined into bigger classes to overcome the problem of class
imbalance, and feature selection is carried out for accuracy improvement. Multiple
state-of-the-art data mining approaches, specially applied for forecasting crime, are presented.
The experimental results reveal that the tree-based classification models outperformed KNN and
Naïve Bayes techniques on this classification task, while Holt–Winters with multiplicative
seasonality yields superior results in forecasting crime trends. These results will be
advantageous to police forces and law enforcement in solving crimes faster and rendering cues
which can help them in tackling crime, forecasting the probability of occurrences, efficiently
exploiting assets, and formulating quicker decisions.
Chauhan and Aluvalu [23] noted that in this emerging technological field, cyber-crimes are
increasing at an exponential rate and pose quite a challenge to the skills of investigators.
Also, the data on crime is growing enormously and is generally in digital format, so the
generated data cannot be managed efficiently employing classical analysis approaches. Rather
than applying conventional data analysis mechanisms, it would be advantageous to employ big data
analytics for this
Table 1 (continued)
7. Yu [13]. Technique: data mining classification techniques. Benefits: best prediction technique to attain the most consistent outcomes. Drawbacks: does not provide good support for real-time applications; future work has to incorporate motor vehicle theft-based crime.
8. Jangra [14]. Technique: Naïve Bayes. Benefits: enhances the accuracy of the crime prediction approach. Drawbacks: computation time is excessive for a few classifiers; concurrent techniques are required for reducing the classification time.
9. Deepika [15]. Technique: K-means clustering, random forest algorithm, and neural networks. Benefits: advantageous for the crime department of India in the analysis of criminal activities, with superior forecasting. Drawbacks: cannot predict the time of occurrence of the crime.
10. Jain [17]. Technique: K-means clustering. Benefits: helps to increase the efficiency of the officer and other law enforcement officials. Drawbacks: (1) hard to predict the K-value; (2) performance is not good with the global cluster.
11. Sukanya [18]. Technique: clustering and classification technique. Benefits: the hotspots of criminal activities and the criminals are identified employing clustering and classification algorithms. Drawbacks: real-time prediction is slow, hard to implement, and complicated.
12. Ladeira [19]. Technique: data preprocessing, transformation, and mining techniques. Benefits: application of preprocessing techniques and data mining approaches yields superior results. Drawbacks: the prediction process employing random forests consumes more time compared to other algorithms.
Lin et al. [28] applied a deep learning algorithm, which has found extensive application in
various fields such as image recognition and natural language processing. The deep learning
algorithm yields superior prediction results compared to other methodologies like random forest
and Naïve Bayes for probable crime locations. Also, the model performance is improved by
collecting data with diverse time scales. For validating the experimental results, the probable
crime spots are visualized on a map, and it is inferred whether the models can find the real
hotspots.
Krishnan et al. [29] formulated an artificial neural network model, which replaces traditional
data mining approaches in a better manner. In this analysis, the prediction of crime is done
with the help of recurrent long short-term memory (LSTM) networks. An available organized
dataset is helpful in the prediction of the crimes. The data is divided into training and
testing sets, and both go through the modelling process. The resulting training and testing
predictions are then compared with the real crime counts and visualized.
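A minimal LSTM sketch for crime-count forecasting with Keras, on synthetic monthly counts, is shown below; it illustrates the recurrent setup only and is not the authors' model.

import numpy as np
import tensorflow as tf

# Synthetic monthly crime counts standing in for an organized crime dataset
rng = np.random.default_rng(1)
counts = (100 + 10 * np.sin(np.arange(120) * 2 * np.pi / 12) + rng.normal(0, 3, 120)).astype("float32")

def make_windows(series, window=12):
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., None], y   # shape: (samples, timesteps, 1 feature)

X, y = make_windows(counts)
split = int(0.8 * len(X))

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(12, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:split], y[:split], epochs=20, verbose=0)

pred = model.predict(X[split:], verbose=0).ravel()
print("predicted vs actual (first 3):", list(zip(pred[:3].round(1), y[split:][:3])))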
Gosavi and Kavathekar [30] examined data mining approaches used in the detection and prediction
of crimes, employing association rule mining, k-means clustering, decision trees, Naïve Bayes,
and machine learning approaches such as deep neural networks and artificial neural networks. The
inferences from this survey were that if the dataset instances contain a large number of missing
values, then preprocessing is an important task, and that crimes do not happen consistently
across urban locations but are concentrated in particular regions. Therefore, the prediction of
crime hotspots is an essential task, and the usage of post-processing will be of massive help in
reducing the crime occurrence rate.
Wang et al. [31] designed the benchmarked deep learning spatio-temporal predictor, ST-ResNet,
for aggregated prediction of the distribution of crime. The model consists of two steps. The
first performs preprocessing of the raw crime data; this comprises regularization in both space
and time to enhance the predictable signals. Secondly, hierarchical architectures of residual
convolutional units are adapted for training multifactor crime prediction models.
Mowafy et al. [32] showed that criminology is a critical field in which text mining approaches
have a significant part to play for law enforcement officers and crime analysts to support
investigations and speed up the resolution of crimes. A common architecture for a crime
extraction procedure is presented, which combines text extraction with the investigation of the
criminal procedure for forecasting the type of crime, using text classification on the
unorganized data in police incident reports, regarded as a segment of criminal behaviour
analysis.
Ivan et al. [33] recommended an approach, called business intelligence, which is based on
supervised learning (classification) approaches, provided that labelled training data are
available. The comparison of four varied classification algorithms, including decision tree
(J48), Naïve Bayes, multilayer perceptron, and support vector machine, was carried out to get
the most efficient algorithm for forecasting crimes. The study employed classification models
created with the help of the Waikato Environment for Knowledge Analysis (WEKA). Decision tree
(J48) outperformed the Naïve Bayes, multilayer perceptron, and support vector machine (SVM)
algorithms and exhibited much better performance both in terms of execution time and accuracy.
The inference of deep learning methods for crime data is shown in Table 2.
In the criminology literature, the association between crime and different features has been
rigorously analyzed, where common instances are historical crime records, unemployment rate, and
spatial similarity. This literature review depicts the conceptualization of predictive policing
and its claimed advantages and disadvantages. The research reveals a gap between the substantial
focus on potential benefits and disadvantages of predictive policing in the literature and the
available empirical proof. The empirical evidence yields very limited support for the claimed
advantages of predictive policing: while a few empirical studies show that predictive policing
mechanisms result in a reduction in crime, others show no influence; concurrently, no empirical
proof exists at all for the claimed disadvantages. With the rising advent of computerized
systems, crime data analysts can prove to be of massive help to law enforcement executives in
accelerating the practice of resolving crime. Employing data extraction and statistical
approaches, novel algorithms and systems have been designed alongside new sorts of information.
The impact of AI and statistical methodologies on crime or other large data applications like
auto collisions or time series data will encourage the investigation, extraction, and
interpretation of significant patterns and trends, subsequently helping in the prevention and
control of criminal activities. In comparison with deep learning algorithms, machine learning
algorithms are bogged down by a few challenges. Despite all its benefits, potential, and
ubiquity, machine learning is just not exact. The factors below are its limitations:
Machine learning needs huge datasets for training, and these must be comprehensive, impartial,
and of good quality. It can also encounter circumstances where it has to wait for the generation
of new information.
ML requires sufficient time to allow the algorithms to be trained and to mature sufficiently to
satisfy their objective with a reasonable amount of accuracy and relevance. Also, enormous
resources are required for its functioning, which can imply more computing resources.
Table 2 (continued)
6. Shermila [27]. Technique: multilinear regression, K-neighbors classifier, and neural networks. Benefits: the system describes the offender predictively through multilinear regression, the K-neighbors classifier, and neural networks. Drawbacks: the KNN algorithm is not an active learner; it does not use the training data to learn anything and uses it only for classification.
7. Lin [28]. Technique: deep learning algorithm. Benefits: machine learning technique developed to yield increased prediction of future crime hotspot locations, with results verified by real crime data. Drawbacks: crime incidence prediction requires finding highly nonlinear correlations, redundancies, and dependencies between numerous datasets.
8. Krishnan [29]. Technique: neural network. Benefits: crime prediction is done with recurrent LSTM networks. Drawbacks: undefined behaviour of the network, hardship in presenting the problem to the network, and the training duration of the network is unpredictable.
9. Shruti S. Gosavi [30]. Technique: association rule mining, k-means clustering, decision trees, Naive Bayes, and machine learning techniques. Benefits: prediction of crime hotspots is a highly significant task, and the usage of post-processing will aid in reducing the rate of crimes. Drawbacks: crime does not happen consistently across urban regions but is concentrated in particular regions.
10. Wang [31]. Technique: CNNs and ST-ResNet. Benefits: in the ST-ResNet framework, the past dependencies have to be fixed in an explicit manner, and much longer explicit dependencies make the network highly complicated and hard to train. Drawbacks: every training step takes a much longer time.
One more important challenge is the capability to accurately interpret the results that
the algorithm generates. The algorithms must also be chosen carefully.
4 Solution
Using deep learning and neural networks, a novel representation has been designed
for the prediction of crime incidence [34]. Since deep learning [35] and artificial
intelligence [36] have achieved many successes in computer vision, they have found
application in big data analytics for trend prediction and categorization. Big data
analytics offers the capability to transform how law enforcement departments and
intelligence and security agencies extract essential information (e.g., criminal
networks) from various data sources in real time to corroborate their investigations.
Also, deep learning can be described as a cascade of layers, where every succeeding
layer takes the output signal of the earlier layer as its input. This feature, among
several others, yields a number of benefits when used for resolving different problems.
Traditional machine learning assumes that humans design the features, and this strategy
consumes quite a lot of time. Deep learning has the capability of creating new features
from the limited number of them present in the learning dataset. The implication is that
deep learning algorithms can generate novel features to attain the current goals.
Deep learning methods are much easier to adapt to dissimilar fields than conventional
ML algorithms, aided by the potential of transfer learning, in which a complete learned
model is reused; in many cases, this helps to attain much greater efficiency in a shorter
span. Scalability is one more significant advantage: neural networks can cope with
growing data better than conventional machine learning algorithms.
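As an illustration of the transfer-learning point (not the authors' method), the sketch below freezes a pretrained torchvision backbone and retrains only a small task-specific head. The two-class image task and the random batch are placeholders, and a reasonably recent torchvision (0.13 or later) is assumed for the weights API.

```python
# Hedged sketch of transfer learning: reuse a pretrained backbone and retrain
# only a new head. The two-class target task is a hypothetical placeholder.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():                 # freeze the pretrained feature extractor
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)    # new head for the target classes

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a random batch (stand-in for real data).
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```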
In further work, deep learning methods are to be evaluated, and insights can be provided for their
configuration to attain superior performance in crime classification and, ultimately, crime
prediction.
References
1. Hassani H, Huang X, Silva ES, Ghodsi M (2016) A review of data mining applications in
crime. Statist Anal Data Mining ASA Data Sci J 9(3):139–154
2. Memon MA, Soomro S, Jumani AK, Kartio MA (2017) Big Data analytics and its applications.
Ann Emerg Technol Comput (AETiC) 1(1):46–54
3. Dhyani B, Barthwal A (2014) Big Data analytics using Hadoop. Int J Comput Appl 108(12)
4. Hassani H, Saporta G, Silva ES (2014) Data Mining and official statistics: the past, the present
and the future. Big Data 2(1):34–43
5. McClendon L, Meghanathan N (2015) Using machine learning algorithms to analyze crime
data. Machine Learning Appl Int J (MLAIJ) 2(1):1–12
6. Tyagi D, Sharma S (2018) An approach to crime data analysis: a systematic review.
Communication, Integrated Networks Signal Processing-CINSP 5(2):67–74
7. Pramanik MI, Zhang W, Lau RY, Li C (2016) A framework for criminal network analysis using
big data. In IEEE 13th international conference on e-business engineering (ICEBE), pp 17–23
8. Jha P, Jha R, Sharma A (2019) Behavior analysis and crime prediction using Big Data and
Machine Learning. Int J Rec Technol Eng (IJRTE) 8(1)
9. Nadathur AS, Narayanan G, Ravichandran I, Srividhya S, Kayalvizhi J (2018) Crime analysis
and prediction using Big Data. Int J Pure Appl Math 119(12):207–211
10. ToppiReddy HKR, Saini B, Mahajan G (2018) Crime prediction and monitoring framework
based on spatial analysis. Proc Comput Sci 132:696–705
11. Yerpude P, Gudur V (2017) Predictive modelling of crime dataset using data mining. Int J Data
Mining Knowl Manage Process 7(4)
12. Pradhan I, Potika K, Eirinaki M, Potikas P (2019) Exploratory data analysis and crime prediction
for smart cities. In Proceedings of the 23rd international database applications and engineering
symposium
13. Yu CH, Ward MW, Morabito M, Ding W (2011) Crime forecasting using data mining
techniques. In IEEE 11th international conference on data mining workshops, pp 779–786
14. Jangra M, Kalsi S (2019) Naïve Bayes approach for the crime prediction in Data Mining. Int
J Comput Appl 178(4):33–37
15. Deepika KK, Smitha Vinod (2018) Crime analysis in India using data mining techniques. Int J
Eng Technol 7:253–258
16. Dhaktode S, Doshi M, Vernekar N, Vyas D (2019) Crime rate prediction using K-Means. IOSR
J Eng (IOSR JEN) 25–29
17. Jain V, Sharma Y, Bhatia A, Arora V (2017) Crime prediction using K-means algorithm. Global
Res Dev J Eng 2(5):206–209
18. Sukanya M, Kalaikumaran T, Karthik S (2012) Criminals and crime hotspot detection using
data mining algorithms: clustering and classification. Int J Adv Res Comput Eng Technol
1(10):225–227
19. Ladeira LZ, Sanches MF, Viana C, Botega LC (2018) Assessing the impact of mining techniques
on criminal data quality, Anais do II Workshop de Computação Urbana (COURB), vol 2(1)
20. Keyvanpour MR, Javideh M, Ebrahimi MR (2011) Detecting and investigating crime by means
of data mining: a general crime matching framework. Proced Comput Sci 872–880
21. Lin YL, Yen MF, Yu LC (2018) Grid-based crime prediction using geographical features.
ISPRS Int J Geo-Inf 7(8)
22. Feng M, Zheng J, Han Y, Ren J, Liu Q (2018) Big Data Analytics and Mining for crime data
analysis, visualization and prediction. in International conference on brain inspired cognitive
systems, pp 605–614
23. Chauhan T, Aluvalu R (2016) Using Big Data analytics for developing crime predictive model.
In RK University’s first international conference on research and entrepreneurship, pp 1–6
24. Pramanik MI, Lau RY, Yue WT, Ye Y, Li C (2017) Big Data analytics for security and criminal
investigations, Wiley interdisciplinary reviews: data mining and knowledge discovery, vol 7,
No 4
25. Stalidis P, Semertzidis T, Daras P (2018) Examining Deep Learning architectures for crime
classification and prediction, arXiv preprint arXiv:1812.00602
26. Kang HW, Kang HB (2017) Prediction of crime occurrence from multi-modal data using deep
learning. PloS one, vol 12, No. 4
27. Shermila AM, Bellarmine AB, Santiago N (2018) Crime data analysis and prediction of perpe-
trator identity using Machine Learning approach. In 2nd international conference on trends in
electronics and informatics (ICOEI), pp 107–114
28. Lin YL, Chen TL, Yu LC (2017) Using machine learning to assist crime prevention. In 6th
IIAI international congress on advanced applied informatics (IIAI-AAI), pp 1029–1030
29. Krishnan A, Sarguru A, Sheela AS (2018) Predictive analysis of crime data using Deep
Learning. Int J Pure Appl Math 118(20):4023–4031
30. Gosavi SS, Kavathekar SS (2018) A survey on crime occurrence detection and prediction
techniques. Int J Manage Technol Eng 8(XII):1405–1409
31. Wang B, Yin P, Bertozzi AL, Brantingham PJ, Osher SJ, Xin J (2017) Deep learning for real-
time crime forecasting and its ternarization. In International symposium on nonlinear theory
and its applications, pp 330–333
32. Mowafy M, Rezk A, El-bakry HM (2018) General crime mining framework for unstructured
crime data prediction. Int J Comput Appl 4(8):08–17
33. Ivan N, Ahishakiye E, Omulo EO, Wario R (2017) A performance analysis of business
intelligence techniques on crime prediction. Int J Comput Inf Technol 06(02):84–90
34. Wang M, Zhang F, Guan H, Li X, Chen G, Li T, Xi X (2016) Hybrid neural network mixed
with random forests and perlin noise. In 2nd IEEE international conference on computer and
communications, pp 1937–1941
35. Wang Z, Ren J, Zhang D, Sun M, Jiang J (2018) A deep-learning based feature hybrid framework
for spatiotemporal saliency detection inside videos. Neurocomputing 287:68–83
36. Yan Y, Ren J, Sun G, Zhao H, Han J, Li X, Marshall S, Zhan J (2018) Unsupervised image
saliency detection with Gestalt-laws guided optimization and visual attention based refinement.
Pattern Recogn 79:65–78
Improved Density-Based Learning to
Cluster for User Web Log in Data Mining
Abstract The improvements in tuning the website and improving the visitors’ reten-
tion are done by deploying the efficient weblog mining and navigational pattern
prediction model. This crucial application initially performs data cleaning and initial-
ization procedures until the hidden knowledge is extracted as output. To obtain good
results, the quality of the input data has to be promisingly good, and hence, more focus
should be given to pre-processing and data cleaning operations. Other than this, the
major challenge faced is the poor scalability during navigational pattern prediction.
In this paper, the scalability of weblog mining is improved by using suitable pre-
processing and data cleaning operations. This method uses a tree-based clustering
algorithm to mine the relevant items from the datasets and to predict the navigational
behavior of the users. The algorithm focus will be mainly on density-based learning
to cluster and predict future requests. The proposed method is evaluated over BUS
log data, where the data is of greater significance since it contains the log data of
all the students in the university. The conducted experiments prove the effectiveness
and applicability of weblog mining by using the proposed algorithm.
N. V. Kousik (B)
Galgotias University, Greater Noida, Uttarpradesh 203201, India
e-mail: nvkousik@gmail.com
M. Sivaram
Research Center, Lebanese French University, Erbil 44001, Iraq
e-mail: sivaram.murugan@lfu.edu.krd
N. Yuvaraj
ICT Academy, Chennai, Tamilnadu 600096, India
e-mail: nyuvaraj89@gmail.com
R. Mahaveerakannan
Hindusthan College of Engineering and Technology, Coimbatore 110070, India
e-mail: mahaveerakannan10@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 813
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_59
1 Introduction
In the present scenario, the entire world relies on websites to interact with the other
end. Institutions, organizations, and industries retain their clients by using many ways
to make their websites more efficient and reliable. This is achieved using an auditing
operation, which can be performed in two ways. The first way is to evaluate the browsing
history of a specific user; the collected information, together with feedback contents
received from the user, is used for enhancing the website structure and improving the
website experience. The second way is to record the navigational history of the client,
which is likewise used to improve the user experience. The second option is widely used
since it does not rely on voluntary inputs from the client and it also automates the
analysis of the user's navigational history. This is referred to as web usage mining (WUM)
or weblog mining (WLM). WLM finds application in many fields, including web content
personalization, recommender systems [15], prefetching, and caching [14]. The benefits of
weblog mining are most useful in e-commerce browsing applications, where clients are
targeted with relevant advertisements and products.
Such a web access file is created automatically by the web server, and it logs each view
of an object, image, or HTML document requested by the user. Each view produces a single
line of text in the weblog file of a website, and there are two kinds of log files, namely
the common log file and the extended log file. The data in the file contains the navigation
patterns of single or multiple users across single or multiple websites, covering the
browsing behavior of the entire web traffic. Apart from its source of collection, the
general characteristics of a weblog file include a text file with an identical format, one
line per HTTP request, and supporting information such as the IP address, file name, HTTP
response status and size, request date and time, URL, and browser data.
Weblog mining consists of three processes: pre-processing or data cleaning, mining the
pre-processed data to extract hidden knowledge, and then analyzing the results obtained
after extraction. Weblog mining mostly deals with huge datasets, and hence, issues occur
due to the availability of space and run time. Apart from such issues, other challenges
arise due to the nature of the log file [1]. Web server logs track the user navigational
history only poorly, since the server retains the entire control over capacity and
bandwidth. A problem therefore arises for the data mining algorithm, because the web
access log accumulated at the server from the user navigational history is what is used to
extract the user's navigational pattern.
In this paper, the main aim is to analyze the WLM process and pattern prediction
of the online navigational history of the user. The present work considers the access
log file to process and extract the hidden knowledge. Such usage data is also collected
from the user through browser cookies; however, it is not of prime concern, since it
raises privacy concerns associated with the user. The major contribution of the paper
includes pre-processing and cleaning operations under three stages. The second is the
usage of a tree-based clustering algorithm for mining the user navigational pattern.
The final contribution is the use of an effective way to predict the navigational
behavior of online users and test the effectiveness of the methods.
2 Related Works
There are many similar approaches for improving WLM, which enhances pattern
sequence identification in the data stream. In [1], a task-oriented weblog mining
behavior is used for identifying online browsing behavior in PC and mobile plat-
forms. This method uses footstep graph visualization to navigate the patterns using
sequential rule mining in clickstream data. The purchase decision is predicted using
sequence rules in exploration-oriented browsing behavior.
In [2], pre-processing, knowledge discovery, and analyzing the pattern are used to
extract weblog data. The extraction process is carried out using a neuro-fuzzy hybrid
model. This method uncovers the hidden patterns from WUM on a college website.
In the case of [3], the knowledge extraction process is carried out using both supervised and
unsupervised descriptive knowledge mining. Clustering using association rules and subgroup
knowledge discovery is carried out on an extra virgin olive oil commercial website.
The taxonomy is used as a constraint to WLM [4], in which the transaction data or user
information is extracted using a weblog mining intelligent algorithm. This method helps to
enable third parties direct access to website functionalities. A recommendation method based
on user actions is used for WLM [5]. This method uses lexical patterns for itemset generation
and better recovery of hidden knowledge.
WLM is carried out using a tool [6], which evaluates the pedagogical process to
identify instructor’s and student’s attitudes in a web-based system. This web-based
tool provides support to measure various parameters both at the micro- and macro-
level, i.e., for instructors and policymakers, respectively [6]. A study evaluating the
self-care behavior of elderly participants is carried out using a self-care service system.
This system provides services to and analysis of elderly people on a daily basis using WLM
activity. Here, various self-care services are analyzed statistically. Then, an
interest-based representation constructs the grouping of the elders using the ART2-enhanced
K-means algorithm, which clusters the patterns. Finally, a sequence-based representation
with Markov models and the ART2 K-means clustering scheme is used for mining cluster
patterns [7].
From the webpages, web users' ocular movement data is captured using an eye-tracking tool
in [8]. This eye-tracking technology is used for the classification of key objects in those
websites, eliminating conventional surveying techniques. In this technique, a web user's
eye position data on the monitor screen is identified and combined with the total page
visits of the weblog sequence. It also extracts significant behavioral insights about user
activities. A temporal property is used in [9] for obtaining connection knowledge, where
the temporal property is attained in this
3 Proposed System
The weblog data accumulates successful hits from the Internet. Hits are defined as requests
made by the user for viewing a document or an image in HTML format. Such weblog data is
created automatically and stored either on a client-side server or on a proxy server of the
organization's database. Weblog data has details such as the IP address of the computer
making the request, the request time, the user ID, the status field defining whether the
request was successful or not, the transferred file size, the URL, and the browser name and
version.
The data cleaning and pre-processing steps involve page view creation and session generation.
The identification of session operations is based entirely on time-dependent heuristics.
This kind of time-dependent approach decides the session timeout using a time duration
threshold, which helps the proposed approach attain better quality output.
3.1 Pre-processing
This is the initial step to clean the weblog content; it converts unformatted log data into
a form accepted as input by the cluster mining process. The cleaning and pre-processing
operation mainly involves three major steps: data cleaning, user identification, and session
identification. Data cleaning and user identification involve data integration, data
anonymization, data cleaning, feature selection, scenario building, feature generation, and
target data extraction. The cleaning operation helps to remove the unwanted entries, which
is quite an important step with regard to analysis or mining. The weblog data pre-processing
has nine steps, which are shown in Fig. 1.
Data Integration
The weblog source is acquired from a weblog server over a duration of 6 to 12 months. The
dataset has various fields that are integrated from multiple sources and used in the
evaluation to prove the proposed method's effectiveness. Here, the BUS dataset containing
students' records with multiple fields is considered, as shown in Table 1.
Data Cleaning
The cleaning operation involves three steps: detection of missing as well as noisy data,
automated filling of missing values with a global constant, and removal of duplicated
records. The presence of negative values in the dataset degrades the model's performance.
Hence, negative values in the credit and duration features are replaced with suitable
positive values using binning and smoothing operations.
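The sketch below illustrates these cleaning steps with pandas. It is a minimal sketch: the file name and column names are assumptions about the BUS log schema, and the overall mean of the valid values stands in for the binning-and-smoothing step described above.

```python
# Hedged sketch of the cleaning steps described above, using pandas.
# The input file and the credit/duration column names are assumptions.
import pandas as pd

df = pd.read_csv("bus_weblog.csv")     # hypothetical input file

df = df.drop_duplicates()              # remove duplicated records
df = df.fillna(0)                      # fill missing values with a global constant

# Replace negative credit/duration values with a positive value; the mean of
# the valid entries is used here as a simple stand-in for binning + smoothing.
for col in ["credit", "duration"]:
    valid = df[col] >= 0
    df.loc[~valid, col] = df.loc[valid, col].mean()
```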
Feature Selection
This process eliminates redundant and irrelevant features using Spearman correlation
analysis, which identifies features that correlate with one another. In the present dataset,
features such as duration and kill_reason are eliminated, since the duration feature can
also be obtained from login_time and logout_time, and the reason feature has an elevated
correlation with kill_reason. Another feature, static_ip, is eliminated as it is established
to be irrelevant to the current target.
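A small sketch of this kind of Spearman-based elimination follows. The 0.9 redundancy threshold and the input file name are assumptions; the column name static_ip follows the text.

```python
# Hedged sketch of Spearman-correlation-based feature elimination.
# The redundancy threshold (0.9) and file name are assumptions.
import pandas as pd

df = pd.read_csv("bus_weblog_clean.csv")     # hypothetical cleaned input
corr = df.select_dtypes("number").corr(method="spearman").abs()

drop = set()
cols = corr.columns
for i in range(len(cols)):
    for j in range(i + 1, len(cols)):
        if corr.iloc[i, j] > 0.9:            # assumed redundancy threshold
            drop.add(cols[j])                # keep one feature of each correlated pair
df = df.drop(columns=list(drop) + ["static_ip"], errors="ignore")  # also drop irrelevant feature
```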
Scenario Building
The proposed system defines two scenarios to analyze user behavior based on university
regulations. They include identifying the students connected to the network and learning
students' behavior when connecting from any hotspot within the college campus on a holiday.
Feature Generation
According to the considered scenarios, the required feature sets are generated. The
student_username is divided into four features, which are shown in Table 2.
As per scenario one, a new validity feature is created with two values: no value is set if
the logout or login time of a student is not set; otherwise, the value holds no string. As
per scenario two, a day feature is created with seven values, where every day is allocated
a value.
Target Data Extraction
The target data is extracted from the above pre-processing operations, and its schema is
shown in Table 3. Depending upon scenario two, the number of network connections from
Ras_description is computed and stored. When the network connection count for a student
exceeds the threshold, the activity is considered unusual behavior. If a medical student
connects to the engineering faculty hotspot, it is likewise regarded as unusual behavior.
This step is used for separating potential users from the BUS dataset, and the interesting
users are identified using the C4.5 decision tree classification algorithm. Decision rules
are set to extract potential users from the dataset, and the algorithm avoids the entries
updated by the network manager. The network manager normally collects and updates
information by crawling webpages. Such crawling produces huge log files and creates a
negative impact when extracting knowledge from user navigational patterns. The proposed
method resolves the issue by identifying the entries made by the network managers prior to
segmentation of the potential users.
The weblog entries updated by a network manager are identified using its IP address;
however, with this knowledge it is difficult to discover search engines and agents.
Alternatively, the root directory of the website is studied, since network managers read
the root files prior to accessing the website. The weblog files containing the website
access details are given to each manager before crawling so that it knows its rights.
However, compliance by network managers cannot be relied on, since the exclusion standard
for network managers is voluntary. The method therefore tries to detect and eliminate all
the entries of network managers that have accessed the weblog file, and also detects and
eliminates all network manager accesses around midnight. This leads to the elimination of
network manager entries in HEAD mode; the browsing speed is computed, and network managers
are excluded when their speed is less than a threshold value T and also when the total
number of visited pages exceeds a threshold value.
The browsing speed is estimated based on the total count of pages browsed and the total
session time. To handle the total count of entries by network managers, a set of decision
rules is applied. This helps to group the users into potential and non-potential users.
Using valid weblog attributes, the classification algorithm classifies users based on the
training data. Attribute selection is carried out within 30 s, and the session time for
referring to total pages is 30 min. Further, the decision rule for identifying a potential
user is set to less than 30 min, and the total number of page accesses is predetermined to
be less than 5. The access code post is used for classifying users, and it reduces the
weblog file size, which helps to improve clustering prediction and accuracy.
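A hedged sketch of such rule-based filtering is given below. The field names of the session records and the numeric value of the speed threshold T are assumptions, and the direction of the speed test (treating unusually fast browsing as crawler-like) is one possible reading of the text.

```python
# Hedged sketch of rule-based separation of potential users from crawler-like
# (network manager) entries, following the thresholds quoted in the text
# (30-min session limit, fewer than 5 page requests, speed threshold T).
def is_potential_user(session, speed_threshold_t=2.0, max_pages=5,
                      max_session_minutes=30):
    """Return True if a session looks like a human 'potential user'."""
    pages = session["total_pages"]
    minutes = session["duration_seconds"] / 60.0
    speed = pages / max(minutes, 1e-6)       # pages browsed per minute
    if speed > speed_threshold_t:            # too fast: likely a crawler
        return False
    if pages >= max_pages or minutes >= max_session_minutes:
        return False
    return True

sessions = [{"total_pages": 3, "duration_seconds": 400},
            {"total_pages": 120, "duration_seconds": 300}]
print([is_potential_user(s) for s in sessions])   # [True, False]
```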
The proposed method uses an evolving tree-based clustering algorithm, which groups the
potential users according to their navigational patterns. The evolving tree graph sets the
connectivity between the webpages, where the graph edges are assigned weights based on
session time, connectivity time (the measure of total visits to two webpages in a
particular session), and frequency.
$$C_{x,y} = \frac{\sum_{i=1}^{N} \frac{T_i}{T_{xy}}\, f_x(k)\, f_y(k)}{\sum_{i=1}^{N} \frac{T_i}{T_{xy}}} \qquad (1)$$

where
T_i: time duration of session i in which both webpages x and y appear.
T_{xy}: requested time difference between webpages x and y in a specific session.
If the webpage appears at the kth position, it is denoted as f(k) = k, and the frequency
measure between the webpages x and y is given as

$$F_{x,y} = \frac{N_{xy}}{\max(N_x, N_y)} \qquad (2)$$

where
N_{xy}: total sessions containing both webpages x and y,
N_x: session count of page x,
N_y: session count of page y.
The values C_{x,y} and F_{x,y} are used to normalize the time and frequency values between
0 and 1. Hence, the degree of connectivity between the two webpages is calculated using

$$w_{x,y} = \frac{2\, C_{x,y}\, F_{x,y}}{C_{x,y} + F_{x,y}} \qquad (3)$$

The weights are stored in an adjacency matrix m, and each entry of m holds the value
w_{x,y} as per Eq. (3). The excessive number of edges in the graph is reduced by discarding
less-correlated edges below a threshold with minimum frequency contribution.
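To make Eqs. (1)-(3) concrete, the sketch below computes the edge weight for one page pair from simplified session records. It is a minimal sketch under assumptions: the session structure is a placeholder, f(k) = k is taken as the position weight named in the text, and the normalization of C_{x,y} to [0, 1] is omitted.

```python
# Hedged sketch of the connectivity (Eq. 1), frequency (Eq. 2) and combined
# edge weight (Eq. 3). Session records are simplified placeholders.
def edge_weight(sessions, x, y):
    """sessions: list of dicts with 'pages' (ordered page ids),
    'duration' (T_i) and 'gap' (T_xy, request-time difference of x and y)."""
    num = den = 0.0
    n_x = n_y = n_xy = 0
    for s in sessions:
        pages = s["pages"]
        if x in pages:
            n_x += 1
        if y in pages:
            n_y += 1
        if x in pages and y in pages:
            n_xy += 1
            fx, fy = pages.index(x) + 1, pages.index(y) + 1   # f(k) = k
            ratio = s["duration"] / max(s["gap"], 1e-6)
            num += ratio * fx * fy
            den += ratio
    c = num / den if den else 0.0                 # connectivity C_{x,y} (Eq. 1)
    f = n_xy / max(n_x, n_y, 1)                   # frequency F_{x,y} (Eq. 2)
    return 2 * c * f / (c + f) if (c + f) else 0  # harmonic combination (Eq. 3)

sessions = [{"pages": ["a", "b", "c"], "duration": 120.0, "gap": 30.0},
            {"pages": ["a", "c"], "duration": 60.0, "gap": 45.0}]
print(edge_weight(sessions, "a", "c"))
```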
Evolving Tree Fundamentals
Figure 2 shows the structure of the tree with N_node nodes in the network, where each node
is N_{l,j}, with l the node identity and j the parent node, l = 1, 2, ..., N_node,
j = 0, 1, ..., and l ≠ j. For example, the node N_{2,1} has the parent node N_{1,0}. On the
other hand, the weight vector of each node is given as w_l = {w_{l,1}, w_{l,2}, ..., w_{l,n}},
with n the number of features, and b_l is the hit counter. The hit counter holds the total
count of times a node becomes the best matching pair (match) between the webpages. The size
and depth of the tree are determined by the N_node nodes and the maximum number of layers;
e.g., the size of the tree is 9 and its depth is 4, as shown in Fig. 2.
The evolving tree has three types of nodes, namely the root node (the first node, N_{1,0}),
trunk nodes (blue circles other than N_{1,0}), and leaf nodes (green circles, N_{lf}, where
lf ∈ l). The root node is the first layer of the evolving tree and does not have a parent
node (j = 0). The trunk nodes lie between the leaf nodes and the root node; they are static
nodes and act as interconnection nodes between the leaf nodes. A leaf node does not have any
child nodes, and the minimum number of trunk nodes is used to determine the distance between
two leaf nodes. For example, the total number of trunk nodes between N_{7,3} and N_{4,2} is
3, and hence the tree distance between N_{7,3} and N_{4,2} is also 3, i.e.,
d_T(N_{7,3}, N_{4,2}) = 3.
[Figure: total users versus potential users identified on Day 1, Day 2, and Day 3 (y-axis: number of users, 0 to 6000).]
Evolving Tree-Based Clustering Algorithm
The evolving tree has two parameters, namely splitting threshold, θ s and the creation
of child nodes during the split process, θ c . Consider a training sample data with a
total number of features as n, where X(t) = [x 1 , x 2 ,…, x n ] and the entire algorithm
takes seven steps to learn the objects from the weblog data, which is the training
sample data.
Initially, the training data is fetched; if a training model is available, the dataset is
loaded into it, or else a new training model is created. This operation takes place at the
root node, and then the process moves toward the leaf nodes. The best matching pairs in the
training dataset are found. The distance of the best matching pair is measured using the
Euclidean similarity value, and then the child node is matched at layer 2 using E{X(t)}.
Next, the shortest distance between the child and the best matching pair is found using the
minimum distance equation, Eq. (4). If the leaf node and the match2 value are the same, the
N_match value is calculated from X(t); when the leaf node does not match the match2 value,
the child node is estimated instead. The overall process is repeated until all leaf nodes
(N_lf) are found. If the value of N_lf is greater than one and the scores d(X(t), W_l) are
the same, the matching pair is chosen randomly; on the other hand, if the N_lf value is less
than or equal to one and the scores d(X(t), W_l) are dissimilar, the process goes to step 13.
Once the matching pair is chosen using the weighted values, the weight vector of the best
matching pair is found and updated, and the process is repeated until the best-weighted pair
is updated. After the matching pair (w_{x,y} or w_match) is chosen, its weight vector is
updated using the Kohonen learning rule given in Eq. (5). The neighborhood function of the
tree is calculated using Eq. (6) to obtain the expansion of the tree structure. Finally, the
updated N_match is chosen as a parent node, and the weights of the parent node are passed to
its child nodes. Once all the values of the tree and child nodes are known, the tree is
updated, and hence the training model is updated further. This serves as the learning model
for new training data.
Algorithm 1: Evolving tree-based clustering algorithm
1: Fetch a training data, X(t) = [x 1 , x 2 ,…, x n ]
2: If trained model is available
3: Load the trained model (θ s ,θ c )
4: Else
5: Create a new model (θ s ,θ c )
6: End
7: While process move from root to leaf node
$$w_{lf}(t+1) = w_{lf}(t) + h_{N_{match},lf}(t)\,\big(X(t) - w_{lf}(t)\big) \qquad (5)$$

$$h_{N_{match},lf}(t) = \alpha(t)\,\exp\!\left(-\frac{d_T\!\left(N_{match}, N_{lf}\right)^2}{2\sigma^2(t)}\right) \qquad (6)$$

where
d_T(N_match, N_lf): tree distance between N_match and N_lf,
α(t): learning rate,
σ(t): Gaussian kernel width, which is monotonically reduced with t.
28: The tree is considered growing
29: If bmatch = θ s
30: Update N match as a parent node with θ c
31: Initialize w(θ c ), such that weight of parent node is same as child nodes
32: Else
33: GoTo step 26
34: End
35: Training model is updated
36: Learn new training data
37: End
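A minimal sketch of the node update described above, i.e., the Kohonen rule of Eq. (5) with the Gaussian neighborhood of Eq. (6), is given below; the decay schedules for α(t) and σ(t) are illustrative assumptions, as the chapter does not specify them.

```python
# Hedged sketch of the weight update (Eq. 5) and Gaussian neighborhood function
# (Eq. 6) used when a leaf node absorbs a training vector X(t).
import numpy as np

def neighborhood(tree_distance, t, alpha0=0.5, sigma0=2.0, decay=0.01):
    alpha = alpha0 * np.exp(-decay * t)             # learning rate alpha(t)
    sigma = max(sigma0 * np.exp(-decay * t), 1e-3)  # shrinking kernel width sigma(t)
    return alpha * np.exp(-(tree_distance ** 2) / (2 * sigma ** 2))   # Eq. (6)

def update_leaf(w_leaf, x, tree_distance, t):
    h = neighborhood(tree_distance, t)
    return w_leaf + h * (x - w_leaf)                # Kohonen rule, Eq. (5)

w = np.array([0.2, 0.4, 0.1])
x = np.array([0.5, 0.3, 0.2])
print(update_leaf(w, x, tree_distance=1, t=10))
```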
The prediction engine classifies the user navigation patterns, and the future request of the
user is predicted based on this engine classifier. The longest subsequence algorithm is
utilized for this prediction process; it finds the common longest subsequence over the
entire sequences, and the algorithm relies on two properties:
• When two sequences x and y of webpage visits end with the same element, the common longest
subsequence (cls) is found by eliminating the end element and then finding the cls of the
shortened sequences.
• When two sequences x and y of webpage visits do not end with the same element, the longer
sequence between cls(x_n, y_{m-1}) and cls(x_{n-1}, y_m) is taken.
The cls is thus calculated using Eq. (7), which is given by

$$cls(x_i, y_j) = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0 \\ \big(cls(x_{i-1}, y_{j-1}),\, x_i\big) & \text{if } \ddot{x}_i = \ddot{y}_j \\ \mathrm{long}\big(cls(x_{i-1}, y_j),\, cls(x_i, y_{j-1})\big) & \text{if } \ddot{x}_i \neq \ddot{y}_j \end{cases} \qquad (7)$$
The cls common to x_i and y_j is found by comparing their elements ẍ_i and ÿ_j. The sequence
cls(x_{i-1}, y_{j-1}) can be extended by ẍ_i only when x_i and y_j are equal. If x_i and y_j
are not equal, the longer of cls(x_{i-1}, y_j) and cls(x_i, y_{j-1}) is taken, and if
cls(x_{i-1}, y_j) and cls(x_i, y_{j-1}) have the same length, both values are retained,
provided the two values are not identical.
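A minimal sketch of this recurrence, implemented as a standard dynamic-programming longest-common-subsequence computation, is given below; the page identifiers are hypothetical.

```python
# Hedged sketch of the common-longest-subsequence recurrence of Eq. (7) for two
# page-visit sequences.
def cls(x, y):
    n, m = len(x), len(y)
    table = [[[] for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + [x[i - 1]]
            else:  # keep the longer of the two shortened alternatives
                table[i][j] = max(table[i - 1][j], table[i][j - 1], key=len)
    return table[n][m]

print(cls(["p1", "p3", "p4", "p7"], ["p1", "p4", "p6", "p7"]))  # ['p1', 'p4', 'p7']
```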
The following algorithm 2 consists of the following steps: The webpages are
assigned with URL, and for each web pair, the weight is calculated. The edge weight
is computed over entire nodes in the graph and the edges with minimum frequency
are removed. The remaining high-frequency nodes are further used to form the cluster
using a depth-first search. Finally, the cluster with the minimum size is removed from
the graph.
Algorithm 2: Prediction Engine Algorithm
1: URL is assigned over the list of webpages, L[p] = P
2: For each pair (Pi , Pj ) ∈ L[p] do // webpage pair
3: Then M ij = w(Pi , Pj ); // weight is computed using Eq. (3)
Edgeij = M ij
4: End for
5: For Edgeu, v ∈ GraphE,V do //Edge weight with minimum frequency is removed
This section evaluates the proposed method through a series of experiments and the
BUS dataset (Collected from Bharathiyar University, India) is used for experimenting
with the testing environment. The dataset consists of 1,893,725 entries and 533,156
webpages.
The algorithms are implemented in Java. The results related to pattern discovery are
obtained using the proposed evolving tree-based clustering. The paper also reports the
performance gains concerning the accurate prediction of user navigational patterns and the
run time. Initially, the process starts by discarding or filtering the noisy weblog data,
implicit requests, error entries, and network manager entries. Then, the clustering process
is carried out to group the potential users, and the longest subsequence algorithm is used
to obtain the best predictive response for future requests.
The improvements are motivated by the observation that returning users without repeated
user requests do not help in identifying the knowledge related to the navigational pattern.
Once the sessions are detected using a threshold timing, say 30 min, a check is performed to
detect whether the user pattern is shared by the same user or not. When a shared user
navigation pattern exists, the identified sessions are approved; otherwise, the sequences
being split into sessions are skipped. Investigations are carried out to find the effects
associated with the quantity and quality of the identified sessions.
The present investigation uses two time thresholds, 10 and 30 min, with an equal set of
experiments. Each timing threshold is tested with three different minimum lengths of
patterns (lsp), from 1 to 3. The test is run with a set of variable values that ranges from
10 to 100%; as the variable value increases, the sessions associated with the patterns are
shared well. Using the values of Table 3, the ratio for the different times and lsp values
is evaluated.
The proposed algorithm is tested over the dataset to establish the benefits of using the
proposed navigational pattern. The raw data is sent through the processes of pre-processing,
cleaning, and session identification prior to clustering. Table 3 provides the results of
the total transactions and the memory used for storing the cleaned weblog data. Figure 3
shows the evaluated results of the potential users identified after the pre-processing
operation, and it is found that many irrelevant items are removed, which results in
high-quality potential users being identified (Table 5).
The clustering result is used to find the different forms of information extracted from the
weblog data. It contains the total number of visits attempted over a website, the traffic of
the webpages, the frequency of page views, and the navigational behavior of user patterns.
The weblog data is considered with 100 unique webpages, where clarity is improved by
assigning codes to them. The total visits made over 24 h on these 100 pages are tested, and
this is used to test the proposed system's performance. The minimum frequency, or threshold
value, is used to remove the correlated edges with lower values, and the size of the minimum
cluster is set to one. It is clear from Table 4 that a threshold value of 0.5 gives the
optimal results with the associated dataset. The test is repeated against different weblog
data sizes and the results are recorded. Thus, the clustering threshold of 0.5 is used for
the prediction process.
$$\text{accuracy} = \frac{|P(an, T) \cap Eval_n|}{|P(an, T)|} \qquad (8)$$

$$\text{coverage} = \frac{|P(an, T) \cap Eval_n|}{|Eval_n|} \qquad (9)$$

$$F1\text{-}measure = \frac{2 \times \text{accuracy}(P(an, T)) \times \text{coverage}(P(an, T))}{\text{coverage}(P(an, T)) + \text{accuracy}(P(an, T))} \qquad (10)$$

where
an: navigation pattern for the active session,
T: threshold value,
P(an, T): prediction set,
Eval_n: evaluation set.
The prediction rate increases as the accuracy of the prediction increases with the threshold
values, and the best accuracy obtained is 92%.
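For completeness, a small sketch of Eqs. (8)-(10) over set-valued predictions follows; the example page sets are hypothetical.

```python
# Hedged sketch of the evaluation measures of Eqs. (8)-(10): accuracy, coverage
# and F1 over a prediction set P(an, T) and an evaluation set Eval_n.
def evaluate(prediction_set, evaluation_set):
    prediction_set, evaluation_set = set(prediction_set), set(evaluation_set)
    hits = len(prediction_set & evaluation_set)
    accuracy = hits / len(prediction_set) if prediction_set else 0.0   # Eq. (8)
    coverage = hits / len(evaluation_set) if evaluation_set else 0.0   # Eq. (9)
    f1 = (2 * accuracy * coverage / (accuracy + coverage)
          if accuracy + coverage else 0.0)                             # Eq. (10)
    return accuracy, coverage, f1

print(evaluate(["p2", "p5", "p9"], ["p2", "p9", "p11", "p14"]))
```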
References
1. Raphaeli O, Goldstein A, Fink L (2017) Analyzing online consumer behavior in mobile and
PC devices: a novel web usage mining approach. Electron Commer Res Appl 26:1–12
2. Shivaprasad G, Reddy NS, Acharya UD, Aithal PK (2015) Neuro-fuzzy based hybrid model
for web usage mining. Procedia Computer Science 54:327–334
3. Carmona CJ, Ramírez-Gallego S, Torres F, Bernal E, delJesús MJ, García S (2012) Web usage
mining to improve the design of an e-commerce website: OrOliveSur. com. Expert System
Appl 39(12):11243–11249
4. Devi BN, Devi YR, Rani BP, Rao RR (2012) Design and implementation of web usage mining
intelligent system in the field of e-commerce. Procedia Engineering 30:20–27
5. Lopes P, Roy B (2015) Dynamic recommendation system using Web usage mining for e-
commerce users. Procedia Comput Sci 45:60–69
6. Cohen A, Nachmias R (2011) What can instructors and policy makers learn about Web-
supported learning through Web-usage mining. Int Higher Educ 14(2):67–76
7. Hung YS, Chen KLB, Yang CT, Deng GF (2013) Web usage mining for analysing elder self-care
behavior patterns. Expert Syst Appl 40(2):775–783
8. Velásquez JD (2013) Combining eye-tracking technologies with web usage mining for
identifying Website Keyobjects. Eng Appl Artif Intell 26(5):1469–1478
9. Matthews SG, Gongora MA, Hopgood AA, Ahmadi S (2013) Web usage mining with
evolutionary extraction of temporal fuzzy association rules. Knowl-Based Syst 54:66–72
10. Sha H, Liu T, Qin P, Sun Y, Liu Q (2013) EPLogCleaner: improving data quality of enterprise
proxy logs for efficient web usage mining. Procedia Computer Science 17:812–818
11. Tao YH, Hong TP, Su YM (2008) Web usage mining with intentional browsing data. Expert
Syst Appl 34(3):1893–1904
12. John JM, Mini GV, Arun E (2012) User profile tracking by Web usage mining in cloud
computing. Procedia Engineering 38:3270–3277
13. Huang YM, Kuo YH, Chen JN, Jeng YL (2006) NP-miner: A real-time recommendation
algorithm by using web usage mining. Knowl-Based Syst 19(4):272–286
14. Adeniyi DA, Wei Z, Yongquan Y (2016) Automated web usage data mining and recommenda-
tion system using K-Nearest Neighbor (KNN) classification method. Applied Computing and
Informatics 12(1):90–108
15. Tao YH, Hong TP, Lin WY, Chiu WY (2009) A practical extension of web usage mining with
intentional browsing data toward usage. Expert Syst Appl 36(2):3937–3945
16. Hong TP, Huang CM, Horng SJ (2008) Linguistic object-oriented web-usage mining. Int J
Approximate Reasoning 48(1):47–61
17. Yin PY, Guo YM (2013) Optimization of multi-criteria website structure based on enhanced
tabu search and web usage mining. Appl Math Comput 219(24):11082–11095
18. Cho YH, Kim JK (2004) Application of Web usage mining and product taxonomy to
collaborative recommendations in e-commerce. Expert Syst Appl 26(2):233–246
19. Zhang X, Edwards J, Harding J (2007) Personalised online sales using web usage data mining.
Comput Ind 58(8):772–782
20. Musale V, Chaudhari D (2017) Web usage mining tool by integrating sequential pattern
mining with graph theory, 1st International Conference on Intelligent Systems and Information
Management (ICISIM), Aurangabad, India. https://doi.org/10.1109/ICISIM.2017.8122167
21. Liu J, Fang C, Ansari N (2016) Request dependency graph: A model for web
Spatiotemporal Particle Swarm
Optimization with Incremental Deep
Learning-Based Salient Multiple Object
Detection
Abstract The recent developments in computer vision applications allow the detection of
salient objects in videos, which plays a vital role in our day-to-day lives. The difficulty
of integrating spatial cues with motion cues makes the process of salient object detection
harder. The spatiotemporal constrained optimization model (SCOM) was provided in the
previous system. Although it exhibits better performance in the detection of a single
salient object, the variation of salient features between different persons is not
considered in this method, and only some objects meet a more general agreement regarding
their significance. To solve this problem, the proposed system designs a spatiotemporal
particle swarm optimization with incremental deep learning-based salient multiple object
detection. In this proposed work, an incremental deep convolutional neural network (IDCNN)
classifier is introduced as a suitable measurement of success in a relative object saliency
landscape. A spatiotemporal particle swarm optimization model (SPSOM) is used for performing
the ranking method and the detection of multiple salient objects. In this system, a local
constraint and temporal as well as spatial cues are exploited to achieve global saliency
optimization. The saliency map of the prior video frame and the motion history of change
detection are used in SPSOM. Moving salient objects are distinguished from diverse changing
background regions. When compared with existing methods, better performance is exhibited by
the proposed method, as shown in the experimental results concerning recall, precision,
average run time, accuracy, and mean absolute error (MAE).
M. Indirani (B)
Assistant Professor, Department of IT, Hindusthan College of Engineering and Technology,
Coimbatore 641032, India
e-mail: mindirani2008@gmail.com
S. Shankar
Professor, Department of CSE, Hindusthan College of Engineering and Technology, Coimbatore
641032, India
e-mail: shanx80@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 831
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes
in Networks and Systems 173,
https://doi.org/10.1007/978-981-33-4305-4_60
1 Introduction
In recent days, video salient object detection (VSOD) has gained increasing interest. In
general, VSOD is essential for understanding the underlying mechanism of the human visual
system (HVS) during free viewing, and it is also used in various real-time applications like
weakly supervised attention, robotic interaction, autonomous driving, video compression,
video captioning, and video segmentation [1–5].
Owing to the challenges in video data, like large object deformations, blur, occlusions, and
diverse motion patterns, and to the inherent complexity of human visual attention behavior,
like attention shift and selective attention allocation, great difficulties are present in
VSOD in addition to its practical and academic significance. So, in the past few years,
research interest has increased considerably. Salient object detection is a task based on
the mechanism of visual attention, where the algorithm aims to find the more attended-to
objects rather than the surrounding area of the scene or image.
In salient object detection in video and images, foreground objects are identified from the
background. The assumption that objects are distinctive in pattern, motion, or texture when
compared with the background forms the basis of this technique [6]. The output is a saliency
map per frame, where every value represents the probability of a pixel belonging to a
salient object. Potential objects are identified using pixels having a high probability.
The difficulty of integrating spatial cues with motion cues makes the process of detecting a
salient object harder, and it is also difficult to deal with static adjacent frames and the
unavailability of motion features. Complicating factors like a cluttered background, large
background motion, shadowing due to illumination changes, and intensity variation influence
the acquired video quality. In a video, for spatiotemporal salient object detection, various
methods have been proposed until now [7, 8].
The ability of deep convolutional neural networks (CNNs) to represent high-level semantic
features makes them very suitable in recent days. To detect salient objects, various
CNN-based methods have been proposed and have produced better results [9–11]. However, CNN
outputs are coarse and have non-sharp boundaries due to the presence of pooling layers and
convolutional layers with large receptive fields [12]. Learning of new classes is enabled
using an effective DCNN with an incremental growing and training method while sharing part
of the base network [12]. Based on a hierarchical representation of relative saliency, an
IDCNN classifier is proposed.
The paper is organized as follows: different salient object detection methods are discussed
in Sect. 2; a model for multiple salient object detection is proposed in Sect. 3;
experimentation and analysis are presented in Sect. 4; and Sect. 5 concludes the research
work.
2 Related Works
Le and Sugimoto (2018) present a technique for detecting salient objects in videos, where
temporal information in addition to spatial information is fully considered. The system
introduces a new set of spatiotemporal deep (STD) features that exploit local and global
contexts over frames. Furthermore, a spatiotemporal conditional random field (STCRF) is
proposed to compute saliency from the STD features. STCRF is the extension of CRF to the
temporal domain and describes the relationships among neighboring regions both within a
frame and over frames.
STCRF leads to temporally consistent saliency maps over frames, contributing to the accurate
detection of salient object boundaries and to noise reduction during detection. The designed
strategy first segments an input video into multiple scales and afterwards computes a
saliency map at each scale level utilizing the STD features with STCRF. The final saliency
map is computed by fusing the saliency maps at the various scale levels [13].
Chen et al. (2018) present a model for video salient object detection called the
spatiotemporal constrained optimization model (SCOM). It exploits spatial and temporal cues,
as well as a local constraint, to accomplish global saliency optimization. For a robust
motion computation of salient objects, they present a scheme for modeling the motion cues
from the optical flow field, the saliency map of the prior video frame, and the motion
history of change detection. It is able to differentiate the moving salient objects from
diverse changing background regions.
Moreover, a viable objectness measure is designed with a natural geometrical interpretation
to extract some reliable object and background regions, which are given as the basis to
define the foreground potential, the background potential, and the constraint to support
saliency propagation. These potentials and the constraint are formulated into the designed
SCOM framework to generate an optimal saliency map for each frame in a video [14].
Qi et al. (2019) designed a fast video salient object detection method running at 0.5 s per
frame (including an average of 0.32 s for optical flow computation). It mainly comprises two
modules, an initial spatiotemporal saliency module and a correlation filter-based salient
temporal propagation module. The former integrates spatial saliency, obtained from a robust
minimum barrier distance and a boundary cue, with temporal saliency information from the
motion field. The latter uses correlation filters to keep the saliency consistent between
neighboring frames. The two modules are finally combined in an adaptive manner [15].
Wu et al. (2018) designed a spatiotemporal salient object detection method by integrating
saliency and objectness, for videos with complicated motion and complex scenes. The initial
salient object detection result is first built upon both the saliency map and the objectness
map. Afterwards, the region size of a salient object is adjusted to acquire the frame-wise
salient object detection result by iteratively updating the object probability map, which is
the combination of the saliency map and the objectness map.
3 Proposed Methodology
For a given video sequence, the major goal is the visual and temporal detection of salient
objects in every frame F_t of the video sequence, where the frame index is represented as t.
The proposed saliency model uses the assumption that, for a given video sequence, some
reliable background or salient object regions can be found by analyzing spatial and temporal
cues; from these detected reliable regions, saliency seeds can be derived for achieving
global optimization of salient object detection.
[Figure: overall framework of the proposed model, showing the input video, frames F, motion energy, and object-like regions.]
In a video sequence, superpixels are generated for every frame to model saliency using SLIC
segmentation, and every superpixel contains approximately 300 pixels. The salient object
detection problem corresponds to a superpixel labeling problem: in a frame, every superpixel
r_i (i = 1, 2, ..., N) is assigned a saliency landscape value s_i ∈ [0, 1] in this technique.
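A minimal sketch of the per-frame superpixel step with scikit-image's SLIC implementation is shown below; the random array stands in for a decoded video frame, and the roughly 300-pixel target per superpixel follows the text.

```python
# Hedged sketch of per-frame superpixel generation with SLIC (scikit-image),
# aiming at roughly 300-pixel segments; video decoding is omitted.
import numpy as np
from skimage.segmentation import slic

frame = np.random.rand(240, 320, 3)                  # stand-in for a video frame F_t
n_segments = frame.shape[0] * frame.shape[1] // 300  # ~300 pixels per superpixel
labels = slic(frame, n_segments=n_segments, compactness=10, start_label=0)
print(labels.shape, labels.max() + 1)                # label map and superpixel count
```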
The detection is formulated as the minimization of a constrained energy function E(S) over
the superpixel labeling, where the configuration of saliency labels is represented as
S = {s_1, s_2, ..., s_N}. Initially, reliable labels are assigned to some superpixels, and
the energy has three potentials, namely the foreground potential Φ_F, the background
potential Φ_B, and the smoothness potential Φ_S:

$$\min E(S) = \sum_{i=1}^{N} \Phi_F(s_i) + \sum_{i=1}^{N} \Phi_B(s_i) + \sum_{(i,j) \in N} \Phi_S(s_i, s_j), \quad \text{s.t. } \Gamma(S) = k \qquad (1)$$
Four classes (C1–C4) are used for training the base network, and after training, the
training data of those classes is discarded. Then, two new classes (C5, C6) arrive as input
sample data, and this data has to be accommodated in the network while maintaining the
knowledge of the initial four classes [18].
Thus, the capacity of the network must be increased, and the network is retrained only with
the new data (of C5 and C6) in an effective way, so that the updated network can classify
the classes of all tasks (C1–C6). The classification is termed task-specific classification
if the tasks are classified separately, and combined classification if they are classified
together. An overview of the incremental learning model is shown in Fig. 2.
Design Approach
In DCNNs, the classifier as well as the feature extractor exist in the same network with
several layers, which accounts for the superiority of DCNNs. In the proposed training
method, the fixed feature extractor corresponds to the shared convolutional layers, and the
classifier corresponds to the fully connected layers, which are not shared. The process of
reusing learned network parameters for learning a new set of classes is termed sharing.
Fig. 2 Incremental learning model: the network needs to grow its capacity with the arrival of data of new classes
In every case, only the newly available data is used for learning the new classes, and it is
assumed that the old and new classes have similar features. In the designed system, a single
dataset is split into various sets so that they can be used as old-task and new-task data
with multiple classes in the network update process.
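The following sketch illustrates the partial-sharing idea under simple assumptions: a small toy DCNN in PyTorch, classes C1–C4 as the base task and C5–C6 as the new task. It is not the authors' network, and the layer sizes and data are arbitrary placeholders.

```python
# Hedged sketch of partial network sharing for incremental classes: the
# convolutional feature extractor is frozen (shared) and only a new fully
# connected head is trained for the newly arrived classes (C5, C6).
import torch
import torch.nn as nn

class SmallDCNN(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.features = nn.Sequential(               # shared convolutional layers
            nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(32, n_classes)   # task-specific head (not shared)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

base = SmallDCNN(n_classes=4)                        # assumed to be trained on C1-C4
new_head = nn.Linear(32, 2)                          # head for the new classes C5-C6
for p in base.features.parameters():
    p.requires_grad = False                          # reuse (share) the learned features
optimizer = torch.optim.Adam(new_head.parameters(), lr=1e-3)

x = torch.randn(8, 3, 32, 32)                        # stand-in batch of new-class data
logits_new = new_head(base.features(x).flatten(1))   # at inference, old and new heads
print(logits_new.shape)                              # would be used side by side
```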
Figure 3 shows that, in the convolutional layers, around ∼60% of the learning parameters are
shared along with the respective ReLU and batch normalization layers, at a cost of about ∼1% of
Fig. 4 Overview of the DCNN incremental training methodology with partial network sharing
Fig. 5 Spatiotemporal particle swarm optimization model (SPSOM): initialize the number of superpixels, update pbest and gbest, and repeat until the stopping condition is met
accuracy, and the branch is again trained for comparison. Sharing is decreased if the
reference value is greater than the new accuracy, and the branch is retrained for
comparison. Based on the required quality values, the optimum sharing configuration is
finalized after a few iterations.
The optimum sharing point indicates the fraction of sharing beyond which the degradation of
accuracy due to increased sharing is greater than the quality threshold. With a minimum loss
of quality, maximum benefits can be achieved using this method. At last, the base network is
retrained with the core and demo sets to enhance the base network features, owing to the
availability of the core and demo sets.
(A) Foreground potential
Using spatial–temporal visual analysis, a few reliable object regions O can be obtained; the
major assumption used for defining the foreground potential is that these regions are part
of the salient object. In a frame F_t, for every superpixel r_i, the foreground potential is
defined in the system as,
where

$$A(r_i) = \frac{1}{N_O - 1} \sum_{r_o \in O} \exp\!\left(-\frac{dist_g^2(r_i, r_o)}{2\sigma^2}\right) \qquad (4)$$

$$M_d(r_i) = \sum_{j=1}^{N} \left\| pt(r_j) - \mu_i \right\|_2 v_{ij} \qquad (5)$$
where the normalized centroid of superpixel r_j is represented as pt(r_j), and the
color-similarity-weighted centroid of superpixel r_i is represented as μ_i and is expressed as

$$\mu_i = \frac{\sum_{j=1}^{N} v_{ij}\, pt(r_j)}{\sum_{j=1}^{N} \theta_{ij}} \qquad (6)$$
The motion distribution M_d measures the color discriminativeness and spatial distance
between a superpixel and the other superpixels in the color optical flow field. For frame
F_t, the motion edge M_e and the motion history image (MHI) M_h are defined using the
generated motion distribution map M_d. In F_t, for a superpixel, the motion energy term
M(r_i) is defined by integrating the saliency of the prior frame, S_{t-1}, as,
$$\omega_b(r_i) = \frac{1}{|B|} \sum_{r_b \in B} \exp\!\left(-\frac{dist_g^2(r_i, r_b)}{2\sigma^2}\right) \qquad (10)$$

where

$$\omega_{ij}(r_i, r_j) = \exp\!\left(-\frac{dist_c^2(r_i, r_j)}{2\sigma^2}\right), \quad (i, j) \in N \qquad (12)$$
where the neighborhood set N contains all spatially adjacent superpixels within a frame.
dist_c(r_i, r_j) represents the Euclidean distance between the color features of superpixels
r_i and r_j in the CIE-Lab color space; thus, the appearance similarity between superpixels
r_i and r_j is measured using ω_{ij}(r_i, r_j).
(D) Reliable regions O and B
The background potential and the foreground potential are proposed according to the reliable
background regions B and the reliable object regions O, respectively. The computation of the
reliable regions B and O is presented in this work, since the salient object detection
performance is mainly determined by these regions.
In this system, superpixels are clustered into object-like regions K, and superpixels near
the center of a cluster are more similar to objects. The cluster intensity of r_i is
represented as,
$$I(r_i) = \sum_{r_i, r_j \in K} \delta\big(\left\| V(r_i) - V(r_j) \right\| - d_c\big) \qquad (13)$$

where the spanning extent of cluster intensities for object regions is expressed as t_o, and
the spanning extent of cluster intensities for background regions is expressed as t_b.
The relative salience of the detected object regions is considered in the proposed work for
predicting the total count of salient objects. For saliency propagation based on the
reliable object regions, from the K superpixels r_o ∈ O, an affinity matrix W_{oi} ∈ R^{N×N}
is defined for every one of the N superpixels r_i ∈ S, so that,
where

$$\omega_{oi}(r_o, r_i) = \exp\!\left(-\frac{dist_c^2(r_o, r_i)}{2\sigma^2}\right), \quad (r_o, r_i) \in N \qquad (18)$$
and 1 for an object. The balance parameters are represented as α and β, and in this
experiment both are set to 0.99. The affinity matrices W_{oi} and W_{bi} are not square
matrices, so the additional potentials appended to E(S) cannot be transformed using the
constraint Γ(S) = k.
In this work, a model for multiple video salient object detection is presented together with
the spatiotemporal particle swarm optimization model (SPSOM), a ranking method in which the
local constraint and temporal and spatial cues are exploited to achieve global saliency
optimization over multiple objects.
PSO is an evolutionary computation method motivated by the social behaviors of fish
schooling and bird flocking. Its basic principle is that every solution is represented by a
particle in a swarm; in this proposed work, particles correspond to superpixels. In the
search space, every particle has its own position, represented by the vector
x_i = (x_{i1}, x_{i2}, ..., x_{iD}), where D is the dimension of the search space.
To search for the optimal salient object in the search space, the superpixels move with a
velocity, indicated as v_i = (v_{i1}, v_{i2}, ..., v_{iD}). Every particle updates its
velocity and position based on its own experience and the experience of its neighboring
particles. The objective is assumed to correspond to a distance between superpixels in the
proposed work.
The best previous position of a particle is recorded and represented as pbest, and gbest
corresponds to the best position achieved by the population so far. PSO searches for the
optimum solution using gbest and pbest, where the position and velocity of every particle
are updated based on the following expressions.
$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \qquad (21)$$
$$v_{id}^{t+1} = \omega \, v_{id}^{t} + c_1 r_1 \left(p_{id} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd} - x_{id}^{t}\right) \qquad (22)$$
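To make the update rule concrete, the following is a minimal illustrative sketch (in Python) of the standard PSO update in Eqs. (21)–(22). The fitness function, parameter values, and all identifiers below are assumptions for demonstration only; the paper's actual objective is based on distances between superpixels rather than the toy function used here.

import random

def pso(fitness, dim, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5):
    # Random initial positions and zero velocities for every particle (superpixel).
    pos = [[random.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # each particle's best position so far
    pbest_val = [fitness(p) for p in pos]
    g = pbest_val.index(min(pbest_val))
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position of the whole population

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Eq. (22): velocity update
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Eq. (21): position update
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest

# Example usage with a toy objective (sphere function), standing in for the
# saliency objective based on distances between superpixels.
best = pso(lambda x: sum(v * v for v in x), dim=5)
print(best)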
4 Experimental Results
The datasets used for evaluation are presented in this section, along with the metrics used for evaluating salient object detection performance. Three benchmark datasets are used in the experimentation, including the Freiburg-Berkeley Motion Segmentation (FBMS) dataset, which is commonly used and was collected from https://lmb.informatik.uni-freiburg.de/resources/datasets/moseg.en.html. In FBMS, drastic camera movement is involved in various videos, and these movements introduce large motion noise into the extracted motion features. Testing and training sets are formed by splitting the FBMS dataset randomly. Figure 6 shows the input images. The mean absolute error (MAE) performance of the proposed SPSOM with IDCNN method is compared with the existing DSS and SCOM approaches in Fig. 7.
To detect multiple salient objects, the system uses several standard metrics for measuring performance: precision-recall (PR) curves, accuracy, average run time, and mean absolute error (MAE). The deeply supervised salient object detection (DSS), SCOM, and SPSOM with IDCNN approaches are compared. The performance of the proposed and existing methods is presented in Table 1.
Mean Absolute Error (MAE)
The mean absolute error is the average of the absolute errors |ei| = |yi − xi|, where yi is the prediction and xi the true value. The mean absolute error is given by
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left|y_i - x_i\right| \qquad (23)$$
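As a quick worked example of Eq. (23), the following snippet computes the MAE directly; the numbers are arbitrary illustrative values, not results from the paper.

def mean_absolute_error(y_pred, y_true):
    # Eq. (23): average of the absolute differences between predictions and ground truth
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)

print(mean_absolute_error([0.9, 0.2, 0.4], [1.0, 0.0, 0.5]))  # prints 0.1333...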
The performance of the proposed SPSOM with IDCNN method is compared with the existing DSS and SCOM approaches in terms of mean absolute error (MAE). The methods are represented on the x-axis, and MAE is represented on the y-axis. In the proposed work, the incremental deep convolutional neural network (IDCNN) classifier is proposed for measuring success in a relative object saliency landscape; it reduces the mean absolute error.
Fig. 7 MAE (%) of DSS, SCOM, and SPSOM with IDCNN (methods on the x-axis)
Table 1 Performance comparison
Methods            MAE (%)   Accuracy (%)   Average run time (s)
DSS                0.07      80             0.432
SCOM               0.069     87             37.5
SPSOM with IDCNN   0.04      91             0.28
From the experimental results, it is concluded that the proposed SPSOM with IDCNN approach achieves an MAE of 0.04%, whereas DSS and SCOM attain 0.07% and 0.069%, respectively.
Figure 8 shows the accuracy of the proposed SPSOM with IDCNN approach and the existing DSS and SCOM approaches. The methods are represented on the x-axis, and accuracy is represented on the y-axis. In the proposed work, the spatiotemporal particle swarm optimization model (SPSOM) is introduced to achieve global saliency optimization over multiple objects. The distance between the superpixels is considered as the objective function. Due to this optimization, the accuracy of the proposed system is improved. From the graph, it can be concluded that the proposed system achieves 91% accuracy, whereas DSS and SCOM attain 80% and 87%, respectively.
Figure 9 shows the PR curves of the proposed SPSOM with IDCNN approach and the existing DSS and SCOM approaches. Recall is represented on the x-axis, and precision is taken on the y-axis. The results show that the proposed SPSOM with IDCNN approach performs better than the existing DSS and SCOM approaches.
The average run time of the proposed SPSOM with IDCNN approach is compared with the existing DSS and SCOM approaches. The methods are taken on the x-axis, and average run time is taken on the y-axis. From Fig. 10, it is concluded that the proposed SPSOM with IDCNN method achieves an average run time of 0.28 s, whereas DSS and SCOM attain 0.432 s and 37.5 s, respectively.
Fig. 8 Accuracy (%) of DSS, SCOM, and SPSOM with IDCNN (methods on the x-axis)
Fig. 9 PR curves (precision on the y-axis, recall on the x-axis)
Overall, the proposed SPSOM with IDCNN approach achieves better performance than the existing approaches.
5 Conclusion
References
1. Wang W, Shen J, Porikli F (2015) Saliency-aware geodesic video object segmentation. In: IEEE
CVPR, pp 3395–3402
2. Xu N, Yang L, Fan Y, Yang J, Yue D, Liang Y, Price B, Cohen S, Huang T (2018) Youtube-vos:
Sequence-to-sequence video object segmentation. In: ECCV, pp 585–601
3. Pan Y, Yao T, Li H, Mei T (2017) Video captioning with transferred semantic attributes. In:
CVPR, pp 6504–6512
4. Guo C, Zhang L (2010) A novel multiresolution spatiotemporal saliency detection model and
its applications in image and video compression. IEEE TIP 19(1):185–198
5. Zhang Z, Fidler S, Urtasun R (2016) Instance-level segmentation for autonomous driving with deep densely connected MRFs. In: IEEE CVPR, pp 669–677
6. Srivatsa RS, Babu RV (2015) Salient object detection via objectness measure. In: 2015 IEEE
international conference on image processing (ICIP), pp 4481–4485, IEEE
7. Wang W, Shen J, Porikli F (2015) Saliency-aware geodesic video object segmentation. In:
Proceedings of the conference on computer vision and pattern recognition, pp 3395–3402
8. Yang J, Zhao G, Yuan J, Shen X, Lin Z, Price B, Brandt J (2016) Discovering primary objects
in videos by saliency fusion and iterative appearance estimation. IEEE Trans Cir Syst Video
Technol 26(6):1070–1083
9. Chen T, Lin L, Liu L, Luo X, Li X (2016) DISC: deep image saliency computing via progressive
representation learning. IEEE TNNLS
10. Li X, Zhao L, Wei L, Yang MH, Wu F, Zhuang Y, Ling H, Wang J (2015) DeepSaliency: multi-task deep neural network model for salient object detection. arXiv preprint arXiv:1510.05484
11. Wang L, Lu H, Ruan X, Yang MH (2015) Deep networks for saliency detection via local
estimation and global search. In: Proceedings of the IEEE conference on computer vision and
pattern recognition, pp 3183–3192
12. Zheng S, Jayasumana S, Romera Paredes B, Vineet V, Su Z, Du D, Huang C, Torr P (2015)
Conditional random fields as recurrent neural networks. In: ICCV
13. Le TN, Sugimoto A (2018) Video salient object detection using spatiotemporal deep features.
IEEE Trans Image Process 27(10):5002–5015
14. Chen Y, Zou W, Tang Y, Li X, Xu C, Komodakis N (2018) SCOM: Spatiotemporal constrained
optimization for salient object detection. IEEE Trans Image Process 27(7):3345–3357
15. Qi Q, Zhao S, Zhao W, Lei Z, Shen J, Zhang L, Pang Y (2019) High-speed video salient object
detection with temporal propagation using correlation filter. Neurocomputing 356:107–118
16. Wu T, Liu Z, Zhou X, Li K (2018) Spatiotemporal salient object detection by integrating with
objectness. Multimedia Tools Appl 77(15):19481–19498
17. Dakhia A, Wang T, Lu H (2019) A hybrid-backward refinement model for salient object
detection. Neurocomputing 358:72–80
18. Wang W, Shen J, Shao L (2017) Video salient object detection via fully convolutional networks.
IEEE Trans Image Process 27(1):38–49
Election Tweets Prediction Using
Enhanced Cart and Random Forest
Abstract In this digital era, the framework and working process of election and
other such political works are becoming increasingly complex due to various factors
such as number of parties, policies, and most notably the mixed public opinion. The
advent of social media has provided the ability to converse and discuss with a wide range of audiences across the globe, and the amount of attention a single tweet or post can gain is enormous. Recent advances in the area of deep learning have contributed to many different verticals. Techniques such as long short-term memory (LSTM) networks perform sentiment analysis of the posts. This can be used to determine the overall mixed opinion of the population towards a political party or person. Several experiments have shown how to loosely forecast public sentiment by examining consumer behaviour on blogging sites and online social networks during national elections. This paper proposes a machine learning model to predict the chances of winning the upcoming election based on the views of common people or supporters on social media. A supporter or user shares their opinion or suggestions about the group, or the opposing group, of their choice on social media. Text posts about the election and political campaigns are collected, and machine learning models are then developed to predict the outcome.
1 Introduction
The online platform has become an enormous channel for individuals to communicate their preferences. Using various sentiment analysis techniques, the ultimate intent of people can be found, for example, by analyzing the content of a post for its polarity: positive, negative, or neutral. For instance, sentiment analysis is valuable for organizations that wish to hear their clients' insights about their products, to predict the eventual outcomes of elections, and to draw conclusions from movie reviews. The data obtained from sentiment analysis is helpful for predicting future choices. Rather than considering individual terms only, the relation between sets of words is also considered. While determining the overall sentiment, each word's polarity is determined and combined. A bag-of-words model also ignores word order, which can cause phrases with negation to be classified incorrectly. In the past decades, there has been a massive growth in the use of micro-blogging platforms such as Twitter. Prompted by that growth, companies and media organizations are increasingly searching for ways to analyze what people think about their products and services on social platforms like Twitter [1]. Services such as Twitratr, Tweetfeel, and Social Mention are just a few of those that offer tweet sentiment analysis as one of their services [2].
Although a significant proportion of work has been performed on how sentiment is expressed in genres such as academic studies and news reports, significantly less study has been done on micro-blog text [3]. Features such as automatic part-of-speech tags and resources such as sentiment lexicons have proved useful for sentiment analysis in other domains, but will they also prove useful for sentiment analysis on Twitter? This paper begins to investigate this question [4].
2 Literature Survey
Notwithstanding the character limit on tweets, working out the sentiment of Twitter messages is basically similar to sentence-level sentiment analysis; however, the informal and specialized language used in tweets, as well as the very nature of the micro-blogging domain, makes Twitter sentiment analysis a task in its own right [5]. It is an open question how well the features and procedures used on more well-formed data will transfer to the micro-blogging domain [6].
The work in [7] involves steps such as data collection, pre-processing of documents, sentiment identification, sentiment classification, and model training and testing. This research subject has grown over the last decade, with model accuracies reaching approximately 85–90% [8].
In [9], the authors first presented a sentiment analysis method to handle the highly unstructured data on Twitter. Second, they discussed various techniques used for sentiment analysis of Twitter data.
3 Methodology
The following figure shows the steps followed in the proposed model (Fig. 1).
Decision Tree
As the implementation of machine learning algorithms is mainly intended to solve problems at the industry level, the need for more complex and iterative algorithms is becoming an increasing requirement. The decision tree algorithm is one such algorithm, used to solve both regression and classification problems.
The decision tree is considered one of the most useful algorithms in machine learning because it can be used to solve many challenges (a minimal illustrative sketch follows the list below). Here are a few reasons why a decision tree should be used:
(1) It is considered the most comprehensible machine learning algorithm and can easily be interpreted.
(2) It can be used for both classification and regression problems.
(3) It deals better with nonlinear data than most machine learning algorithms.
(4) Building a decision tree is a very quick process, since it uses only one feature per node to divide the data.
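The sketch below shows how a CART-style decision tree could be fitted to labelled tweet text with scikit-learn. The tiny in-line dataset, the TF-IDF feature extraction, and all parameter choices are assumptions made purely for illustration, not the authors' actual pipeline.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline

# Toy labelled tweets (1 = positive, 0 = negative); purely illustrative.
tweets = ["great manifesto and leadership", "worst policies ever announced",
          "love the new campaign promises", "terrible governance and corruption"]
labels = [1, 0, 1, 0]

# TF-IDF features feeding a CART-style decision tree (scikit-learn's
# DecisionTreeClassifier uses an optimized CART algorithm with Gini impurity).
model = make_pipeline(TfidfVectorizer(), DecisionTreeClassifier(random_state=0))
model.fit(tweets, labels)
print(model.predict(["great campaign ahead"]))  # predicts a sentiment label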
Fig. 1 Workflow of the proposed model: fetching the raw data, pre-processing the retrieved data, implementing the algorithms, and retaining the accuracies from the algorithms applied
Random Forest
The random forest algorithm works by aggregating the predictions from multiple decision trees of different depths. Each decision tree in the forest is trained on a subset of the dataset called the bootstrapped dataset.
The portion of samples left out when constructing each decision tree in the forest is referred to as the out-of-bag (OOB) dataset. The model can automatically evaluate its own performance by running each of the samples in the OOB dataset through the forest.
Recall how an impurity measurement is computed for each feature, using the Gini index or entropy, when deciding on the criterion with which to split a decision tree. In a random forest, however, only a predefined number of randomly chosen features are considered as split candidates. This results in a greater difference between the trees, which would otherwise have the same characteristics.
If the random forest is used for classification and a new sample is provided, the final prediction is made by taking the majority of the predictions produced by each individual decision tree in the forest. If it is used for regression and a new sample is provided, the final prediction is the average of the predictions produced by the individual decision trees in the forest.
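The bootstrapping and out-of-bag behaviour described above can be demonstrated with scikit-learn's RandomForestClassifier, as in the sketch below; the synthetic data and parameter values are illustrative assumptions only.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary-classification data standing in for tweet feature vectors.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Each tree is trained on a bootstrapped sample; max_features limits the number of
# randomly chosen candidate features per split, and oob_score=True evaluates the
# forest on the out-of-bag samples each tree did not see during training.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                bootstrap=True, oob_score=True, random_state=0)
forest.fit(X, y)
print("OOB accuracy:", forest.oob_score_)
print("Majority-vote prediction for one sample:", forest.predict(X[:1]))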
Logistic Regression
Logistic regression is a statistical model which, in its basic form, uses a logistic function to model a binary dependent variable, although many more complex extensions exist. In regression analysis, logistic regression estimates the parameters of a logistic model.
Mathematically, a binary logistic model has a dependent variable with two possible values, for example, pass/fail, represented by an indicator variable coded "0" and "1". In the logistic model, the log-odds (the logarithm of the odds) for the value labelled "1" is a linear combination of one or more independent variables ("predictors"); the independent variables can each be a binary variable (two classes, coded by an indicator variable) or a continuous variable (any real value). The corresponding probability of the value labelled "1" can vary between 0 (certainly the value "0") and 1; the function that converts log-odds to probability is the logistic function, hence the name.
The unit of measurement for the log-odds scale is called a logit, from logistic unit, hence the alternative names. Analogous models with a different sigmoid function instead of the logistic function, such as the probit model, can be used as well; the defining characteristic of the logistic model is that increasing one of the independent variables multiplicatively scales the odds of the given outcome at a constant rate, with each independent variable having its own parameter; for a binary dependent variable this generalizes the odds ratio.
The binary logistic regression model has extensions to more than two levels of the dependent variable: categorical outputs with more than two values are modelled by multinomial logistic regression, and if the multiple categories are ordered, by ordinal logistic regression, for example, the proportional odds ordinal logistic model.
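As a concrete illustration in the context of tweet sentiment, a logistic regression classifier over TF-IDF features might be sketched as follows; the toy dataset and settings are assumptions for demonstration only, not the authors' configuration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled tweets (1 = positive, 0 = negative); purely illustrative.
tweets = ["fantastic rally turnout today", "disappointed with the broken promises",
          "strong vision for development", "angry about rising unemployment"]
labels = [1, 0, 1, 0]

# The classifier learns weights so that the log-odds of the positive class are a
# linear combination of the TF-IDF features; predict_proba applies the logistic
# function to map the log-odds back to probabilities between 0 and 1.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(tweets, labels)
print(clf.predict_proba(["strong turnout today"]))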
Fig. 3 Classification of the retrieved data using logistic regression, represented in the form of a plot
Considering that election outcomes are very difficult to predict using other methods, including public opinion polls, and with social media such as Facebook and Twitter increasingly prevalent, the authors chose to use Twitter sentiment analysis to forecast the Indian general election results (Table 1).
Table 1 Accuracies of algorithms
Algorithm             Accuracy of tweets (Neg)   Accuracy of tweets (Pos)
CART                  0.8789                     0.9437
Random forest         0.9155                     0.9493
Logistic regression   0.8845                     0.9268
From the above table, the overall accuracy of the tweets obtained using the CART algorithm is 91.13%, using the random forest algorithm 93.24%, and using logistic regression 90.56%. From these results, it is observed that the random forest algorithm works best on the election tweets data.
5 Conclusion
In this research work, in order to expand the dataset size, there are several other prospective sources for this analysis, including data from other major social networking sites such as Twitter. It is also found that there is a dedicated research space to work with the training dataset by considering a model dataset that already specifies a certain number of algorithmic features. The major downside of this research work is that it fails to recognize the significant parameter of emotion when defining the polarity of a tweet. Since the data was labelled manually, the volume was not high enough to obtain more precise information, so more tweets can be obtained and labelled. As a continuation of this research work, the network size will be increased.
References
1. Malika M, Habiba S, Agarwal P (2018) A novel approach to web-based review analysis using
opinion mining. In: International Conference on Computational Intelligence and Data Science
(ICCIDS 2018) , Department of Computer Science and Engineering, Jamia Hamdard, New
Delhi-110062, India
2. Agarwal A, Xie B, Vovsha I, Rambow O, Passonneau RJ (2011) Sentiment analysis of Twitter
data
3. Popularity analysis for Saudi telecom companies based on Twitter Data. Res J Appl Sci Eng
Technol (2013)
4. Liu B (2012) Sentiment analysis and opinion mining, Morgan & Claypool Publishers
5. Joshi S, Deshpande D (2018) Twitter sentiment analysis system. Department of Information
Technology, Vishwakarma Institute of Technology Pune, Maharashtra, India
6. Agarwal A, Xie B, Vovsha I, Rambow O, Passonneau RJ (2011) Sentiment analysis of Twitter
data. Department of Computer Science Columbia University New York, NY 10027 USA
7. Gupta B, Negi M, Vishwakarma K, Rawat G, Badhani P (2017) Study of Twitter sentiment
analysis using machine learning algorithms
8. Umadevi V (2014) Sentiment analysis using weka. IJETT Int J Eng Trends Technol 8(4):181–
183
9. Techniques for sentiment analysis of Twitter data: a comprehensive survey. In: 2016
International Conference on Computing, Communication and Automation (ICCCA)
10. Caetano JA, Lima HS, Santos MF, Marques-Neto HT (2018) Using sentiment analysis to define
twitter political users’ classes and their homophily during the 2016 American Presidential
election
11. Bansala B, Srivastavaa S (2019) On predicting elections with hybrid topic based sentiment
analysis of tweets. Department of Applied Sciences, The NorthCap University, Gurugram,
India
12. Guerini M, Gatti L, Turchi M (2013) Sentiment analysis: how to derive prior polarities from
SentiWordNet
13. Barahate SR, Shelake VM (2012) A survey and future vision of data mining in educational field.
In: Proceedings 2nd International Conference on Advanced Computing and Communication
Technology, pp 96–100
14. Chen X, Vorvoreanu M, Madhavan K (2014) Mining social media data to understand students’
learning experiences. IEEE Trans 7(3):246–259
15. Twitter data sentiment analysis and visualization. Int J Comput Appl 180(20)
16. Al Amrani Y, Lazaar M, Kadiri KE (2018) Random forest and support vector machine based
hybrid approach to sentiment analysis. Author links open overlay panel
17. Venugopalan M, Gupta D (2016) Exploring sentiment analysis on Twitter data, IEEE 2015
Flexible Language-Agnostic Framework
To Emit Informative Compile-Time
Error Messages
Syntax errors constitute one of the largest classes of programming errors. Programmers spend a significant amount of time in correcting them [1]. Dozens of syntax-error-detecting environments, such as Eclipse [2] and IntelliJ [3], exist in the programming world. But there is no universal environment available for all languages. As
programmers usually use different languages based on the problem that they are
solving, they would require a language-agnostic framework to detect errors for all
the programming languages that they use.
In addition to this, popular programming languages such as Python and C [4]
produce unhelpful error messages [5–7] that provide little or no help to program-
mers in identifying these errors. For instance, "Syntax Error: invalid syntax", "error:
expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘{’ token" are common syntax
errors seen which provide minimal assistance to programmers [8].
To address these issues, we developed a language-independent framework that
provides a template to specify user-friendly error messages. A domain expert can
customize the default syntax error messages to any brief or elaborate error message
based on the use-case. A standard compiler/interpreter environment does not offer
this feature. For example, the error messages can be customized to be more elaborate
for novice users [9]. For a competitive coding environment, the strictness of syntax
error detection can be lenient. This would ensure the code passes any test case if
the logic is right without considering incorrect syntax. This sort of flexibility is not
available in any other framework for all languages.
Existing work to detect syntax errors uses machine learning techniques or abstract syntax trees. Sahil et al. [10] use a neural network trained over a dataset that consists of
assignments submitted by students. This method requires high computational power
and average training periods of 1.5 hours. The performance of the neural network is
dataset dependent. On the other hand, using an abstract syntax tree [11] can handle
only one syntax error at a time. The first error detected has to be rectified before
the system can detect other errors. It is not ideal for a large program that can have
multiple syntax errors at once. The programmer needs to rectify one error at a time
which is time-consuming.
Due to the above-mentioned limitations, we consider an alternative approach
based on compiler theory. Our framework identifies the compile-time syntax errors
in a user given program based on the grammar provided for a language.
The grammar consists of error productions for every syntax error. Our framework
provides the flexibility to supply error productions for any programming language. A
PLY [12] program (pure-Python implementation of the popular compiler construction
tools LEX and YACC) is generated using these error productions. We adopt a novel
approach where we have a Python code that automatically generates the PLY code.
The PLY program is a transformation of the error production rules to a suitable format.
The generated PLY program is then run to give better diagnostics of the code that is uploaded by the user. The error productions provided and the uploaded
program must be of the same programming language. As a proof of concept, we
have written error productions in Python and C to help identify syntax errors in both
these languages. Any future use of this framework would require the domain expert
to provide the error productions for the language in the required format.
Our framework has the following features:
1. Our framework allows customization of syntax error messages that enable more
instructive error messages to be emitted compared to standard compiler/inter-
preter.
2. The PLY program used to identify and correct the syntax errors is automatically
generated using another program.
3. Our framework converts a language-specific input to a language-agnostic PLY
program making it language independent.
4. All the syntax errors in a program are detected at once making it easy for under-
standing compile-time errors.
The rest of this paper is structured as follows. We describe the architecture of our
framework in Sect. 2. Section 3 is devoted to the results obtained. Lastly, we present
our conclusions in Sect. 4.
2.1 Architecture
Our framework can be divided into the following components as shown in Fig. 1.
Each error production consists of a non-terminal called the left-hand side of the production, an arrow, and a sequence of tokens and/or terminals called the right-hand side of the production. This is followed by a customized error message, separated from the error production using a delimiter (“). The customized error messages provide a clear description of the syntax errors. One of the non-terminals is designated as the start symbol from where the production begins. One such error production in Python, to identify a missing colon after a leader, is shown below (following Fig. 1):
Fig. 1 Components of the framework: the error productions are input to PLY, which generates the PLY program; the user program is input to the generated PLY program, which produces the output
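The original listing of this production is not reproduced in the extracted text; purely as an illustration, a missing-colon error production written in the production format used later in this paper (see the future-work examples) could look like:

Program -> DEF SPACE ID ( Parameter_List ) Funcbody ‘‘ Missing Colon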
2.1.2 PLY
Once we have the error productions, we generate a PLY [12] program. PLY is a pure-Python implementation of the compiler construction tools LEX and YACC. PLY uses LALR(1) parsing. It provides input validation, error reporting, and diagnostics.
The LEX.py module is used to break input text into tokens specified by a collection of regular expression rules. Some of the tokens defined in LEX are: NUMBER, ID, WHILE, FOR, DEF, IN, RANGE, IF. The following assigns one or more digits to the token NUMBER:
def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t
YACC.py is a LALR parser used to recognize syntax of a language that has been
specified in the form of a context free grammar.
expr : expr '+' expr { $$ = node('+', $1, $3); }
The input to YACC is a grammar with snippets of code (called “actions”) attached
to its rules. YACC is complemented by LEX. An external interface is provided by
LEX.py in the form of a token() function which returns the next valid token on the
input stream. This is repeatedly called by YACC.py to retrieve tokens and invoke
grammar rules such as:
def p_1(p):
    "Program : PRINT '(' NUMBER ')' "
    # the rule body emits the customized error message associated with this production
Here, action[2] corresponds to the customized error message mentioned in the file.
The Python program takes the grammar of any programming language and generates the language-specific PLY code. This automatic generation of the PLY program makes the tool language independent.
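A rough sketch of what this generation step might look like is given below. The production format and the message delimiter follow the examples in this paper (the delimiter is written here as two backticks), while the function names and the use of print() for the message are assumptions for illustration rather than the authors' actual generator.

# Hypothetical sketch of the PLY-code generation step described above.
PRODUCTION = "Program -> DEF SPACE ID ( Parameter_List ) Funcbody `` Missing Colon"

def emit_ply_rule(production_line, index):
    """Turn one error production into the text of a PLY grammar-rule function."""
    rule_part, _, message = production_line.partition("``")
    lhs, _, rhs = rule_part.partition("->")
    docstring = lhs.strip() + " : " + rhs.strip()
    return ("def p_%d(p):\n" % index
            + "    \"%s\"\n" % docstring
            + "    print(%r)   # customized error message\n" % message.strip())

print(emit_ply_rule(PRODUCTION, 1))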
This is the code that is uploaded by the user. The automatically generated PLY program identifies the compile-time errors in this code.
The generated PLY program takes the code uploaded by the user as input and provides descriptive error messages for the compile-time errors. This PLY program is automatically generated, and no changes are made to it while detecting syntax errors for different languages.
Some of the syntax errors which were tackled are:
– Missing or extra parenthesis
– Missing colon after leader
– Indentation errors
– Missing, misspelled or misuse of Keywords
– Mismatched quotes
– Misuse of the assignment operator (=).
One of the challenges is identifying the various types of errors possible. For
example, consider the following code snippet:
1 print("Hello World")
Some of the different possible syntax errors for the above code snippet are:
1 #Missing left parenthesis
2 print "Hello World")
3
4 #Missing right parenthesis
5 print ("Hello World"
6
7 #Extra parenthesis on the left
8 print (("Hello World")
9
10 #Misspelled keyword
11 prnt ("Hello World")
12
13 #Mismatched quote
14 print (’Hello World")
3 Results
We conducted a survey with 100 engineering students who had basic knowledge of at least one programming language. As shown in Fig. 3a, students preferred the more elaborate error message (error message 2) emitted by our framework over the standard compiler/interpreter error message.
Fig. 3 Survey
Unlike the existing tools, our framework is able to detect all compile-time syntax
errors at once even for an interpreter environment. Consider the following code
snippet:
1 # check whether a given string is a palindrome or not
2 def isPalindrome(str)
3 for i in range(0, int(len(str)/2)):
4 if str[i] != str[len(str)-i-1]:
5 return False
6 return True
7 s = "malayalam"
8 ans = isPalindrome(s)
9 else:
10 print("No")
Grammars are used to describe the syntax of a programming language, and hence
any syntactically correct program can be written using its production rules. Our
framework identifies the compile-time syntax errors in a user given program using
a novel approach. This involves modifying the grammar provided for a language to
contain error production rules to detect possible syntax errors. Our approach has not
been used before. We use a Python program to generate the PLY program that detects
syntax errors.
Our framework is language independent, and no code changes are required while working with different programming languages. This makes our framework flexible: the framework remains the same irrespective of the programming language chosen. The only requirement is that the error productions provided and the uploaded user program are of the same language.
The following describes the improvements and further steps that could be taken
with this framework.
This framework could be extended to detect multiple syntax errors on a single line.
For example, consider the following code snippet:
1 prnt "Hello World")
The above code snippet has two syntax errors. First, misspelled keyword, and second,
missing left parenthesis. However, since our framework scans tokens from left to
right, only the first error is detected. The second error is detected only after correcting
the first error.
Presently, the rules to identify the errors have to be provided by the domain expert.
This can be auto generated. Instead of specifying all possible error productions, the
different error productions for a given correct production can be generated automat-
ically. For example, consider the following correct production:
1 Program -> DEF SPACE ID ( Parameter_List ) : Funcbody
Using the correct production, the following error productions could be generated:
1 Program -> DEF SPACE ID ( Parameter_List ) Funcbody ‘‘ Missing Colon
2 Program -> ID ( Parameter_List ) : Funcbody ‘‘ Missing keyword ’def’
3 Program -> DEF SPACE ID (( Parameter_List ) : Funcbody ‘‘ Extra left parenthesis
References
1. Kummerfeld SK, Kay J (2003) The neglected battle fields of syntax errors. In: Proceedings of the fifth Australasian conference on computing education, vol 20
2. Eclipse IDE (2009) Eclipse IDE. www.eclipse.org. Last visited 2009
3. IntelliJ IDEA (2011) The most intelligent Java IDE. JetBrains. Dostupné z: https://www.
jetbrains.com/idea/. Cited 23 Feb 2016
4. https://www.tiobe.com/tiobe-index/
5. Javier Traver V (2010) On compiler error messages: what they say and what they mean. Adv
Hum-Comput Interact Article ID 602570:26. https://doi.org/10.1155/2010/602570
6. Becker BA et al (2019) Compiler error messages considered unhelpful: the landscape of text-
based programming error message research. In: Proceedings of the working group reports on
innovation and technology in computer science education, pp 177–210
7. Becker BA et al (2018) Fix the first, ignore the rest: Dealing with multiple compiler error mes-
sages. In: Proceedings of the 49th ACM technical symposium on computer science education
8. Brown P (1983) Error messages: the neglected area of the man/machine interface. Commun
ACM 26(4):246–249
9. Marceau G, Fisler K, Krishnamurthi S (2011) Mind your language: on novices’ interactions
with error messages. In: Proceedings of the symposium on new ideas, new paradigms, and
reflections on programming and software, pp 3–18
10. Sahil B, Rishabh S (2018) Automated correction for syntax errors in programming assignments
using recurrent neural networks
11. Kelley AK (2018) A system for classifying and clarifying python syntax errors for educational
purposes. Dissertation. Massachusetts Institute of Technology
12. Beazley D (2001) PLY (Python lex-yacc). See http://www.dabeaz.com/ply
Enhancing Multi-factor User
Authentication for Electronic Payments
Abstract Security is becoming more and more important for electronic transactions nowadays, and the need for it is greater than ever before. A variety of authentication techniques have been established to ensure the security of electronic transactions. The usage of electronic payment systems has grown significantly in recent years. To secure confidential user details from attacks, the finance sector has begun to implement multi-factor authentication. Multi-factor authentication is a device access management strategy that requires an individual to move through various authentication phases. In previous work, attempts have been made to secure the electronic payment process by using various authentication methods, and despite their advantages, each had a downside. Password-based authentication is one of the most common ways for users to authenticate in numerous online transaction applications. However, an electronic payment authentication mechanism that relies mainly on traditional password-only authentication cannot efficiently resist the latest password guessing and password cracking attacks. In order to handle this problem, this paper proposes an authentication algorithm for electronic payments by adding a multi-factor mechanism to the existing user authentication mechanism. The enhancement concentrates on strengthening the multiple-factor user authentication elements using a biometric technique. The software is developed using an Android emulator, a platform that helps developers evaluate an application without the requirement for it to be built on a real device. The proposed system has two phases, namely a registration stage and an authentication stage. The suggested authentication protocol gives users safe access to authorization through multi-factor authentication using their password and
fingerprint. In order to ensure that unauthorized users cannot easily break into the mobile application, the proposed enhancement provides better security.
1 Introduction
Multi-factor authentication has been adopted on the Internet in order to improve user authentication and to make it harder for attackers to access and crack systems. It provides information security for companies and prevents them from crashing or losing money. When online transfers take place, consumers still worry about hackers and anti-social activities as they transfer money from one account to another. Nevertheless, it is essential to validate users so as to keep user information safe in the cloud, and cryptographic techniques are required to encrypt this sensitive data. The most significant use of multi-factor authentication is to ensure that only authenticated or authorized users are entitled to process their financial transactions in financial services such as online banking and Internet banking. The great growth in electronic transactions has been matched by an equal increase in security attacks against electronic payment systems. Some of these attacks exploit the weaknesses of user authentication systems that operate online.
The authentication process requires users to enter a password; if it matches the one on record, the user is authenticated and otherwise is not permitted to sign in to the system. Authentication is always the first step for online transactions. Password-based authentication is one of the most common ways for users to authenticate in numerous mobile applications and is known as single-factor authentication. However, password-based authentication schemes have many issues, and the risk of relying on passwords for corporate application authentication is considerable. One of the major problems with password-based authentication is that the majority of users do not know how strong their password is. Two-factor authentication, the extra security step that requires individuals to enter a code sent to their email or phone, has usually been effective in keeping usernames and passwords protected from attacks. The usage of these two factors has reduced fraud but has not stopped it [1]. The drawbacks of two-factor authentication include the use of too many tokens, token forgery, token costs, and lost tokens [2, 3]. Nevertheless, security industry specialists have confirmed an automated phishing attack that can cut through this additional level of protection, known as 2FA, trapping unsuspecting users into sharing their private credentials [4]. The evolution of authentication techniques from SFA to MFA can be seen in Fig. 1.
Authentication strategies that rely on more than one factor are more difficult to compromise than single-factor methods. A multi-factor authentication feature is necessary to render the solution successful and secure in order to improve the protection of online transactions.
Fig. 1 Evolution of authentication techniques from SFA to MFA [5]: single-factor authentication (knowledge factor: PIN, password, security question), two-factor authentication (ownership factor: smartphone, key-card, one-time password), and multi-factor authentication (biometric factor: fingerprint, face, iris, voice, vein, etc.)
This paper intends to design and apply an electronic payment authentication scheme for a secure online payment system; in this system, the payment process requires multiple authentications for a transaction rather than sending it directly to a customer. The emerging trend now is biometric authentication. Because of its high protection level it is really user-friendly, and the fingerprint login option is increasingly used. Biometrics is the process by which a person's physiological features are authenticated. These characteristics are unique for each person. They can never be stolen or duplicated, they resist different forms of attack, and they allow us to secure our personal records. Our proposed multi-factor authentication consists of several stages of authentication that the person has to go through. The person must first be authenticated through their password and fingerprint biometric to proceed with the validation process. Only after all of these steps is the person authenticated, and only then is the user able to access their account.
The article is divided into the following parts. The first section provides a description of the existing system and the formulation of the problem. The implementation of electronic payments and its related analysis, as already discussed in the literature, is reviewed in the following section. The third section explains the overall method. Section 4 outlines the implementation of the model, and the conclusion and potential research form the final section.
2 Literature Review
A variety of authentication strategies have been created to ensure the protection of electronic transactions. Single-factor authentication (SFA) enables device access through a single authentication process. A simple password authentication scheme is the most common example of SFA. The system prompts the user for a name followed by a password. Passwords are saved on the server side in encrypted form or using hash functions, and the username as well as the password are transmitted in encrypted form over a secure connection. Therefore, if an intruder gains access to the system, there is little worry about leakage of information, as it will not expose anything about the real password. Though it appears protected, in practice it is significantly less secure, as an attacker is able to obtain the original password of a customer using various attacks after a few combinations [6]. By sharing the password, one may compromise the account right away. An unauthorized user may also try to gain access through the use of different kinds of attack, such as brute force attacks [7–9], rainbow table attacks [10, 11], dictionary attacks [12–14], social engineering [15–17], phishing [18, 19], MITMF [20, 21], password-based attacks [7, 22], session hijacking [23], and malware [19, 24, 25].
An image-based single-factor authentication scheme is taken in Bajwa [26]. The drawback of that system is that it takes more time for authentication, and shoulder surfing is possible in this method. A common technique used in electronic payment authentication requires drawing pattern codes on the display screen, an approach taken by Vengatesan et al. [27]. To mitigate the problems of single-factor authentication, two-factor authentication was considered a solution for securing online transactions and recognizing the authenticated person logging in to a system or application, and many current and new companies are racing to deploy it. Two-factor authentication is a mechanism that implements more than one factor and is considered stronger and more secure than the traditionally implemented single-factor authentication system. Two-factor authentication using hardware devices, such as tokens or cards, and OTPs has been competitive and difficult to hack, and was adopted to solve these problems.
There are plenty of benefits to the 2FA technique, but adversaries have been working on breaking this strategy and have discovered a number of ways to hack it and expose the very sensitive information of users.
Although 2FA systems are powerful, they still suffer from malicious attacks such as lost/stolen smart-card attacks, token costs, token forgery, and lost tokens [3], capturing a fake fingerprint from the original fingerprint, as well as insider attacks [28]. Due to the increase in online transactions, two-factor authentication is not enough for performing expensive transactions online [2]. There is therefore a necessity for a stronger and more secure system based on multi-factor authentication to check the validity of users. Principally, MFA involves different factors such as, for instance, biological and biometric factors, a smart mobile device, a token unit, and a smart card. This authentication scheme enhances the security level and also provides for the use of identification, verification, and then authentication for guaranteeing user authority. In order to mitigate the issue of two-factor authentication,
attempts have been made in each of the previous works to secure the electronic payment process by using various methods, and despite their advantages, each had a downside. Here, several existing deficiencies of approaches dependent on RSA encryption algorithms are addressed and a protected method is suggested. This paper proposes a multi-factor authentication algorithm for electronic payments.
The proposed system has two phases, namely a registration stage and an authentication stage. A comprehensive explanation of each phase is provided below. Table 1 presents the notations used in our proposed system. Before making use of the app, the person should register their information during a procedure known as the registration phase. Verification of that information can then be achieved by a procedure known as the authentication phase.
Table 1 Notations used in the proposed system
Notation   Description
Ui         User
IDi        Unique identifier of user
UiEid      User email id
UiP        User phone number
UiA        User address
UiPass     Password of the user
UiBF       Biometric feature of user
DBs        Server database
Each of the suggested materials and strategies is applied in the system during both the registration process and the authentication procedure; their process flow is reviewed in this section.
To be able to use the service, the person needs to complete a one-time registration. The registration stage is used to gather all of the user's information; using that information, when the person later tries to log in to the system, the server checks whether the person is legitimate. Here, the person has to register their account together with their details; we explain the registration steps as follows (Fig. 2):
Step-1: Start
Step-2: Input user information
Ui = IDi + UiEid + UiP + UiA + UiPass + UiBF
Step-3: Combining all user information store in database
DBs = IDi + UiEid + UiP + UiA + UiPass + UiBF
Step-4: Registration complete
End: Welcome notification for registration.
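A minimal sketch of these registration steps is given below, written in Python purely for illustration (the authors' actual implementation is an Android application). Storing a salted hash of the password, and the field names used, are assumptions rather than details stated in the paper.

import hashlib

db = {}  # stands in for the server database DBs

def register(uid, email, phone, address, password, fingerprint_template):
    # Step-2/3: combine all user information and store it in the database.
    db[uid] = {
        "email": email,
        "phone": phone,
        "address": address,
        # assumption: store a salted hash rather than the raw password
        "pass_hash": hashlib.sha256((uid + password).encode()).hexdigest(),
        "fingerprint": fingerprint_template,
    }
    # Step-4 / End: registration complete, welcome notification.
    return "Welcome! Registration complete."

print(register("user01", "u@example.com", "0123456789", "Kuala Lumpur", "s3cret", b"FP-TEMPLATE"))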
When the customer tries to sign in to the system, the authentication server has to authenticate the user. If both values are identical, the authentication is successful. In detail, the person must authenticate him/herself using several authentication actions for the login process as well as the transaction process. The authentication module comprises two primary processes: the login process and the authentication process. In the login procedure, the person has to log in using the registered password and number, fingerprint, and IMEI for authentication. After the user logs in to the system, the person is only able to see the account details.
Fig. 2 Registration process flow (all details are checked; if they match, registration is successful and a congratulation notification is shown)
On the other side, for a transaction, the person needs to authenticate themselves again using fingerprint authentication. Only when the person authenticates with the fingerprint details can the transaction be completed. The comprehensive process is described in the following steps (Fig. 3):
Step-1: Start
Step-2: Input user password
Ui = UiPass
Fig. 3 Authentication process flow (the registered fingerprint is input and checked; then the password is input and checked; if both match, authentication is successful)
Step-3: Input user fingerprint (Ui = UiBF); if the password and fingerprint match the stored values, go to Step-4, else repeat Step-3
Step-4: Authentication successful
Step-5: End: access granted; go to the next module.
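Continuing the illustrative sketch from the registration phase, the two checks of the authentication steps above (password and fingerprint, as in Fig. 3) might be outlined as follows. This is an assumption-laden outline in Python, not the authors' Android code; a real system would call the platform's biometric API rather than compare raw templates.

import hashlib, hmac

# A stored record of the kind produced in the registration sketch above (illustrative).
stored = {
    "pass_hash": hashlib.sha256(("user01" + "s3cret").encode()).hexdigest(),
    "fingerprint": b"FP-TEMPLATE",
}

def authenticate(uid, password, fingerprint_template):
    # Check the password against the stored (hashed) value.
    pass_ok = hmac.compare_digest(
        hashlib.sha256((uid + password).encode()).hexdigest(), stored["pass_hash"])
    # Check the fingerprint; a real system would delegate this to the platform's
    # biometric API rather than compare raw templates (assumption).
    fp_ok = hmac.compare_digest(fingerprint_template, stored["fingerprint"])
    return "Authentication successful" if (pass_ok and fp_ok) else "Access denied"

print(authenticate("user01", "s3cret", b"FP-TEMPLATE"))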
In this particular registration stage, which is used to gather all the users' information, that information is used when the person later wants to log in to the system. At this stage, the moment the operator enters all of his personal identification, the user is able to obtain access to the system. The registration procedure is shown in Fig. 4. A session is produced for him/her, after which he/she becomes able to access materials and can alter his personal details in the panel. After successful registration and owner approval, the customer will see the profile display of the payment process. In the profile, there will be many service models, including wallet balance, top-up of mobile operators, shopping, as well as adding money. The most common and latest user authentication and account access control of Internet banking systems is based on single-component authentication, namely a user name and a password. The security of password-based authentication, nonetheless, is determined by the strength of the user's chosen password.
The Android software development kit (SDK) was the primary development kit used for this project, based on the scalability of devices the application can be used on and the rich application framework it provides, allowing users to build innovative applications for mobile devices within a Java language environment. The front-end design is easy to use and simple: as soon as the application is started, the person registers himself, and after that he is able to log in to the system. Hardware to support the back end and front end of the application is essential for any application to be built. The software used for this development is Android Studio. Google created Android Studio specifically for Android programming. It has a comfortable interface that helps a user access and check the submission. To fix bugs, the built-in SDK was used to operate the device.
The system is developed to run particularly on the Android Studio virtual device Nexus/Google Pixel 2, and on all other types of smartphone devices that use this technology. The system is independent in that it can serve all Android-based smartphone devices. The Android platform also provides a built-in database (the SQLite database) and Web services. SQLite databases have been built into the Android SDK. SQLite is a SQL database engine which stores data in .db files [36]. Two types of Android SDK storage are widely available, namely internal and external storage. Files saved in internal storage are containerized by default, so other apps on the device cannot access them. Such files are removed when the user uninstalls
the program. On the other side, an Android-compatible device allows external shared storage, which can be either built-in or removable (such as an SD card). Files stored in external storage are world-readable. Once USB mass storage is enabled, the user is able to modify them.
The prototype is evaluated based on the registration and authentication stages. The simulation is run on the Web server side on a DELL laptop computer
with an Intel Core i7 CPU at 3.40 GHz and 6 GB of RAM. The operating system is Windows 10 Professional. Android is an open-source operating system built on Linux, and the Android platform makes everyday activities simple and fast with helpful apps for mobile devices. The Android architecture offers server-to-application compatibility with certain APIs.
Java cryptography architecture (JCA) is used to develop the Android cryptography APIs. The JCA is a collection of APIs for digital signatures, message digests, authentication of credentials and certificates, verification, key generation and management, and secure random number generation. These APIs allow developers to easily integrate security into their application code [37]. For our implementation, we used the javax.crypto APIs, and we developed the API/GUI for the use of a fingerprint reader application programming interface and graphical user interface. Fingerprint authentication is feasible only on smartphones with a touch sensor for user recognition and connection to software and program features. The execution of fingerprint authentication is, at first, an enormous multi-step process. Fingerprint authentication is primarily a cryptographic process involving a key, an encryption cipher, and a fingerprint manager for the authentication function. For an Android device, there are several common ways to incorporate fingerprint authentication. Fingerprint authentication uses two system services: the Keyguard Manager and the Fingerprint Manager. A fingerprint manager is necessary for using the fingerprint analysis. Several of the fingerprint-based methods can be found within the Fingerprint Manager.
A fingerprint scanner and an API source code collection have been used to develop the API/GUI for this study. A new user signs up to the application by clicking the registration button on the welcome page and then submitting his/her information on the registration page. The user will need to sign up in the application on first usage. After registration, the client will obtain a username and password. Our proposed work is device-based, using the keystore. The system stores user data and compares it with the existing database. In this case, authentication will be effective only if the presented data and the existing database match. The Android Keystore is a facility that makes it easier for developers to create and store cryptographic keys in containers. The Android Keystore is an implementation of the Java KeyStore API, which is a repository for authorization certificates or public key certificates, and which is used by Java-based encryption, authentication, and HTTPS-service applications in several situations. The entries in a keystore are encrypted with a password. The most stable and suggested form of keystore is currently a StrongBox-backed Android Keystore [38].
The signup display is shown below; Fig. 4b shows a screenshot of the signup pages. When the registration is complete, the user proceeds to the authentication procedure. If the customer signs up successfully, a confirmation message, "successfully registered", is displayed; Fig. 4c shows the screenshots from the login-phase procedure in steps 1 and 2. Various checks are also performed during user registration. The user has to use their fingerprint and password to log in to the application each time it is used. The authorized user must use the registered fingerprint and password; otherwise, if the user enters a fingerprint
or password that is not registered, the user will get a notification message that reads "incorrect fingerprint or password."
5 Conclusion
The proposed method is used for a mobile application and for security in an electronic payment system, using a biometric verification feature to validate the fingerprint model registered at the time of registration. The customer can perform the transaction, and protection is provided: if the fingerprint matches the samples in the database, authentication will succeed. The system gives users secure access to authorization through multi-factor authentication using their password and fingerprint. This approach simply guarantees security and trust in the financial sector. Our algorithm provides an extra protection layer that stops hackers from succeeding with phishing and social engineering. The proposed solution strengthens the existing authentication system. It greatly increases the protection of mobile banking networks by offering three-dimensional assurance from three separate areas: knowledge, inherence, and possession. It also improves the user experience by making verification simpler for consumers. This process can be used by anyone who has a smart device that supports biometric fingerprint authentication.
Acknowledgements The authors would like to thank the anonymous reviewers for their helpful feedback. This research was funded by a research grant from Ya-Tas Ismail (Universiti Kebangsaan Malaysia), grant code EP-2018-012.
Comparative Analysis of Machine
Learning Algorithms for Phishing
Website Detection
Abstract The Internet has become the most effective medium for leveraging social inter-
actions during the COVID-19 pandemic. Users' immense dependence on digital
platforms increases the chance of fraud. Phishing attacks are the most common form of
attack in the digital world. Any communication method can be used to target an
individual and trick them into leaking confidential data in a fake environment, which
can later be used to harm the victim or even an entire business, depending on the
attacker's intent and the type of leaked data. Researchers have developed numerous
anti-phishing tools and techniques, such as whitelists, blacklists, and antivirus software,
to detect web phishing. Classification is one of the techniques used to detect website
phishing. This paper proposes a model for detecting phishing attacks using various
machine learning (ML) classifiers. K-nearest neighbors, random forest, support vector
machines, and logistic regression are used as the machine learning classifiers to train
the proposed model. The dataset in this research was obtained from the public online
repository Mendeley, with 48 features extracted from 5000 phishing websites and 5000
real websites. The model was analyzed using F1 scores, which take both precision and
recall into consideration. The proposed work concludes that the random forest classifier
achieved the most efficient and highest performance, with 98% accuracy.
D. Sarma (B)
Department of Computer Science and Engineering, Rangamati Science and Technology
University, Rangamati, Bangladesh
e-mail: dhiman001@yahoo.com
T. Mittra
Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh
R. M. Bawm · T. Sarwar · F. F. Lima · S. Hossain
Department of Computer Science and Engineering, East Delta University, Chittagong, Bangladesh
1 Introduction
Today’s digital world is increasingly carried out in wide range of platforms from
business to health care. Massive online activities open door for cyber criminals.
Phishing is the most successful and dangerous cyber-attack observed across the
globe. Phishing attacks are dangerous and it can be avoided by simply creating
awareness and developing the habits of staying alert and continuously being on the
lookout when surfing through the Internet and by clicking links after verifying the
trustworthiness of the source links. There are also tools such as browser extensions
that notify users when they have entered their credentials on a fake site, therefore,
possibly having their credentials transferred to a user with malicious intent. Other
tools can also allow networks to lock down everything and only allow access to
whitelisted sites to provide extra security while compromising some convenience on
the user side [1–4].
A company can take several measures to protect itself from phishing attacks, but the
core problem remains that it still relies, to some extent, on employees being careful
and alert at all times. While machines can be made reliable, humans cannot be
reprogrammed. A mistake from a single employee could be enough to create a
vulnerability that an attacker can skillfully exploit, causing damage to an entire
company if not detected and contained in time. Security is a significant concern for
any organization [5–9, 22].
This paper employs the concepts of machine learning to train a model that learns to
detect links that could be attempting a phishing attack, allowing the machine to become
an expert at detecting such sites and alerting humans without relying too heavily on
human vigilance. By using artificial intelligence, this research intends to add another
layer of security that tirelessly detects phishing sites, improves its performance over
time as more data become available for learning, and allows humans to share their
responsibility for careful Internet surfing with machines.
2 Related Research
This section highlights research works pertinent to phishing attacks and the essential
classification techniques that have been practiced to detect web phishing.
With the current boom in technology, phishing has become more popular among malicious
hackers. The first-ever phishing lawsuit was filed in 2004, when a phisher created a
duplicate of the popular website "America Online". With the help of this fake website,
he was able to obtain the personal information and bank details of many individuals.
Phishers then began to focus on websites that handled online transactions and created
legions of fake websites to trick unsuspecting people into thinking they were the real
ones. Table 1 shows various types of phishing attacks.
the probability of being spam. Phishers use spam e-mails to direct a client to their
malicious webpage and steal data.
3 Methodology
As this paper mainly employs machine learning techniques to train models that can
detect phishing websites, the first step is to understand how machine learning works.
In a nutshell, every machine learning technique involves a dataset and programmed code
that analyzes a portion of the data and learns the relationships between the features
and the class labels. The machine's trained knowledge of these relationships is then
tested against the rest of the data, and its performance is measured and scored. Based
on the performance of the model, the training setup and the dataset preprocessing were
adjusted in the hope of better results in the next training iteration. If a model
failed to provide satisfactory results, other techniques were employed where relevant
to the dataset. If a model performed better than all other trained models, however, it
was stored and evaluated on new, unseen data to verify its performance further.
It is important to note that different datasets can come in different formats, and
therefore new datasets introduced to the model might require preprocessing to
maintain optimal performance.
Figure 1 illustrates the overall process.
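As a rough illustration of this workflow, the following minimal Python sketch (not taken from the paper; the classifier choices, split ratio, and hyperparameters are assumptions) trains several candidate classifiers on one portion of the data, scores each on the held-out portion with the F1 score, and keeps the best performer:

# Minimal sketch of the iterative train/evaluate/compare workflow described above.
# X and y are assumed to hold the 48 features and the 0/1 labels of the dataset.
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import f1_score

def select_best_model(X, y):
    # Hold out a portion of the data for testing the trained knowledge.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42)

    candidates = {
        "knn": KNeighborsClassifier(),
        "svm": SVC(),
        "logreg": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=100),
    }

    best_name, best_model, best_f1 = None, None, -1.0
    for name, model in candidates.items():
        model.fit(X_train, y_train)                       # train on one portion
        score = f1_score(y_test, model.predict(X_test))   # test on the rest
        if score > best_f1:                               # keep the best performer
            best_name, best_model, best_f1 = name, model, score
    return best_name, best_model, best_f1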
3.1 Dataset
The dataset in this research was obtained from the public online repository Mendeley.
The dataset contains 48 features extracted from 5000 phishing websites and 5000
real websites. An improved feature extraction technique was applied to this dataset
using a browser automation framework. The class label indicates two outcomes,
where 0 denotes a phishing website and 1 denotes a real website.
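A minimal loading sketch is given below, assuming the Mendeley CSV is named Phishing_Legitimate_full.csv with a CLASS_LABEL column; these names are assumptions about the file layout, not details stated in the paper:

# Load the phishing dataset and separate the features from the class label.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("Phishing_Legitimate_full.csv")   # assumed file name

label_col = "CLASS_LABEL"                          # assumed label column name
y = df[label_col]                                  # 0 = phishing, 1 = real website
X = df.drop(columns=[label_col])                   # drop the label (and an id column, if present)

print(y.value_counts())                            # expected: 5000 phishing, 5000 legitimate

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)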
Any collected dataset usually comes with errors, inconsistent formats, differing
features, incomplete sections, and so on. If such a dataset is used directly to train a
model, it can lead to unexpected behavior, and the results rarely satisfy the expected needs.
Step 6. Feature scaling: Feature scaling limits the range of variables so that they can
be compared on common grounds. However, it did not need to be applied to our dataset.
3.3 Classifiers
K-nearest neighbors, random forest, support vector machines, and logistic regression
were picked as the machine learning techniques to train our model. After training, the
models were analyzed using F1 scores, which take into account both the precision and
recall of the models. Each model was judged on how much bias it exhibited when
predicting the labels for a sample of data, and on how much the fit of the data
differed between the test set results and the training set results, a measurement
referred to as variance.
Random Forest
Random forests are made using decision trees, so it is important to understand
decision trees before understanding random forests.
Decision trees are built from data by analyzing the features in a dataset and creating
a root node using the feature that has the most impact on the label. This impact can be
measured using scoring techniques such as the Gini index. Once a root has been chosen,
the remaining features are analyzed and scored in the same way as the root, and the
most significant of them is added as a child of the root. This procedure is repeated
until all of the features have been added to the tree.
When a label is to be decided, the root feature is evaluated first and its value
determines the path to take from that node; the next node and its corresponding feature
then decide the subsequent path. The process is repeated until a leaf node, the end of
the tree, is reached, where the decision is finalized and a label is produced by the model.
Although decision trees are good at predicting labels for the dataset used to create
them, they are not as good at predicting labels for an entirely new set of features and
are considered somewhat inaccurate in their predictive capabilities. This inaccuracy
can be reduced by using random forests.
The first step in generating a random forest was using the dataset to create a
bootstrapped dataset. This new dataset would contain samples from the original
dataset but would be randomly selected and placed into the new table, with the
possibility of some samples existing in the new table more than once.
The second step was to select a random subset of the features and analyze it, using our
chosen scoring technique, to generate the root of the decision tree. To add children to
the root, another random subset of the remaining features was selected and analyzed to
pick the next child. These steps were repeated several times to generate a wide variety
of decision trees, which increased the accuracy of our model compared with using one
individual decision tree.
To label an unlabeled sample of data, all of the decision trees predict a label, the
labels produced by each tree are tallied, and the label predicted by the largest number
of trees is selected. This most popular label is chosen as the final prediction and is
usually more accurate than what would have been achieved with a single decision tree.
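A minimal scikit-learn sketch of this bootstrap-and-vote idea, reusing the train/test split from the earlier loading sketch (the hyperparameter values are illustrative assumptions, not the paper's settings):

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Each tree is grown on a bootstrapped sample and splits on a random subset of
# features; predictions are made by a majority vote over all the trees.
rf = RandomForestClassifier(
    n_estimators=200,      # number of bootstrapped decision trees
    max_features="sqrt",   # random feature subset considered at each split
    criterion="gini",      # Gini index scoring, as mentioned above
    random_state=42)
rf.fit(X_train, y_train)
print(classification_report(y_test, rf.predict(X_test), digits=3))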
While random forests are deterministic, another model called extremely randomized trees
(ERT) can also be used, which introduces more randomness into the generation of trees.
The splits in ERTs are not optimal and can therefore lead to more bias, while the
variance is reduced because of the extra randomness. Although random forests and
extremely randomized trees perform quite similarly, ERTs are usually slightly less
accurate but understandably faster in computation. ERTs should, however, be avoided if
the dataset contains many noisy features, which can reduce their effectiveness even
more [18–20].
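The extremely randomized variant is available in scikit-learn as ExtraTreesClassifier; the comparison below is a sketch under the same assumed split:

from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import f1_score

# Extra trees add randomness by choosing split thresholds at random instead of
# searching for the optimal split, trading a little bias for lower variance.
ert = ExtraTreesClassifier(n_estimators=200, random_state=42)
ert.fit(X_train, y_train)
print("extra trees F1:", f1_score(y_test, ert.predict(X_test)))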
When a new unlabeled sample is provided, it can simply be plotted within the graph and
its position compared with that of the support vector classifier to observe which side
of the separator it falls on, and the sample can therefore be classified accordingly.
Support vector classifiers also come in other versions. For example, while a linear SVC
only attempts to fit a hyperplane within the data to best separate the different
categories, a Nu-SVC uses a parameter to control the number of support
vectors [21, 23].
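The three support vector variants referred to in the results (SVC, Nu-SVC, and linear SVC) can be sketched as follows, again under the assumed split; the parameter values are illustrative only:

from sklearn.svm import SVC, NuSVC, LinearSVC
from sklearn.metrics import f1_score

svm_models = {
    "SVC": SVC(kernel="rbf"),                 # standard soft-margin classifier
    "Nu-SVC": NuSVC(nu=0.5),                  # nu controls the number of support vectors
    "Linear SVC": LinearSVC(max_iter=10000),  # fits a separating hyperplane only
}
for name, model in svm_models.items():
    model.fit(X_train, y_train)
    print(name, "F1:", round(f1_score(y_test, model.predict(X_test)), 3))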
Logistic Regression
Logistic regression is based on the concept of linear regression, where a line is
plotted against a given dataset and its axes. This line is drawn such that the squared
differences between the line and the plotted points are at their minimum. The line and
the calculated R2 are used to determine whether the features are correlated, and the
p-value is calculated to verify that this correlation is statistically significant.
Finally, the line is used to map any sample of data to its corresponding label value.
Logistic regression uses a similar concept but differs in that it can only classify two
labels and no more. Another difference is that it does not use a straight line but
rather an S-shaped curve that goes from 0 to 1; it gives the probability that a given
sample belongs to one of these two labels.
Logistic regression CV applies cross-validation on top of logistic regression to
further improve the quality of our model. When cross-validation is applied, sections of
data from the dataset are resampled in separate sessions to obtain multiple results;
the mean probability can then be calculated to label the data and obtain more accurate
results.
To reach the right equation for the curve, stochastic gradient descent was used, which
uses the gradient of the loss function at each iteration to guide the constants of the
equation toward their proper values, thereby minimizing the loss in the process and
deriving the optimal equation for our dataset.
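A sketch of the three logistic-regression-related models used in the results (plain logistic regression, its cross-validated variant, and a stochastic-gradient-descent-trained linear classifier); the solver and iteration settings are assumptions:

from sklearn.linear_model import LogisticRegression, LogisticRegressionCV, SGDClassifier
from sklearn.metrics import f1_score

lr_models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    # The CV variant picks the regularization strength by cross-validation.
    "logistic regression CV": LogisticRegressionCV(cv=5, max_iter=1000),
    # "log_loss" makes SGD optimize the logistic loss ("log" on older scikit-learn).
    "SGD": SGDClassifier(loss="log_loss", random_state=42),
}
for name, model in lr_models.items():
    model.fit(X_train, y_train)
    print(name, "F1:", round(f1_score(y_test, model.predict(X_test)), 3))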
4 Result
Precision, recall, F1 score, and success rate are widely used to measure the perfor-
mance of supervised machine learning algorithms [24–27]. The classification report of
our model is described below. In all the tables, row 1 indicates a real website and
row 0 a phishing website.
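The per-class precision, recall, and F1 values reported in the tables below can be reproduced with scikit-learn's metrics; a minimal sketch, assuming a fitted classifier clf and the held-out split from the earlier sketches:

from sklearn.metrics import classification_report, precision_recall_fscore_support

y_pred = clf.predict(X_test)

# Per-class precision, recall, and F1; label 1 = real website, 0 = phishing website.
print(classification_report(y_test, y_pred, digits=3))

prec, rec, f1, _ = precision_recall_fscore_support(y_test, y_pred, labels=[1, 0])
print("real website  -> precision %.3f recall %.3f F1 %.3f" % (prec[0], rec[0], f1[0]))
print("phishing site -> precision %.3f recall %.3f F1 %.3f" % (prec[1], rec[1], f1[1]))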
Table 3 presents the classification report of the support vector machine. The precision
and recall for predicting a real website are 0.920 and 0.898, respectively. These
scores were used to calculate the F1 score for predicting a real website, which is
0.909. Similarly, the precision and recall for predicting a phishing website are 0.895
and 0.917, giving an F1 score of 0.906. It should be noted that the precision for
predicting a real website is higher, while the recall for predicting a phishing website
is higher; the F1 scores are similar. The F1 score was compared with those of the other
algorithms to find the optimal one.
Table 4 represents the classification report of the Nu-support vector classifier
(Nu-SVC). The precision and recall for predicting a real website are 0.897 and 0.851,
giving an F1 score of 0.874. Similarly, the precision and recall for predicting a
phishing website are 0.851 and 0.896, giving an F1 score of 0.873. These scores are
significantly lower than those of the support vector machine.
Table 5 presents the classification report of the linear support vector classifier. The
precision and recall for predicting a real website are 0.900 and 0.970, giving an F1
score of 0.933. Similarly, the precision and recall for predicting a phishing website
are 0.965 and 0.885, giving an F1 score of 0.923. The F1 scores here are significantly
higher than those of the support vector classifier.
Table 6 represents the classification report of KNN. The precision and recall for
predicting a real website are 0.854 and 0.905, giving an F1 score of 0.879. Similarly,
the precision and recall for predicting a phishing website are 0.893 and 0.836, giving
an F1 score of 0.864. The F1 scores here are significantly lower than those of the
linear support vector classifier.
Table 4 Nu-SVC classification report
  Precision Recall F1
1 0.897 0.851 0.874
0 0.851 0.896 0.873
Table 7 presents the classification report of logistic regression. The precision and
recall for predicting a real website are 0.897 and 0.898, giving an F1 score of 0.878.
Similarly, the precision and recall for predicting a phishing website are 0.892 and
0.891, giving an F1 score of 0.892. The precision, recall, and F1 scores for both 1 and
0 are remarkably close to each other, indicating that this algorithm works well for
both precision and recall. However, the F1 scores are still lower than those of linear
SVC, so it cannot be considered the best one.
Table 8 presents the classification report of logistic regression CV, where CV stands
for cross-validation. The precision and recall for predicting a real website are 0.937
and 0.948, giving an F1 score of 0.942. Similarly, the precision and recall for
predicting a phishing website are 0.944 and 0.932, giving an F1 score of 0.938. The
precision, recall, and F1 scores for both 1 and 0 are remarkably close to each other,
indicating that this algorithm works well for both precision and recall. The F1 scores
here were better than those of linear SVC, so this is the best score so far.
Table 9 presents the classification report of stochastic gradient descent (SGD). The
precision and recall for predicting a real website are 0.966 and 0.826, giving an F1
score of 0.891. Similarly, the precision and recall for predicting a phishing website
are 0.841 and 0.969, giving an F1 score of 0.900. The F1 scores here were lower than
those of logistic regression CV, so it was also rejected.
Table 10 represents the classification report of the random forest classifier. The
precision and recall for predicting a real website are 0.977 and 0.984, giving an F1
score of 0.980. Similarly, the precision and recall for predicting a phishing website
are 0.983 and 0.975, giving an F1 score of 0.979. Here, the precision, recall, and F1
scores are remarkably higher than those of all the other models; hence, this is
considered the best one yet.
Table 11 presents the classification report of the bagging classifier. The precision
and recall for predicting a real website are 0.972 and 0.977, giving an F1 score of
0.974. Similarly, the precision and recall for predicting a phishing website are 0.975
and 0.971, giving an F1 score of 0.973. Here, the precision, recall, and F1 scores are
remarkably higher than those of all the other models except the random forest
classifier.
Table 12 represents the classification report of the extra trees classifier. The
precision and recall for predicting a real website are 0.984 and 0.979, giving an F1
score of 0.982. Similarly, the precision and recall for predicting a phishing website
are 0.978 and 0.984, giving an F1 score of 0.981. Here, the precision, recall, and F1
scores are the highest, and this is the best score of all.
It can be seen from the above classification reports (Table 13) that all of the
random-forest-based classifiers did remarkably well at detecting phishing websites and
real websites.
5 Conclusion
This study provided a detailed, in-depth explanation of machine learning techniques and
their performance when used against a dataset containing data regarding websites, in
order to detect phishing websites. This paper not only described each technique but
also showed how each of the models performs, using plotted charts to demonstrate and
compare the individual algorithms.
This report aims to provide its readers with a conclusive analysis of these methods and
to verify our observations regarding the random forest classifier's optimal
performance. The graphs and details added to this paper are intended to help others
carry out further experimentation, progressing from where this work concluded.
We intend to carry on the proposed research work with further modifications to the
dataset and to apply other machine learning techniques with modified parameters,
hopefully opening more possibilities for improving the global defense against cyber
attackers.
References
1. Da Silva JAT, Al-Khatib A, Tsigaris P (2020) Spam e-mails in academia: issues and costs.
Scientometrics 122:1171–1188
2. Mironova SM, Simonova SS (2020) Protection of the rights and freedoms of minors in the
digital space. Russ J Criminol 14:234–241
3. Sethuraman SC, Vijayakumar V, Walczak S (2020) Cyber attacks on healthcare devices using
unmanned aerial vehicles. J Med Syst 44:10
4. Tuan TA, Long HV, Son L, Kumar R, Priyadarshini I, Son NTK (2020) Performance evaluation
of Botnet DDoS attack detection using machine learning. Evol Intell 13:283–294
5. Azeez NA, Salaudeen BB, Misra S, Damasevicius R, Maskeliunas R (2020) Identifying
phishing attacks in communication networks using URL consistency features. Int J Electron
Secur Digit Forensics 12:200–213
6. Iwendi C, Jalil Z, Javed AR, Reddy GT, Kaluri R, Srivastava G, Jo O (2020) KeySplitWater-
mark: zero watermarking algorithm for software protection against cyber-attacks. IEEE Access
8:72650–72660
7. Liu XW, Fu JM (2020) SPWalk: similar property oriented feature learning for phishing
detection. IEEE Access 8:87031–87045
8. Parra GD, Rad P, Choo KKR, Beebe N (2020) Detecting internet of things attacks using
distributed deep learning. J Netw Comput Appl 163:13
9. Tan CL, Chiew KL, Yong KSC, Sze SN, Abdullah J, Sebastian Y (2020) A graph-theoretic
approach for the detection of phishing webpages. Comput Secur 95:14
10. Anwar S, Al-Obeidat F, Tubaishat A, Din S, Ahmad A, Khan FA, Jeon G, Loo J (2020)
Countering malicious URLs in internet of things using a knowledge-based approach and a
simulated expert. IEEE Internet Things J 7:4497–4504
11. Ariyadasa S, Fernando S, Fernando S (2020) Detecting phishing attacks using a combined
model of LSTM and CNN. Int J Adv Appl Sci 7:56–67
12. Bozkir AS, Aydos M (2020) LogoSENSE: a companion HOG based logo detection scheme
for phishing web page and E-mail brand recognition. Comput Secur 95:18
13. Gupta BB, Jain AK (2020) Phishing attack detection using a search engine and heuristics-based
technique. J Inf Technol Res 13:94–109
14. Sonowal G, Kuppusamy KS (2020) PhiDMA—a phishing detection model with multi-filter
approach. J King Saud Univ Comput Inf Sci 32:99–112
15. Zamir A, Khan HU, Iqbal T, Yousaf N, Aslam F, Anjum A, Hamdani M (2020) Phishing web
site detection using diverse machine learning algorithms. Electron Libr 38:65–80
16. Rodriguez GE, Torres JG, Flores P, Benavides DE (2020) Cross-site scripting (XSS) attacks
and mitigation: a survey. Comput Netw 166:23
17. Das A, Baki S, El Aassal A, Verma R, Dunbar A (2020) SoK: a comprehensive reexamination
of phishing research from the security perspective. IEEE Commun Surv Tutor 22:671–708
18. Adewole KS, Hang T, Wu WQ, Songs HB, Sangaiah AK (2020) Twitter spam account detection
based on clustering and classification methods. J Supercomput 76:4802–4837
19. Rao RS, Vaishnavi T, Pais AR (2020) CatchPhish: detection of phishing websites by inspecting
URLs. J Ambient Intell Humaniz Comput 11:813–825
20. Shabudin S, Sani NS, Ariffin KAZ, Aliff M (2020) Feature selection for phishing website
classification. Int J Adv Comput Sci Appl 11:587–595
21. Raja SE, Ravi R (2020) A performance analysis of software defined network based prevention
on phishing attack in cyberspace using a deep machine learning with CANTINA approach
(DMLCA). Comput Commun 153:375–381
22. Sarma D (2012) Security of hard disk encryption. Masters Thesis, Royal Institute of Technology,
Stockholm, Sweden. Identifiers: urn:nbn:se:kth:diva-98673 (URN)
23. Alqahtani H et al (2020) Cyber intrusion detection using machine learning classifica-
tion techniques. In: Computing science, communication and security, pp 121–31. Springer,
Singapore
24. Hossain S, et al (2019) A belief rule based expert system to predict student performance under
uncertainty. In: 2019 22nd international conference on computer and information technology
(ICCIT), pp 1–6. IEEE
25. Ahmed F et al (2020) A combined belief rule based expert system to predict coronary artery
disease. In: 2020 international conference on inventive computation technologies (ICICT), pp
252–257. IEEE
26. Hossain S et al (2020) A rule-based expert system to assess coronary artery disease under uncer-
tainty. In: Computing science, communication and security, Singapore, pp 143–159. Springer,
Singapore
27. Hossain S et al (2020) Crime prediction using spatio-temporal data. In: Computing science,
communication and security. Springer, Singapore, pp 277–289
Toxic Comment Classification
Implementing CNN Combining Word
Embedding Technique
Abstract With the advancement of technology, the virtual world and social media
have become an important part of people's everyday lives. Social media allows people
to connect, share their emotions, and discuss various subjects, yet it has also become
a place for cyberbullying, personal attacks, online harassment, verbal abuse, and other
kinds of toxic comments. Top social media platforms still lack fast and accurate
classification to remove this kind of toxic comment automatically. In this paper, an
ensemble methodology of convolutional neural networks (CNN) and natural language
processing (NLP) is proposed, which segments toxic and non-toxic comments in the first
phase and then classifies and labels them into six types based on the dataset of
Wikipedia's talk page edits collected from Kaggle. The proposed architecture is
structured as follows: data preprocessing applying data-cleaning processes, adopting
NLP techniques such as tokenization and stemming, and converting words into vectors by
word embedding techniques. Combining the preprocessed dataset and the best word
embedding method, a CNN model is applied that scores a ROC-AUC of 98.46% and 98.05%
accuracy for toxic comment classification, which is higher than the compared existing
works.
1 Introduction
Social media today, on account of its attainability and accessibility, has become the
arena for discussions and for information and queries about places. It has given
everyone scope to express themselves more than ever and has enhanced communication and
sharing on online platforms. Unfortunately, this platform is also turning into a venue
for hate speech and verbal attacks, even putting at risk of violence people who support
diversity in race, ethnicity, gender, and sexual orientation. Cyberbullying and
harassment have become serious issues that affect a wide range of people, sometimes
inflicting severe psychological problems such as depression or even leading to suicide.
Abusive online content can fall into more than one toxic category, such as hating,
threatening, or insulting based on identity [1]. According to a 2014 poll [2] by the
PEW Research Institute, 73% of people on the Internet have seen someone being harassed
online, 45% of Internet users have themselves been harassed, and 45% were exposed to
substantial harassment. More than 85% of the database is completely non-toxic, and
concentrated toxicity is rarely seen on Wikipedia. Compared with 2010, teenagers were
12% [3] more likely to be subjected to cyberbullying, which clearly indicates the
negative side of social media. Corporations and social media platforms are trying to
track down abusive comments toward users and are also looking for ways to automate this
process. This paper utilizes deep learning to examine whether social media comments are
abusive or not, and to classify them further into categories such as toxic, severely
toxic, obscene, vulgar, profane, and hatred toward different identities. In this paper,
we use two neural-network-based methods, convolutional neural networks (CNN) and
natural language processing (NLP), which are applied in combination with word embedding
without any syntactic or semantic expertise. We have evaluated our models and have used
accuracy tests to see how well the models performed.
To present the research work, the sections of the paper are arranged as follows:
Sect. 2 addresses the relevant work in this area; Sect. 3 outlines the proposed
methodology; Sect. 4 presents the experimental analysis alongside the implementation
process and results; and Sect. 5 concludes the research work and discusses further
scope for development.
2 Related Works
With the massive growth of the Internet and social media, toxic comments, cyberbullying,
and verbal abuse have become major issues of concern, and several studies have adopted
classification techniques to address them. Georgakopoulos et al. [4] used a CNN model
for the toxic comment classification problem using the same dataset that we use in our
methodology. In their solution, they used a balanced subset of the data without
tackling the imbalanced dataset, and performed only binary classification to identify
whether comments are toxic or not, without predicting the toxicity level. To improve on
this, Saeed et al. [5] applied deep neural network architectures with good accuracy.
One of the strengths of that work is that their classification framework does not need
any laborious text preprocessing. They used CNN-1D, CNN-V, CNN-I, BiLSTM, and BiGRU
models, analyzed F1, ROC-AUC, precision, and recall scores, and reported that Bi-GRU
showed the highest F1 score and scored well in precision and recall. In another work,
the authors of [6] demonstrated a capsule-network-based toxic comment classifier
implemented as a single-model capsule network with focal loss. Their model scored
98.46% accuracy with an RNN classifier on the TRAC dataset, a combined Hindi-English
dataset of toxic and non-toxic comments. Furthermore, Kandasamy et al. [7] adopted
natural language processing (NLP) integrated with URL analysis and supervised machine
learning techniques on social media data, scoring 94% accuracy. Anand et al. [8]
presented different deep learning techniques such as convolutional neural networks
(CNN), ANN, and long short-term memory (LSTM) cells, with and without GloVe word
embeddings, where a pre-trained GloVe model is applied for classification. Most
research works address the binary classification of toxic and non-toxic comments, but
labeling the classes of toxic comments after identification is still missing in
previous works. To solve this issue, we propose a methodology that classifies toxic
comments and also predicts the toxicity by class.
3 Methodology
To reduce irregularity in the dataset, data cleaning is needed to achieve better
outcomes and faster processing. We adopt various data-cleaning processes such as stop
word removal [9], removing punctuation, case folding (converting all words to lower
case), and removing duplicate words, URLs, emojis or emoji short codes, numbers,
single-character words, and symbols. Overall, there are a lot of words that are used
only to flesh out a meaningful sentence. Such words, for example I, is, are, a, the,
and of, need to be removed from the list; they carry no value for the list. These
pronouns, conjunctions, and relational words make up the roughly 500-entry stop word
list of the English language. The Natural Language Toolkit (NLTK) [10], a Python
library supporting many different languages, is used in our model for better
classification and accurate results.
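A hedged sketch of the cleaning steps described above using NLTK; the exact regular expressions and the order of the steps are assumptions rather than the authors' implementation:

import re
import string
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
STOP_WORDS = set(stopwords.words("english"))

def clean_comment(text):
    text = text.lower()                                   # case folding
    text = re.sub(r"https?://\S+", " ", text)             # remove URLs
    text = re.sub(r"\d+", " ", text)                      # remove numbers
    text = text.translate(str.maketrans("", "", string.punctuation))  # punctuation/symbols
    tokens = [t for t in text.split() if t not in STOP_WORDS]  # stop word removal
    tokens = [t for t in tokens if len(t) > 1]            # drop one-character words
    seen, kept = set(), []                                # remove duplicate words, keep order
    for t in tokens:
        if t not in seen:
            seen.add(t)
            kept.append(t)
    return " ".join(kept)

print(clean_comment("We HATE this toxic comment!!! http://spam.example 123"))  # -> "hate toxic comment"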
3.1.2 Tokenization
Tokenization is the most common and important part of NLP, where a sentence full of
words is separated or split into individual words [11], each considered a token.
Figure 2 shows how a sentence is broken down into segmented form. Here, a model named
fastText is used to map each word to a vector of numbers. In the first step, chunks of
words are separated from a larger sentence or piece of content, for example
["We hate toxic comment"] becomes ["we", "hate", "toxic", "comment"]; in the second
step, the words are embedded as numbers to represent word vectorization. fastText
mainly compares groups of word vectors in the vector space and finds mathematical
similarity, such as man to boy and woman to girl.
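The token-to-index mapping described here can be sketched as follows; the vocabulary-building code is illustrative, and the fastText vectors themselves are assumed to come from a separately loaded pre-trained model:

# Split sentences into tokens and map each token to an integer index, which the
# embedding layer later translates into a fastText word vector.
sentences = ["We hate toxic comment", "We love kind comment"]

tokenized = [s.lower().split() for s in sentences]
vocab = {}
for tokens in tokenized:
    for tok in tokens:
        vocab.setdefault(tok, len(vocab) + 1)   # index 0 is reserved for padding

encoded = [[vocab[tok] for tok in tokens] for tokens in tokenized]
print(tokenized[0])   # ['we', 'hate', 'toxic', 'comment']
print(encoded[0])     # e.g. [1, 2, 3, 4]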
3.1.3 Stemming
Words take different forms with different senses in the English language, and sometimes
a similar word describes a variety of things across the parts of speech. To find the
root, also called the lemma, stemming is used to prepare such words by removing or
reducing their inflected forms, as in playing, played, and playful. These words carry
suffixes, prefixes, tense, gender, and other grammatical forms. Moreover, when we
compare a group of words and find a root that is not of the same kind, we treat it as a
different category of that word, its lemma. Lemmatization is used in our model for
better output [12].
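A minimal NLTK sketch of stemming and lemmatization as described above; the choice of PorterStemmer and WordNetLemmatizer is an assumption, as the paper does not name a specific stemmer:

import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

words = ["playing", "played", "playful"]
print([stemmer.stem(w) for w in words])                    # e.g. ['play', 'play', 'play']
print([lemmatizer.lemmatize(w, pos="v") for w in words])   # ['play', 'play', 'playful'] – 'playful' is not a verb form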
Convolutional neural networks (CNN) have commonly been applied to image classification
[18] problems because of their innate capacity to exploit two statistical properties,
"local stationarity" and "compositional structure". To apply a CNN to toxic comment
classification [19–22], the sentences must first be encoded before being fed [23] into
the CNN architecture: a vocabulary is applied as an index of words, mapping each word
of the text set to an integer. Afterward, a padding technique fills the document matrix
with zeros so that every document reaches the maximum length, as the CNN architecture
needs constant input dimensionality. The next stage translates the encoded documents
into matrices, in which each row corresponds to a single term. The generated matrices
pass through the embedding layer, where a dense vector transforms each term (row) into
a low-dimensional representation [24]. The operation then follows the standard CNN
procedure. The word embedding technique is chosen for the low-dimensional
representation of each word at this point. The embedding method uses fixed dense word
vectors generated by models such as fastText, Word2Vec, and GloVe, which are mentioned
in the previous section. Our CNN architecture is built with 128 filters of kernel size
five over the word embeddings, along with a 50-unit fully connected (dense) layer.
Figure 3 shows the setup of the CNN for our toxic comment classification model.
The proposed CNN architecture is designed in 10 layers, as shown in Fig. 4. It begins
with an input layer, where we feed in the dataset; then an embedding layer pre-trained
with the chosen word embedding technique; a convolution layer that learns feature maps
capturing relationships with nearby elements; a max pooling layer that reduces
dimensionality by segments and takes the maximum value; two dropout regularization
layers to reduce the problem of overfitting; two dense layers, where the first learns
the weights of the input to identify outputs and the second refines the weights; one
flatten (fully connected) layer; and finally one output layer that generates the
predicted class.
To train our model, we adopt the ADAM optimizer [25] and binary cross-entropy loss,
evaluated with binary accuracy in the first phase, and then proceed with multi-class
classification for toxicity leveling. We use four epochs, splitting the training data
into mini-batches of 64 examples, with 70% of the data used for training and 30% for
testing.
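A hedged Keras sketch of the described architecture and training setup is given below; the vocabulary size, sequence length, dropout rates, and exact layer ordering are assumptions, and the embedding weights are assumed to be filled from pre-trained fastText vectors:

from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, MaxPooling1D, Dropout, Flatten, Dense

VOCAB_SIZE, EMBED_DIM, MAX_LEN, NUM_CLASSES = 20000, 300, 200, 6

model = Sequential([
    Input(shape=(MAX_LEN,)),                                 # padded integer word indices
    Embedding(VOCAB_SIZE, EMBED_DIM, name="embed"),          # embedding layer (fastText vectors loaded below)
    Conv1D(filters=128, kernel_size=5, activation="relu"),   # feature maps over neighboring words
    MaxPooling1D(pool_size=2),                               # reduce dimensionality, keep max values
    Dropout(0.5),                                            # first dropout layer against overfitting
    Flatten(),                                               # flatten (fully connected) layer
    Dense(50, activation="relu"),                            # 50-unit dense layer
    Dropout(0.5),                                            # second dropout layer
    Dense(NUM_CLASSES, activation="sigmoid"),                # output layer: six toxicity labels
])

# In practice the embedding weights would be set from pre-trained fastText vectors, e.g.:
# model.get_layer("embed").set_weights([fasttext_embedding_matrix])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["binary_accuracy"])
# X: padded sequences, y: binary label matrix (both assumed to be prepared beforehand)
# model.fit(X, y, epochs=4, batch_size=64, validation_split=0.3)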
4 Experimental Analysis
In this section, we first describe the dataset obtained from Kaggle and visualize the
categories with their correlations. After that, the performance analysis of the
proposed system on this dataset for toxic comment classification is shown. Finally, a
demonstration of the proposed methodology on random toxic comments is presented,
leveling the toxic categories. For this experimental analysis, we used a computer built
on an AMD Ryzen 5 with 16 GB RAM and a 256 GB SSD, an Nvidia GTX 1665 GPU, and code
written in Python 3.6 in Anaconda using the Spyder IDE.
4.1 Dataset
The dataset used in our research is acquired from Kaggle; it is the very popular,
publicly available dataset named "Wikipedia Talk Page Comments annotated with toxicity
reasons" [26], which contains almost 160,000 manually labeled comments. The dataset
contains six classes (toxic, severe_toxic, obscene, insult, threat, identity_hate),
which are described below in Fig. 5.
The correlation matrix in Fig. 6 shows that "toxic" comments are most strongly
correlated with the "insult" and "obscene" classes, while "toxic" and "threat" have
only a weak correlation. The "obscene" and "insult" comments are also highly correlated
with each other, which makes perfect sense. The matrix also shows that the "threat"
class has the weakest correlation with all the other classes.
After preprocessing and word embedding, we used the CNN model with the fastText
embedding technique for binary classification in the initial stage, after preprocessing
with tokenization and stemming. We utilize three separate convolution structures at the
same time, with dense vector dimension 300 and filter size 128. For the increasing
convolutional layers, the filter width equals the vector dimension and the filter
heights are 3, 4, and 5. A pooling operation is implemented after each convolutional
layer; a fully connected layer is attached to the output of the pooling layers, and the
softmax function forms the final layer. Finally, we train for four epochs: in the first
epoch the model loss was 6.08% and the ROC-AUC was 98.46%, and the loss gradually
decreased from the second epoch, where it was 3.63%, to 2.71% and then 2.6% in the
final epoch, while the validation-score-based AUC reached a maximum of 98.63% for toxic
comment classification.
In this experiment, the AUC reached a maximum of 98.63% for toxic comment
classification. Figure 7 presents the training and testing loss for each epoch, showing
that the training loss decreases from 0.0618 to a constant 0.0368. Furthermore, Table 2
demonstrates toxic leveling on some random vulgar and toxic comments, showing the
predicted toxicity based on the six classes.
Fig. 7 Loss function on each epoch for train and test set
Table 3 Accuracy comparison
References Method Accuracy (%)
[2] CNN and bidirectional GRU 97.93
[27] SVM and TF-IDF 93.00
[28] Logistic regression 89.46
[1] GloVe and CNN and LSTM 96.95
Proposed model CNN with fastText 98.05
The model levels each classified toxic comment into subclasses with a prediction, where
some sentences can belong to multiple classes or score notably high in one specific
class, which makes sense.
Table 3 compares other proposed works with our methodology: the authors of [2] used a
CNN and bidirectional GRU model and achieved 97.93% accuracy, [27] implemented SVM and
TF-IDF and obtained 93.00% test accuracy, [28] scored 89.46% applying logistic
regression, and [1] achieved 96.95% accuracy implementing GloVe word embedding combined
with CNN and LSTM; our proposed methodology with fastText word embedding outperforms
these works, showing 98.05% accuracy and a 98.46% ROC-AUC score. We adopted the
fastText word embedding technique because it showed the highest accuracy for our model;
with GloVe the accuracy was 96.68% and with Word2Vec 93.45%.
5 Conclusion
In this paper, we present a toxic comment classification system, which addresses a
vital issue with the growth of social media, as it is necessary to prevent
cyberbullying and vulgar or toxic comments, and preventing such content is still
challenging. We successfully achieve higher accuracy compared with other existing
works by implementing a CNN with the fastText word embedding technique, after
processing with natural language processing including data cleaning (i.e., stop word
removal), tokenization, and stemming. First, the system classifies comments as toxic or
non-toxic with 98.05% accuracy and a 98.46% ROC-AUC score. Following that, it labels
the toxic comments into five further subclasses. Thus, the proposed work not only
detects toxic comments but also clarifies which subclasses they may belong to, which is
essential for practical implementation.
Though the accuracy of the proposed methodology is high, it can be improved further by
increasing the size of the dataset, where there is some imbalance in class
distributions and quantities, and by training on more cases. Further, we plan to deploy
it in online teaching platform chatboxes and on social media, as these two platforms
generate a major amount of toxic comments.
References
21. Wang S, Huang M, Deng Z (2018, July) Densely connected CNN with multi-scale feature
attention for text classification. IJCAI 4468–4474
22. Carta S, Corriga A, Mulas R, Recupero DR, Saia R (2019, September) A supervised multi-
class multi-label word embeddings approach for toxic comment classification. In: KDIR, pp
105–112
23. Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa P (2011) Natural language
processing (almost) from scratch. J Mach Learn Res 12:2493–2537
24. Gal Y, Ghahramani Z (2016) A theoretically grounded application of dropout in recurrent
neural networks. In: Advances in neural information processing systems, pp 1019–1027
25. Zhang Z (2018, June) Improved adam optimizer for deep neural networks. In: 2018 IEEE/ACM
26th international symposium on quality of service (IWQoS), pp 1–2
26. Toxic Comment Classification Challenge. (n.d.). Retrieved February 9, 2020, from https://
www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data
27. Dias C, Jangid M (2020) Vulgarity classification in comments using SVM and LSTM. In:
Smart systems and IoT: Innovations in computing, pp 543–553. Springer, Singapore
28. Kajla H, Hooda J, Saini G (2020, May) Classification of online toxic comments using machine
learning algorithms. In: 2020 4th international conference on intelligent computing and control
systems (ICICCS), pp 1119–1123
A Comprehensive Investigation About
Video Synopsis Methodology
and Research Challenges
1 Introduction
The exponential increase in technological enrichment demands video surveillance in
almost all areas. Video surveillance plays an important role in security, mostly for
monitoring processes, transport, public safety, the education field, and many more [1].
There are some challenges of video surveillance that need to be addressed. The enormous
amount of data produced is hard to monitor continuously, and processing this data
within a short period of time is a major challenge [2]. As surveillance cameras
continuously track events, there is a huge requirement
of memory for storage. Thus, browsing this data for certain activities can take hours
or days. Video browsing therefore becomes a tedious and time-consuming task, and as a
result most of the video is never watched. A probable solution is a method that
summarizes the video and can convert hours of video into minutes. These methods are
called video condensation, which is further divided into frame-based and object-based
approaches. Video summarization is a frame-based method, defined as the process of
creating and representing an abstract view of the whole video within the shortest time
period. Video summarization can be categorized into two types: static video summaries
(storyboards) and dynamic video summaries (video trailers). Static video summarization
selects the key frames of a video sequence and is mostly used for indexing, browsing,
and retrieval [3].
A dynamic video summary consists of video skimming and video fast forward. The video
skimming method selects the smallest dynamic portions, called video skims, of audio and
video to generate the video summary [4]. The movie trailer is one of the most popular
video skims in practice [5]. Fast-forwarding methods process the frames depending on
manual control or an automatic speed setting [6]. Video summarization methods condense
the video data in the temporal domain only; spatial redundancy is not considered for
condensation, which reduces the compression efficiency.
Figure 1 illustrates the difference between video summarization and video synopsis.
Video summarization extracts key frames based on features such as texture, shape, and
motion, where compression is achieved in the temporal domain only. Video synopsis, in
contrast, is an object-based compression approach that extracts the activities from the
original video, represented in the form of tubes. The proper rearrangement of tubes in
the same chronological order gives compression in the temporal as well as the spatial
domain, producing a more compact result.
The paper is organized as follows: Section II covers the various video synopsis
approaches. The complete synopsis process flow and methodology are explained in
Section III. The evaluation parameters and datasets are reviewed in Section IV. Section
V covers the research challenges in the field of video synopsis. Finally, Section VI
covers the conclusion and a discussion of future research.
Fig. 1 a Video summarization extracting only the key frame. b Video synopsis displaying
multiple objects from different time intervals [7]
Table 1 (continued)
Studies | Optimization type (Online/Offline) | Optimization method | Activity clustering | Camera topology (single/multicamera) | Input domain (pixel or compressed)
Zheng et al. [20] | Online | Simulated annealing method | No | Single | Compressed
Video synopsis is used to condense the size of the original video, which makes data
retrieval easy. Figure 2 describes the steps involved in the video synopsis process.
The initial step is to detect and track the moving objects. This is a preprocessing
step and is very important for further processing. The next step is activity
clustering, in which trajectories of the same object are clustered. The following step
is the main part of the synopsis algorithm, called optimization. It involves the
optimal rearrangement of tubes to avoid collisions and obtain the compressed video. The
tube rearrangement can be driven by a user's query, which helps target the synopsis
video to the given query. After the optimal tube rearrangement, the background is
generated from the surveillance video, and the rearranged tubes are stitched onto the
background to obtain a compact view of the original video.
Object detection is the preliminary phase of any video synopsis process. Object
detection is followed by segmentation and tracking of the trajectories of the same
object, called an activity and represented as a tube. There are many challenges in
tracking objects in real time; occlusions and illumination variations are the primary
ones. Object tracking for synopsis involves appropriate segmentation of the object in
each frame. The activities present in the original video should be detected and tracked
correctly; otherwise, a blinking effect is produced, that is, the sudden appearance and
disappearance of objects. Some of the approaches, with the respective studies, are
listed in Table 2.
The evaluation parameters of video synopsis are directly affected by the quality of
object detection and tracking. To address this challenge, Pappalardo et al. [23]
introduced a toolbox to generate an annotated dataset for testing.
3.3 Optimization
Optimization is the main process of video synopsis. It is the process of optimally
rearranging the tubes to obtain a collision-free, chronologically arranged, compacted
video. The rearrangement of foreground objects is expressed as the minimization of an
energy composed of the activity cost, the collision cost, and the temporal consistency
cost of the object trajectories. The activity cost ensures that the maximum number of
object trajectories appears in the video synopsis. The temporal consistency cost is
used to preserve the temporal order of the activities; breaking the temporal order is
therefore penalized. The collision cost helps avoid spatial collisions between
activities, providing better visual quality. Some of the optimization approaches are
listed in Table 4.
The optimization methodology helps rearrange the tubes optimally; some approaches focus
on collision avoidance, while others try to improve the compression ratio. Improvement
in all parameters cannot be achieved at the same time.
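The energy being minimized can be illustrated with a small sketch; the cost weights, the tube representation (a start frame plus per-frame bounding boxes), and the specific cost formulas below are illustrative assumptions, not a particular published formulation:

def box_overlap(a, b):
    # Pixel overlap between two (x1, y1, x2, y2) bounding boxes.
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def synopsis_energy(tubes, shifts, w_act=1.0, w_col=1.0, w_temp=1.0):
    # tubes:  list of dicts {"start": original start frame, "boxes": [box per frame]}
    # shifts: proposed start frame of each tube in the synopsis (None = tube dropped)
    # Activity cost: penalize activity that is left out of the synopsis.
    activity = sum(len(t["boxes"]) for t, s in zip(tubes, shifts) if s is None)

    # Collision cost: pixel overlap of tubes that share the same synopsis time.
    kept = [(t, s) for t, s in zip(tubes, shifts) if s is not None]
    collision = 0
    for i in range(len(kept)):
        for j in range(i + 1, len(kept)):
            ti, si = kept[i]
            tj, sj = kept[j]
            for fi, bi in enumerate(ti["boxes"]):
                fj = si + fi - sj          # frame of tube j shown at the same synopsis time
                if 0 <= fj < len(tj["boxes"]):
                    collision += box_overlap(bi, tj["boxes"][fj])

    # Temporal consistency cost: penalize pairs whose chronological order is reversed.
    temporal = 0
    for ti, si in zip(tubes, shifts):
        for tj, sj in zip(tubes, shifts):
            if si is None or sj is None:
                continue
            if ti["start"] < tj["start"] and si > sj:
                temporal += 1

    return w_act * activity + w_col * collision + w_temp * temporal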
3.5 Stitching
This is the final step of the video synopsis flow, where the tubes are stitched onto
the generated time-lapse background. Stitching does not affect the efficiency of the
video synopsis, but it improves the visual quality. Many approaches employ Poisson
image editing to stitch a tube into the background by adjusting the gradients.
Several parameters are used to assess the quality of a video synopsis. Some of them are
listed below, and a small computational sketch of the first four ratios follows the list.
1. Frame condensation ratio (FR) is defined as the ratio between the number of frames
in the original video and in the synopsis video [10].
2. Frame compact ratio (CR) is defined as the ratio between the number of object pixels
in the original video and the total pixels in the synopsis video. It provides
information about the spatial compression and measures how much of the spatial space of
the synopsis video the objects occupy [10].
3. Non-overlapping ratio (NOR) is defined as the ratio between the number of pixels
that the objects occupy and the sum of each object's mask pixels in the synopsis video.
It provides information about the amount of collision between tubes in a synopsis
video [10].
4. Chronological disorder (CD) is defined as the ratio between the number of tubes in
reverse order and the total number of tubes. It measures the chronological ordering of
the tubes [10].
5. Runtime (s): the time required to generate the video synopsis.
6. Memory requirement: memory utilization is measured using peak memory usage and
average memory usage.
7. Visual quality: the visual appearance of the synopsis video, which should include
all the activities that occurred in the original video.
8. Objective evaluation: some approaches [31] also conduct a survey to validate the
synopsis result objectively by comparing the visual appearance. The original video, the
proposed synopsis video, and synopsis videos from existing methods are shown to a fixed
set of participants, and questions about appearance and compactness are asked. Based on
the answers, the efficiency of the proposed synopsis is calculated.
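Following the definitions above (after [10]), the four ratios can be computed as in the sketch below; the function and variable names are ours:

def frame_condensation_ratio(orig_frames, synopsis_frames):
    # FR: how many times shorter the synopsis is than the original video
    return orig_frames / synopsis_frames

def frame_compact_ratio(object_pixels_original, total_pixels_synopsis):
    # CR: share of the synopsis' spatial space occupied by the original objects
    return object_pixels_original / total_pixels_synopsis

def non_overlapping_ratio(occupied_pixels, sum_of_mask_pixels):
    # NOR: values close to 1 mean little collision between tubes in the synopsis
    return occupied_pixels / sum_of_mask_pixels

def chronological_disorder(tubes_in_reverse_order, total_tubes):
    # CD: fraction of tubes whose chronological order was broken
    return tubes_in_reverse_order / total_tubes

print(frame_condensation_ratio(90000, 3000))  # e.g. a 25 fps hour condensed to 2 minutes -> 30.0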
4.1 Dataset
The datasets are needed to validate the performance of different methodology. The
presence of proper datasets helps to check the quality of results by the proposed
920 S. Jagtap and N. B. Chopade
methodology. The performance of video synopsis can be evaluated using the publicly
available dataset and outdoor videos. Table 5 lists the available dataset.
In some of the studies, the datasets are created by the researchers themselves to check the
evaluation parameters. However, these datasets cannot be used to compare results.
The assessment of the evaluation parameters is a difficult task as a standard dataset
is not available. In some studies, the evaluation parameters are taken as the reverse
ratio. Therefore, comparing the different results becomes problematic.
Video synopsis technology has overcome many challenges in the area of video investigation
and summarization, but there are still many open issues within its scope of application.
Some of the challenges in the field of video synopsis are given below.
1. Object Interactions
The interaction between object trajectories should be preserved while converting the
original video into compacted form; for example, two people may be walking side by
side in a video. The tubes are tracked separately in the optimization phase, and,
for collision avoidance, these tubes may be rearranged in such a way that they never meet in
the final synopsis [32]. The rearrangement of the tubes should therefore be implemented with
a proper optimization algorithm so that the original interaction can be preserved.
2. Dense Videos
Another challenge is crowded public places, where the source video is highly
dense, with objects occupying every location repeatedly. In this situation, the required
object can be kept alone in the synopsis video, but this may affect the chronological order
of the video and may confuse a user browsing for a particular
event or object in the resultant video. It also reduces the visual quality of the video.
The selection of proper segmentation and tracking algorithms will help to overcome
this challenge.
3. Camera Topology
The synopsis video quality may be affected by the camera topology used. Object
segmentation and tracking are important phases of video synopsis, in which the
source videos can be captured using a still camera or a moving camera. Synopsis
generation is more difficult with a moving camera, as the viewing direction is constantly
varying. Background generation and the stitching step are also difficult for a moving
camera, as there are continuous changes in background appearance. The multi-camera
setting poses another challenge in the generation of video synopsis: object
tracking and segmentation become harder as the number of inputs grows, and the
changing background shift is tough to predict.
4. Processing Speed
Faster, real-time speed can be achieved by a system using a multi-core CPU implementation.
A GPU further reduces the processing time and enhances the processing speed,
giving a reduced runtime.
5. It is an optional step in the video synopsis process flow, but it is an added advantage for
quick data retrieval and browsing. It increases the computational complexity but
can be used for many applications depending upon the user's query. Depending
upon the user query, clusters of similar tubes can be generated, and the
synopsis video is produced based on the clustering.
6 Conclusion
Video synopsis has gained more demand with the increase in CCTV deployments and technological
enrichment in the video analysis field. It is an emerging technology used to
represent the source video in compacted form based on activities, and it can be
used in many applications. There are several approaches to video synopsis, of which
the online approach is used for real-time video streaming. Multi-camera and compressed-domain
approaches need to be explored to enhance the efficiency of the related parameters.
The video synopsis process flow starts with object detection and proceeds through trajectory tracking,
activity clustering, tube rearrangement, background generation, and stitching. The
accuracy of tracking and segmentation of object trajectories can affect the quality of the
synopsis video. The compression ratio can be improved by optimum arrangement of
tubes. Proper chronological order and less collision between the tubes help to
enhance the visual quality.
References
20. Wang S, Wang Z-Y, Hu R-M (2013) Surveillance video synopsis in the compressed domain
for fast video browsing. J Vis Commun Image Represent 24:1431–1442
21. Li X, Wang Z, Lu X (2016) Surveillance video synopsis via scaling down objects. IEEE Trans
Image Process 25(2):740–755
22. Huang C-R et al (2014) Maximum a Posteriori probability estimation for online surveillance
video synopsis. IEEE Trans Circ Syst Video Technol 24:1417–1429
23. Pappalardo G et al (2019) A new framework for studying tubes rearrangement strategies in
surveillance video synopsis. In: 2019 IEEE international conference on image processing (ICIP)
24. Redmon J, Farhadi A (2017) YOLO9000: Better, faster, stronger. IEEE Conf Comput Vis
Pattern Recogn (CVPR) 2017:6517–6525
25. Chien-Li C et al (2015) Coherent event-based surveillance video synopsis using trajectory
clustering. In: 2015 IEEE international conference on multimedia & expo workshops
(ICMEW)
26. Lin W et al (2015) Summarizing surveillance videos with local-patch-learning-based abnor-
mality detection, blob sequence optimization, and type-based synopsis. Neurocomputing
155:84–98
27. Rav-Acha A, Pritch Y, Peleg S (2006) Making a long video short: dynamic video synopsis.
In: 2006 IEEE computer society conference on computer vision and pattern recognition
(CVPR’06), vol 1, pp 435–441
28. He Y et al (2016) Graph coloring based surveillance video synopsis. Neurocomputing 225
29. Nie Y et al (2019) Collision-free video synopsis incorporating object speed and size changes.
IEEE Trans Image Process: Publ IEEE Signal Process Soc
30. Li K et al (2016) An effective video synopsis approach with seam carving. IEEE Signal Process
Lett 23(1):11–14
31. Fu W et al (2014) Online video synopsis of structured motion. Neurocomputing 135:155–162
32. Namitha K, Narayanan A (2018) Video synopsis: state-of-the-art and research challenges
Effective Multimodal Opinion Mining
Framework Using Ensemble Learning
Technique for Disease Risk Prediction
V. J. Aiswaryadevi (B)
Dr NGP Institute of Technology, Coimbatore, India
e-mail: aiswarya.devi@live.com
S. Kiruthika · G. S. Priyanka · M. S. Sruthi
Sri Krishna College of Technology, Coimbatore, India
e-mail: kiruthika.s@skct.edu.in
G. S. Priyanka
e-mail: priyanka.g@skct.edu.in
M. S. Sruthi
e-mail: sruthi.ms@skct.edu.in
N. Nataraj
Bannari Amman Institute of Technology, Sathyamangalam, India
e-mail: nataraj@bitsathy.ac.in
1 Introduction
Many researchers are working on the construction of multimodal opinion mining
frameworks. In recent days, clinical decision support systems are increasingly automated,
operating without human intervention by using machine learning and deep learning
algorithms. Deep learning networks are also used along
with ensemble-based extreme learning machines, because overfitting
at several depths leads to a sparse density of traversal towards the goal state.
The traditional random forest is simple and achieves quite effective accuracy in object
recognition and goal-state prediction. Here, the traditional random forest is used
with its input data sampled under goal-based constraints. The seeded sample is fed
into the random forest module for construction of the opinion mining framework.
An SVM machine learning algorithm is also trained on the same set of samples used by
the random forest classifier for random sampling of the data. The prediction results
and parameters are discussed based on the observations noted.
Section 2 briefly describes the data set under analysis, and Section 3 describes the goal-based
ensemble learning algorithm with the set of goal constraints and its predicates.
Section 4 discusses the ensemble-based learning algorithm and its effectiveness.
Section 5 depicts the results derived by using the ensemble-based opinion mining
framework.
2 Related Works
generated using a Doc2Vec data set of 100 Telugu songs (audio + lyrics). From the
experimental results, the recognition rate is observed to be between 85 and 91.2%.
The percentage of lyric sentiment analysis can be improved by using the rule-based and
linguistic approach shown in [6]. The USC IEMOCAP database [6] was collected
to study multimodal expressive dyadic interactions and was used in 2017 by [7]. Another experimental
study showed that using CNN-SVM produced 79.14% accuracy, whereas an
accuracy of only 75.50% was achieved using CNN alone. On the multimodal sentiment analysis
data set and the multimodal emotion recognition data set, the visual module of CRMKL
[8] obtained 27% higher accuracy than the state of the art. When all modalities were
used, 96.55% accuracy was obtained, outperforming the state of the art by more than
20%. The visual classifier trained on the MOUD data set, which obtained 93.60% accuracy
[9], achieved 85.30% accuracy on the ICT-MMMO data set [10] using the trained visual
sentiment model from the MOUD data set. Many historical works failed to reduce the
overfitting caused by deep neurons and decision levels.
Chord bigrams of piano-driven and guitar-driven musical strings are extracted and
transformed into linguistic form in the visualizing musical data set. Bigram
features such as B-flat, E-flat and A-flat chords are frequently occurring musical
strings in piano and guitar music. The YouTube trending data set contains the number
of comments, likes and shares expressed for each video in the data set. Using
prefiltering algorithms with goal-specific rules, only the needed information is extracted
from the videos and musical strings of the data set. Histogram segmentation is used
for video sequences, and the Mel-frequency cepstrum is used for musical sequences
for sparse filtering.
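The two feature-extraction steps mentioned here can be sketched with widely used libraries; librosa and OpenCV are one possible toolset, the file names are placeholders, and the histogram-distance threshold is an assumption rather than a value from the paper.

```python
import cv2
import librosa

# Audio: Mel-frequency cepstral coefficients as a compact representation of a musical sequence.
signal, sr = librosa.load("song.wav", sr=None)
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)      # shape: (13, n_frames)

# Video: per-frame colour histograms; a large distance between consecutive
# histograms suggests a segment boundary (simple histogram segmentation).
cap = cv2.VideoCapture("video.mp4")
prev_hist, boundaries, idx = None, [], 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
    hist = cv2.normalize(hist, hist).flatten()
    if prev_hist is not None and cv2.compareHist(prev_hist, hist, cv2.HISTCMP_BHATTACHARYYA) > 0.3:
        boundaries.append(idx)
    prev_hist, idx = hist, idx + 1
cap.release()
```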
The detailed multimodal opinion mining framework is expressed in terms of five
basic steps, namely collection of raw data, pre-processing and filtering, classification
of the filtered data, sentiment polarity extraction, and analysis of the performance parameters.
Only goal-based features are extracted for analysis. The following flowchart
shows the flow of the opinion mining framework for multimodal sentiment analysis
(Fig. 1).
Fig. 1 Flow of the multimodal opinion mining framework: pre-processing and filtering, goal-based machine learning, sentiment polarity extraction, and analysis of the performance parameters
Goal-based data mining algorithms [11] are used for forming the decision trees
[12]. Bootstrapped decision trees are constructed using 150 samples under random
sampling and 10 features chosen by feature sampling, and they are bagged using the majority
of the polarity expressed by the bootstrapped data. A simple model for end-stage liver
disease risk prediction [13] is implemented using the ensemble-based random forest
algorithm with 500 bootstrapped decision trees and achieves an accuracy of 97.22%
with Gaussian filter normalization [14], using the random sampling rate of the Gaussian
Naïve Bayes classifier [15] specified below in Eq. 1. An MDR data set is developed,
similar to the MELD data set [16], using the normalized short feature vectors from YouTube
trending videos [17] and the visualizing musical data set [18].
P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}}\exp\!\left(-\frac{(x_i-\mu_y)^2}{2\sigma_y^2}\right) \qquad (1)
Random sampling is done at the seed samples of 2019 among 11,110 data entries
using Gaussian normalization distribution (Fig. 2).
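A minimal sketch of the sampling-plus-ensemble idea follows, using scikit-learn with synthetic stand-in data for the MDR records; the seed (2019), the number of records (11,110), and the 500-tree forest mirror values mentioned in the text, while the feature count, labels, and train/test split are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2019)                 # seed reported for random sampling
X = rng.normal(size=(11110, 10))                  # stand-in for the 11,110 MDR records
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)     # synthetic sentiment polarity labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=2019)

# Bootstrapped random forest (bagging over decision trees).
rf = RandomForestClassifier(n_estimators=500, bootstrap=True, random_state=2019)
rf.fit(X_tr, y_tr)

# Gaussian Naive Bayes applies the class-conditional likelihood of Eq. (1) per feature.
gnb = GaussianNB().fit(X_tr, y_tr)

print("RF accuracy :", rf.score(X_te, y_te))
print("GNB accuracy:", gnb.score(X_te, y_te))
```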
SVMs are used for setting the upper-bound (UB) and lower-bound (LB) sample rates by soft-margin random sampling
over the entire data set handpicked from the multimodal MDR [19] data set. Utterances
are neglected for sampling; only the sentiment score and polarity expressed
in the data are taken into account. A hypervector parameter [20] is used for effective
classification of random samples with a seeding rate of 200 per vector (Figs. 3 and
4).
Fig. 4 SVM feature sampling and scaling vectors expressed by the data set transformation
Feature sampling is done at 80 samples per vector, and the sampled features are
bootstrapped using the decision tree algorithms. The accuracy rate of the
random forest generated using the randomly sampled and feature-sampled chords is
discussed in the results below.
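The soft-margin SVM step can be sketched as follows; the synthetic data, the RBF kernel, and the value of the margin parameter C are assumptions chosen only to illustrate how the slack bounds are controlled.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.datasets import make_classification

# Hypothetical stand-in for the randomly sampled, feature-sampled records (660 x 80).
X, y = make_classification(n_samples=660, n_features=80, n_informative=20, random_state=200)

# Soft-margin SVM: smaller C widens the margin (more slack), larger C fits the samples tighter.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(X, y)
print("training accuracy:", svm.score(X, y))
```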
A SoftMax classifier is used with performance measure indices reflected through the
confusion matrix, describing the true positive rate, false positive rate, true negative
rate and false negative rate. A confusion matrix for the actual and predicted classes is
formed comprising TP, FP, TN and FN to evaluate the parameters. The significance
of the terms is as follows: TP = True Positive (correctly identified), TN = True
Negative (correctly rejected), FP = False Positive (incorrectly identified), FN =
False Negative (incorrectly rejected). The performance of the proposed system is
measured by the following formulas:
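The formulas themselves are not reproduced in the extracted text; the standard definitions in terms of TP, TN, FP and FN, which are assumed here, are:

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2\cdot\text{Precision}\cdot\text{Recall}}{\text{Precision} + \text{Recall}}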
The results and performance metric indices derived are expressed as follows: the
number of samples present before random sampling and feature sampling is 11,110
records, whereas after random sampling with a seed of 200, sample
frames are created with 660 records. The accuracy rate obtained for the true positive rate
is demonstrated below with a dot plot and confusion matrix.
Number of trees: 5700
No. of variables tried at each split: 500
OOB error estimate: 4.76%

Confusion matrix
                     TP     FN     Error (class)
Positive polarity    250     5     0.01960784
Negative polarity     14   130     0.09722222
In Fig. 5, the disease risk prediction rate of the random forest is well expressed with
the sample rate at each bagging node. The accuracy level increases, and the Gini index
increases towards maximum accuracy of the classification.
References
1. Poria S, Cambria E, Gelbukh A (2015) Deep convolutional neural network textual features and
multiple kernel learning for utterance-level multimodal sentiment analysis. In: Proceedings of
the 2015 conference on empirical methods in natural language processing, pp 2539–2544
2. Chaturvedi I, Ragusa E, Gastaldo P, Zunino R, Cambria E (2018) Bayesian network based
extreme learning machine for subjectivity detection. J Franklin Inst 355(4):1780–1797
3. Tran HN, Cambria E (2018) Ensemble application of ELM and GPU for real-time multimodal
sentiment analysis. Memetic Computing 10(1):3–13
4. Poria S, Majumder N, Hazarika D, Cambria E, Gelbukh A, Hussain A (2018) Multimodal
sentiment analysis: addressing key issues and setting up the baselines. IEEE Intell Syst
33(6):17–25
5. Hu P, Zhen L, Peng D, Liu P (2019) Scalable deep multimodal learning for cross-modal
retrieval. In: Proceedings of the 42nd international ACM SIGIR conference on research and
development in information retrieval (SIGIR’19). Association for Computing Machinery, New
York, NY, USA, pp 635–644. https://doi.org/10.1145/3331184.3331213
6. Abburi H, Akkireddy ESA, Gangashetti S, Mamidi R (2016) Multimodal sentiment analysis
of Telugu songs. In: SAAIP@ IJCAI, pp 48–52
7. Poria S, Peng H, Hussain A, Howard N, Cambria E (2017) Ensemble application of convo-
lutional neural networks and multiple kernel learning for multimodal sentiment analysis.
Neurocomputing 261:217–230
8. Busso C, Deng Z, Yildirim S, Bulut M, Lee CM, Kazemzadeh A, Lee S, Neumann U, Narayanan
S (2004) Analysis of emotion recognition using facial expressions, speech and multimodal
information. In: Proceedings of the 6th international conference on multimodal interfaces.
ACM, pp 205–211
9. Poria S, Chaturvedi I, Cambria E, Hussain A (2016) Convolutional MKL based multimodal
emotion recognition and sentiment analysis. In: 2016 IEEE 16th international conference on
data mining (ICDM). IEEE, pp 439–448
10. Calhoun VD, Sui J (2016) Multimodal fusion of brain imaging data: a key to finding the missing
link(s) in complex mental illness. Biological psychiatry. Cogn Neurosci Neuroimaging
1(3):230–244. https://doi.org/10.1016/j.bpsc.2015.12.005
11. Lin WH, Hauptmann A (2002) News video classification using SVM-based multimodal clas-
sifiers and combination strategies. In: Proceedings of the tenth ACM international conference
on multimedia. ACM, pp 323–326
12. Falvo A, Comminiello D, Scardapane S, Scarpiniti M, Uncini A (2020) A multimodal
deep network for the reconstruction of T2W MR Images. In: Smart innovation, systems
and technologies. Springer, Singapore, pp 423–431. https://doi.org/10.1007/978-981-15-5093-
5_38
13. Kim Y, Jiang X, Giancardo L et al (2020) Multimodal phenotyping of alzheimer’s disease with
longitudinal magnetic resonance imaging and cognitive function data. Sci Rep 10:5527. https://
doi.org/10.1038/s41598-020-62263-w
14. Rozgić V, Ananthakrishnan S, Saleem S, Kumar R, Prasad R (2012) Ensemble of SVM trees
for multimodal emotion recognition. In: Proceedings of the 2012 Asia Pacific signal and
information processing association annual summit and conference. IEEE, pp 1–4
15. Xu X, He L, Lu H, Gao L, Ji Y (2019) Deep adversarial metric learning for cross-modal
retrieval. World Wide Web 22(2):657–672. https://doi.org/10.1007/s11280-018-0541-x
16. Kahou SE, Bouthillier X, Lamblin P, Gulcehre C, Michalski V, Konda K, Jean S, Froumenty P,
Dauphin Y, Boulanger-Lewandowski N, Ferrari RC (2016) Emonets: multimodal deep learning
approaches for emotion recognition in video. J Multimodal User Interfaces 10(2):99–111
17. Jin K, Wang Y, Wu C (2021) Multimodal affective computing based on weighted linear fusion.
In: Arai K, Kapoor S, Bhatia R (eds) Intelligent systems and applications. IntelliSys 2020.
Advances in intelligent systems and computing, vol 1252. Springer, Cham. https://doi.org/10.
1007/978-3-030-55190-2_1
18. Ranganathan H, Chakraborty S, Panchanathan S (2016) Multimodal emotion recognition using
deep learning architectures. In: 2016 IEEE winter conference on applications of computer vision
(WACV). IEEE, pp 1–9
19. Majumder N, Hazarika D, Gelbukh A, Cambria E, Poria S (2018) Multimodal sentiment
analysis using hierarchical fusion with context modeling. Knowl-Based Syst 161:124–133
20. Soleymani M, Garcia D, Jou B, Schuller B, Chang SF, Pantic M (2017) A survey of multimodal
sentiment analysis. Image Vis Comput 65:3–14
Vertical Fragmentation
of High-Dimensional Data Using Feature
Selection
1 Introduction
Owing to the needs of today's business world, many organizations run in a distributed
manner and hence store data in distributed databases. Banking systems, consumer
supermarkets, manufacturing companies, etc. are some examples. These organizations
have branches working in different locations and therefore store their data in
a distributed manner. Fragmentation is a design technique in distributed databases
in which, instead of storing a relation entirely in one location, it is fragmented into
different units and stored at different locations. Fragmentation provides data to the
user from the nearest location as per the user's requirement. Fragmentation increases
efficiency by reducing the size of the table and, hence, the search time, and it also
provides security and privacy to the data. The fragmentation process has three categories:
horizontal, vertical, and hybrid fragmentation. A diagrammatic representation
of these is shown in Fig. 1. Fragmentation is the partitioning of a relation F into
fragments F1, F2, ..., Fi containing enough information to reconstruct the original
relation F.
In horizontal fragmentation, data is fragmented tuple-wise based on minterm predicates.
This helps all related data to fall into a particular fragment. In this case, most of the
time user queries need to search only a minimum number of fragments [1]. In vertical fragmentation,
data is fragmented attribute-wise. This is with the assumption that users
access certain related attributes together, and hence, if they are kept in one fragment,
then user queries can be executed faster. Our system considers vertical fragmentation.
The benefit of vertical fragmentation is that only a few related attributes are
stored at each site compared to the original attribute set. Also, attributes are stored
according to the access frequency of attributes at different sites. All these factors
reduce the query processing time in distributed databases.
In the case of hybrid fragmentation, data is fragmented vertically as well as horizontally.
This method creates fragments with minimal information, attribute-wise as
well as tuple-wise [2].
Today's world mainly deals with a large volume of data called big data. Big data
is a collection of data which is large in size and increasing day by day. It has
high dimensionality and needs a large amount of space for its storage.
When it comes to storing big data in a distributed manner, even after fragmentation
each fragment will be large. As the size of fragments increases, the time for query
execution also increases [3]. So, if the fragment size can be reduced as much as
possible, that will speed up the query execution process.
When high-dimensional data is considered, it can be seen that not all dimensions
may be important, or they may be interrelated, so that redundancy occurs in the set
of attributes. Removing these irrelevant or repeated dimensions will reduce the attribute
size of the dataset and hence that of the fragments produced in vertical fragmentation. This
paper proposes a novel approach for vertical fragmentation of high-dimensional data
using feature selection.
Dimensionality reduction techniques are divided into two categories: feature
selection and feature extraction. Feature selection is used here to reduce the attribute
size before vertical fragmentation. Feature selection is the technique which allows
us to select the most relevant features. It is done according to the relative importance
of each feature to the prediction. It eventually increases the accuracy of the model
by removing irrelevant features. Even though there exist different types of feature
selection methods, the random forest algorithm (supervised) is focused on because its
efficiency is better compared to other feature selection methods [4].
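A minimal sketch of random-forest-based feature selection as described above, using scikit-learn; the synthetic relation, the importance threshold, and the number of trees are assumptions, not values from this paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Hypothetical high-dimensional relation: 1000 tuples, 50 attributes, few of them informative.
X, y = make_classification(n_samples=1000, n_features=50, n_informative=8, random_state=0)

# Rank attributes by random-forest importance and keep only the relevant ones.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
selector = SelectFromModel(rf, threshold="median", prefit=True)
X_reduced = selector.transform(X)

print("attributes before selection:", X.shape[1])
print("attributes after selection :", X_reduced.shape[1])
```

The reduced attribute set would then be passed on to the vertical fragmentation step described in Sect. 3.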
The rest of the paper is organized as follows. Section 2 discusses the major work
already done in vertical fragmentation as well as in feature selection. Our proposed
method of vertical fragmentation based on feature selection is explained in Sect. 3.
Experimentation conducted on various datasets and the analysis of the results are given in
Sect. 4, and the paper concludes in Sect. 5.
2 Literature Review
are implemented successfully for a stand-alone system. However, only update queries
are handled in the paper; for delete and alter, extensions are needed.
A similar work has been done by Rahimi et al. [7]. Here, fragmentation is
performed in a hierarchical manner using the bond energy algorithm with
a modified affinity measure; then the cost of allocating fragments to
each site is calculated, and fragments are allocated to the correct site. The hierarchical method results
in more related attributes, which enables better fragments. However, the cost function
considered for fragment allocation is not an optimized one.
Dorel Savulea and Nicolae Constantinescu in their paper [8] use a combination
of a conventional database containing facts and a knowledge base containing rules
for vertical fragmentation. The paper also presents the implementation of
different algorithms related to fragmentation and allocation methods, namely RCA
rules for clustering, OVF for computing overlapping vertical fragmentation, and
CCA for allocating rules and corresponding fragments [9]. Here, attribute clustering
in vertical fragmentation is not determined by the attribute affinity matrix, as usual, but
is done using rule-to-attribute dependency matrices. The algorithm is efficient,
but only a small number of operations can be performed [10].
A case study of vertical fragmentation is done by Iacob (Ciobanu) Nicoleta-Magdalena
[11]. The paper explains briefly the distributed database, the
importance of fragmentation, and its strategies. A comparison between different types
of fragmentation is also done there. The case study is done by implementing an e-learning
platform for the academic environment using vertical fragmentation [12].
The paper explains how vertical fragmentation increases concurrency and thereby
increases throughput for query processing.
Feature selection helps to reduce overfitting and reduces the data size by removing
irrelevant features. There are mainly three types of feature selection: the wrapper
method, the filter method, and the embedded method [13]. In the wrapper method, subsets
of features are generated; then features are deleted from or added to the subset. In the
filter method, feature selection is done based on the scores of statistical tests. The
embedded method combines features of both the wrapper method and the filter
method. The random forest classifier comes under the wrapper method [4].
Jehad Ali, Rehanullah Khan, Nasir Ahmad and Imran Maqsood, in their paper on random
forest and decision tree, made a comparative study of the classification results of
random forest and decision tree using 20 datasets available in the UCI repository. They
compared the correctly classified instances of both the decision
tree and the random forest for varying numbers of instances and attributes [14].
From the comparison, the paper concluded that the percentage of correctly classified
instances is higher in random forest, and incorrectly classified instances are fewer than
in a decision tree. The comparison is also done on recall, precision, and F-measure.
In the comparison, random forest shows increased classification performance,
and the results are also accurate [15].
The study on random forest is done by Leo Breiman in his paper named
random forests. The paper gives high theoretical knowledge of random forest and
includes the history of random forest. The complete steps of random forests are
explained by computation. The random forest for regression is formed in addition to
classification [16]. It is concluded that random features and random inputs produce
better results in classification than in regression. However, only two types of randomness are
used here, namely bagging and random features; other forms of injected randomness may give
better results.
The application of the random forest algorithm in computer fault diagnosis is
given by Yang and Soo-Jong Lee. The paper describes a technique that helps to
diagnose rotating machinery faults. In this, a novel ensemble classifier constructs
a significant number of decision trees. Even though there exist many fault diagnosis
techniques, the random forest methodology is considered better because of its
execution speed. Here, randomness in the form of bagging (bootstrap aggregating) is used,
a meta-algorithm that enhances classification [17]. However, a minor change
in the training set in a randomized procedure can trigger a major difference between
the component classifier and the classifier trained on the whole dataset.
One proposal is made by Ramon Casanova, Santiago Saldana, Emily Y. Chew,
Ronald P. Danis, Craig M. Greven and Walter T. Ambrosius in their paper on the implementation
of random forest methods for diabetic retinopathy analysis. Early detection of
diabetic retinopathy can prevent the chances of becoming blind. The approach, applied
to 3443 participants in the ACCORD-Eye analysis, uses random forest and logistic
regression classifiers on graded fundus photographs and systemic results. They
concluded that RF-based models provided higher classification accuracy
than logistic regression [18]. The result suggests that the random forest method can
be one of the better tools for diagnosing diabetic retinopathy and also for evaluating
its progression. However, different degrees of retinopathy are not evaluated here.
Even though there exist many applications of feature selection in the big data
area, it has not yet been used in distributed databases for vertical fragmentation, to
the best of our knowledge.
3 Proposed Method
Step 2: Create an access frequency matrix of queries for each class for each site.
Step 3: Using the access frequency and method usage matrices, the affinity matrix is
determined.
Step 4: The clustered matrix is built from the affinity matrix.
Step 5: A partitioning algorithm is used to obtain the partitions.
The partition point is the point that divides the attributes into separate classes so that
multiple sites can be allocated. Two-way partitioning was done, i.e., the
attributes are divided into two classes to be assigned to two locations. Attributes to the left
of the partition point belong to one site while attributes to the right belong to another
site [20].
The fragments produced using the BEA can be allocated to various nodes of the
distributed database using the allocation algorithm.
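The affinity-matrix and two-way partitioning steps can be sketched as follows; the usage matrix, access frequencies, and the simple split criterion are illustrative assumptions and do not reproduce the full bond energy algorithm with its clustering step.

```python
import numpy as np

# Hypothetical inputs: use[q][a] = 1 if query q uses attribute a; acc[q][s] = frequency of q at site s.
use = np.array([[1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 1, 0, 0]])
acc = np.array([[15,  5],
                [ 5, 20],
                [10,  0]])

# Attribute affinity: total co-access frequency of each attribute pair over all queries and sites.
q_freq = acc.sum(axis=1)                  # total frequency of each query
aff = (use * q_freq[:, None]).T @ use     # aff[i, j] = sum of freq over queries using both i and j

def best_split(aff):
    """Naive two-way partition: try every split point of the (already clustered)
    attribute order and keep the one minimizing cross-fragment affinity."""
    n = aff.shape[0]
    costs = [aff[:k, k:].sum() for k in range(1, n)]
    k = int(np.argmin(costs)) + 1
    return list(range(k)), list(range(k, n))

left, right = best_split(aff)
print("fragment 1 attributes:", left)
print("fragment 2 attributes:", right)
```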
Experimentation with our proposed method is done using various parameters such as time,
number of fragments, and the average number of dimensions in each fragment.
For experimentation purposes, five datasets have been taken from the UCI repository.
Details of the datasets are given in Table 1. It is seen that the complexity and space consumption
are reduced by using a feature selection method.
The time taken for fragmenting the dataset with and without feature selection is shown
in Fig. 3. As seen from the graph, when high-dimensional data is reduced to low-dimensional
data using feature selection, the fragmentation time is also reduced. As
the dimensionality of big data increases, a considerable reduction in fragmentation
time will be obtained if irrelevant or dependent features are removed before fragmentation.
Table 2 Number of fragments and average number of dimensions in each fragment

Dataset   Number of fragments   Average no. of dimensions in each fragment
D1        10                    5
D2        18                    6
D3        22                    8
D4        28                    9
D5        32                    10
5 Conclusion
References
Abstract The phases of life have been constantly developing. From the bicycle to the
fastest car, the future is led by the latest breakthroughs in the fields of science,
medicine, space research, marine exploration, and many more. One such breakthrough
is robotics. People are familiarized with robots by watching them on television
and computers, and less often in real life. Robotics revolutionizes the very purpose
of humans and their needs. Based on the panorama obtained, this paper presents
profound research and applications made in the field of robotics, which put forward
machine dominance in industry.
1 Introduction
Robots can leave constrained industrial environments and reach out to unexplored and
unstructured areas, enabling extensive applications in the real world with substantial utility.
Throughout history, there has always been a forecast about robotics thriving and being
able to manage tasks and mimic human behavior. Today, as technological advances
continue, researching, designing, and building new robots serve various practical
purposes in fields like medicine, space research, marine exploration, manufacturing
and assembling, data analytics, armory, and so on.
In fields like manufacturing, assembly, and medical and surgical implementations,
robotics essentially minimizes human flaws and increases accuracy. On the other
hand, in fields like space and marine exploration, robotics makes it possible for us to
reach unbelievable heights in areas that are practically impossible to reach. With the
existing technologies, various applications have already been made. However, the
future of robotics has a lot in store.
The use of robotics in the medical sector has been constantly upgraded to meet the
accuracy and demand in surgeries. A 16-segment biomechanical model [1] of the
human body is made, and its 3D realization is done using SolidWorks
to facilitate movement according to the task; for an arm or any limb
to move like a real one, one should know the geometric and mass-inertial characteristics
of the body segments. To gain an overview of these properties, a mathematical
model which predicts the inertial properties of the human body in any fixed body
position (sitting) is made, and the model is used to develop a design. The model is
used to determine human behavior in space, ergonomics, criminology, and other
areas.
Brain–Machine Interface (BMI) [2] is interactive software which helps in
communication with the robot and the environment. It can be used for a wide range
of patients. The information is taken from the user's electroencephalographic (EEG)
signals (Fig. 1), and the interface adapts accordingly to the user's daily requirements by providing
almost the same inputs as with a real limb.
In the European Commission (EC), the directorate general for information society
and media is promoting the use of technology that has been proven useful in the
health care sector [3].
A brain tumor is a deadly chronic disease, be it for a child or an adult. The most
efficient way of locating a tumor is with the help of Magnetic Resonance Imaging
(MRI). MRI, in coordination with robotics, has a better scope of success [4].
For example, a tumor may be missed and may spread to different parts of the
body in ways that are complicated for the naked eye and present equipment to detect.
But with the help of continuum robots, this risk is greatly reduced.
Microbots [5] are deployed into the affected area, providing a first-person view,
and are efficient in taking decisions and performing the required activity accordingly,
where human caliber and approach cannot reach. The robot uses
distances. Yet, it is more than effectively used on earth. Not to forget, there is a slight
communication gap between the surgeon and the patient.
A very recent invention in the field of robotics is Xenobots [based on https://
en.m.wikipedia.org/wiki/Xenobots], living and self-healing robots made from the
stem cells of a frog (Xenopus laevis). It is a completely different species of robots,
small enough to travel in the human body. They can go without food or water for
about a month or two. Generally, tiny robots made of iron and plastic are harmful once
they decay in the body. On the contrary, Xenobots are largely degradable compared
to the other ones. They can be used for targeted drug delivery or elimination of
disease at a microscopic level.
Space is another milestone that mankind has achieved in the previous decade. But
space can be as challenging as a game of shogi: without a proper strategy and
plan, it is very complicated to cruise through. In places where there might be a risk for
human landing, robots can be used as testbeds [13] to test the landing site before
any human interaction. Humanoid robots [14] can be used for unmanned and deep
space missions and to retrieve information. The main objective is for the robot to be
able to perform the following actions:
1. Communication dish: Adjust pitch and yaw alignment of a communication dish.
2. Solar array: Retrieve, deploy, and connect a new solar panel to the existing solar
array.
3. Air leak: Climb the stairs to a habitat entrance, enter the habitat, find an air leak
using a leak detector tool, and repair the air leak using a patch.
DRC-Hubo [14], a humanoid robot, can recognize the color and position of an LED
displayed on a console panel, press the button that opens the door, and
walk through the doorway.
Not all astronauts are blessed to be customers of the robots on the International
Space Station (ISS). People on the ISS face many problems such as repetitive work and
difficulty performing their experiments in microgravity. To overcome this problem,
NASA built a robot named Astrobee (Fig. 2) [15], which can perform repetitive
tasks, carry loads, and provide a platform for the spacemen to conduct their research.
It has a hinge that can cling on to the rails of the ISS, thereby increasing the standby
time and reducing the fuel consumed to oppose gravity. It is more than appealing
to have a robot co-worker or a co-pilot by your side.
Many people may have watched films where a robot steers the space shuttle
and later retrieves information where there is no possibility of human presence. Among
many such efforts, ESA's METERON SUPVIS Justin is an experiment in which astronauts
on board the International Space Station (ISS) commanded the humanoid robot Rollin'
Justin [16] in a simulated Martian environment on earth. This type of operation is
highly recommended for spacemen, as they have trouble controlling their motor functions in
microgravity, apart from their mental load due to any uninvited problems on board.
Another advantage is that the robot can be freely controlled by a variety of User
Interface (UI) devices, such as a tablet.
Ever feared that one day debris or a spaceship would come crashing into the atmosphere?
One shouldn't freak out, because humans have been smart enough to come up with a
countermeasure: a machine fitted with a kinematically redundant robot [17]
whose main principle is based on target motion prediction. It moves based on a reference
trajectory provided by ground control, constantly corrects its trajectory with
the help of a tracking controller, and finally takes the grasp. The duration of grasping
is selected based on the initial contact forces that pass through the center of mass
of the chaser (the robot which grabs the target) to minimize the aftereffects caused
by a change in altitude. It either delivers the object down to earth or may set it back
in its trajectory, thereby reducing ballistic impact. This method can also be used to
remove space debris from the earth's orbit.
Space is better known than our waters. Half our oceans remain unexplored, and
there might be a cure for every disease that has struck mankind, lost knowledge, and
much more right under our noses. The things that strike our mind when we think of water are
vessels, ships, and boats. Of all the ways, the worst way to lose your vessel is to let
it sink in the deep waters. The hull is the main part of the ship, and its design is the only
way it can float and resist friction. Due to humidity and constant cruising in
the waters, rust sets in, which in turn corrodes the metal, leading to leakages in the
vessel and resulting in sinking. To counter this problem, researchers have come
up with the idea of a swarm of deep-water robots [18] which detect breaches in
the hull and notify the crew on board in emergencies. They form a cluster and
rearrange themselves in the area of the water infiltration. They are the physical
representation of the quote "United we stand, divided we fall." They have a higher
resistance to sensor noise, and the probability of the robotic population going haywire
is near zero, thereby reducing casualties and economic losses in the marine industry.
Deep marine exploration has been possible only due to the use of robots and
their ability to transfer information among themselves and take the necessary action.
Crabster (CR200) [19] is a six-legged crab-like robot (Fig. 3) made for deep-sea
exploration that can withstand turbidity and underwater waste. It has currently been
tested in a water tank simulated to a scenario of wild sea currents.
OceanRINGS [20] is a platform of technologies which can be associated with
almost any Remotely Operated Vehicle (ROV), independent of its size. Tests were
conducted with different support vessels off the North, South, and West coasts of
Ireland, in Donegal, Bantry Bay, Cork Harbor, Galway Bay, the Shannon Estuary, and
La Spezia, Italy. It also provides a prototype real-time communication
system with the ground. It is based on the principle of remote presence technology.
Marine energy (oil and gas) is as important as oxygen in our
lives, making our lives possible. Sometimes, the oil or gas forms offshore, and for
such purposes, OceanRINGS has put forward the idea of building robotic systems
for the inspection of offshore sub-sea marine renewable energy installations [21]. They are capable
of resisting extremely harsh weather conditions and send information related to the
location and amount of reserves to the Virtual Control Cabin (VCC) on the ground,
making this energy available to the population. This smart technology could
lead to significant savings in time, maintenance, and operational costs.
In the manufacturing industry, human hours are not so efficient. Humans prefer
to work in a safe and flexible environment but cannot always be provided with that
luxury [22]. Therefore, replacing them with robots increases efficiency and production
compared to human labor. Robots reduce the cost of manufacturing. They can carry out
work for up to 12 hours straight and can rectify many human errors and mistakes
in quality control, thereby proving themselves fitter for the job than humans. For
example, if a person needs to lift an object of about 25 kg, he or she will experience
pain in the back.
People tend to forget the places where they put their things. To improve the stacking and
retrieving of things, a robot has been developed which stacks [23] the required
object at a particular place and at a particular time and makes a note of it,
later using this information to retrieve the object from where it was placed.
The Statue of Liberty was gifted by the French to the United States on account of
their independence, but it was imported in parts by ships. It would have certainly
taken about four months or so just to assemble it. Imagine if it were to be gifted in
this era, where there are constraints on space and labor. With these limitations in
mind, a new approach is put forward where the fixtures and the tooling are all taken
care of by coordinated mobile robots [24]. The mobility of the robots results in
a re-evaluation of the assembly space and reduces the cost and effort of labor. Sometimes
robots need to undergo complex coordination to get the parts and the tools to the
designated place and time to obtain the desired assembly. The assembling process
is cut down to these four basic points: (1) mobile manipulator hardware design,
(2) fixture-free positioning, (3) multi-robot coordination, and (4) real-time dynamic
scheduling.
Additive manufacturing [based on https://en.m.wikipedia.org/wiki/3D_printing],
commonly known as 3D printing, has greatly influenced the mass production
market. It is used to manufacture complex parts which are difficult to manufacture
and produce by other means (Fig. 4). There is a bundle of advantages when it comes to 3D
printing, which includes:
1. Rapid prototyping: As the name suggests, it aids faster production. It takes just
hours to produce a part, unlike other typical methods which may take
days.
2. A quick analyzing technique: Manufacturing an experimental product to check
its properties and functions gives an awareness of the pros and cons
of the product before going for large-scale production.
3. Waste reduction: Only the material needed for the product is used, and the
remaining material can be used later.
4. Customization: Every product designed can be customized in size, shape, color, and
structure.
5. Precision: In some fields of work, a millimeter plays an important role in a
machine's efficiency. For example, springs in watches are of very small size, and
they require great time and precision to craft by other means. Here, however,
it is done with pinpoint accuracy and in a short time.
One of the most budding and enriching technologies, on par with both AI
and robotics, is big data together with cloud, dew, and fog computing
[25]. Robots have advanced to such a state where all the information and data
are stored, verified, and then sent to the user. To store and execute such large data and
algorithms, robots need much larger storage space than hard drives alone. This is
where cloud and fog come into the picture. With their immense storage, calculations and
executions of functions are performed at higher speeds to meet the demands of the
growing population and the corporate world. C2RO (Collaborative Cloud Robotics)
[26] is a cloud platform that uses stream processing technology to connect the
city to mobile devices and sensors. This technology boosts the intelligence of robots
to a larger scale by enabling them to perform complicated tasks such as simultaneous
During a natural calamity or disaster, loss of human life is most of the time
inevitable. In such situations, drones can carry out search and rescue missions. The
safest way to get in or out of a forest is to follow the existing trail generally made
by hikers and mountaineers. The robot needs to look for the trail and then make
an effort to stay on it. A machine learning approach to the visual perception
of forest trails and gathering information is made by training a neural network
with various real-world datasets; operating on a single image, the
system outputs the main direction of the trail compared to the viewing direction. The
probable direction is determined using Deep Neural Networks (DNNs) as an image
classifier [27], which operates by reading the image's pixels. It is used to determine
actions and avoid obstacles in the wilderness. It is mainly made to navigate in places
where humans cannot reach with their existing approach. Finding people who have
lost their way in dense forests or rugged terrain might not be completely
impossible for humans, but for a robot the size of an arm it might be a cakewalk.
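As a toy illustration of the idea of a three-class direction classifier over raw pixels (turn left, go straight, turn right), the sketch below uses a simple scikit-learn multilayer perceptron on synthetic images; it is not the DNN architecture of [27], and the image size, labels, and network shape are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for trail images: 32x32 grayscale frames, flattened to pixel vectors.
X = rng.random((300, 32 * 32))
y = rng.integers(0, 3, size=300)          # 0 = turn left, 1 = go straight, 2 = turn right

clf = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300, random_state=0)
clf.fit(X, y)

# At run time, each camera frame would be flattened and classified into a steering direction.
new_frame = rng.random((1, 32 * 32))
print("predicted direction class:", clf.predict(new_frame)[0])
```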
War might not be a good thing to focus our views on, but it presents an opportunity for
introducing unmanned vehicles controlled by robots to reduce casualties on a large
scale. Unmanned Aerial Systems (UAS) [28], or simply drones, have
proven their worth in unmanned missions. They can substitute for missiles and
torpedoes, thereby reducing sacrificial and suicidal missions. Another alternative
is the use of robotic soldiers, who have higher endurance and strength and are capable
of battle for longer durations, i.e., robotic ammunition, etc.
Teaching today's kids, tomorrow's future, is more than important. Robotics can turn
ideas into reality. Inculcating it in the curriculum [29], both for schools and universities,
will steer future generations toward this budding field. If the teacher is
a robot, then students will make an effort to listen. By grabbing their attention,
which in turn results in a proper academic and social career, robots can be of great
help. Robots can train athletes and sportsmen toward glory. They can instruct
pupils with the help of speech, the most effective form of communication. This feat
is achieved by using an Artificial Neural Network (ANN) [30]. It can follow simple
instructions as of now, making robotics useful in all walks of life.
2 Conclusion
The above literature review gives a basic idea of robotics in our daily lives and
its use in the long run. It also gives an overview of the existing and upcoming
technology in the vast field of robotics. Apart from the cited fields of usage, there
might be many other fields in which robotics is the fundamental building block.
Reaching the human level of intelligence and exposure is currently an issue, as
robots can perform only the tasks for which they are programmed. Yet, research to
achieve maximum human-like characteristics is still on the run.
References
1. Nikolova G, Kotev V, Dantchev D (2017) CAD modelling of human body for robotics
applications. In: 2017 international conference on control, artificial intelligence, robotics &
optimization (ICCAIRO), Prague, pp 45–50. https://doi.org/10.1109/ICCAIRO.2017.18
2. Schiatti L, Tessadori J, Barresi G, Mattos LS, Ajoudani A (2017) Soft brain-machine inter-
faces for assistive robotics: a novel control approach. In: 2017 International conference on
rehabilitation robotics (ICORR), London, pp 863–869. https://doi.org/10.1109/ICORR.2017.
8009357
3. Gelderblom GJ, De Wilt M, Cremers G, Rensma A (2009) Rehabilitation robotics in robotics for
healthcare; a roadmap study for the European Commission. In: 2009 IEEE international confer-
ence on rehabilitation robotics, Kyoto, 2009, pp 834–838. https://doi.org/10.1109/ICORR.
2009.5209498
4. Kim Y, Cheng SS, Diakite M, Gullapalli RP, Simard JM, Desai JP (2017) Toward the devel-
opment of a flexible mesoscale MRI-compatible neurosurgical continuum robot. IEEE Trans.
Rob. 33(6):1386–1397. https://doi.org/10.1109/TRO.2017.2719035
5. Ongaro F, Pane S, Scheggi S, Misra S (2019) Design of an electromagnetic setup for independent
three-dimensional control of pairs of identical and nonidentical microrobots. IEEE Trans Rob
35(1):174–183. https://doi.org/10.1109/TRO.2018.2875393
6. Schlenk C, Schwier A, Heiss M, Bahls T, Albu-Schäffer A (2019) Design of a robotic instrument
for minimally invasive waterjet surgery. In: 2019 International symposium on medical robotics
(ISMR), Atlanta, GA, USA, pp 1–7. https://doi.org/10.1109/ISMR.2019.8710186
7. Stogl D, Armbruster O, Mende M, Hein B, Wang X, Meyer P (2019) Robot-based training for
people with mild cognitive impairment. IEEE Robot Autom Lett 4(2):1916–1923. https://doi.
org/10.1109/LRA.2019.2898470
8. Brun C, Giorgi N, Gagné M, Mercier C, McCabe CS (2017) Combining robotics and virtual
reality to assess proprioception in individuals with chronic pain. In: 2017 International confer-
ence on virtual rehabilitation (ICVR), Montreal, QC, pp 1–2. https://doi.org/10.1109/ICVR.
2017.8007491
9. Meghdari A, Alemi M, Khamooshi M, Amoozandeh A, Shariati A, Mozafari B (2016) Concep-
tual design of a social robot for pediatric hospitals. In: 2016 4th international conference on
robotics and mechatronics (ICROM), Tehran, pp 566–571. https://doi.org/10.1109/ICRoM.
2016.7886804
10. Florez JM et al (2017) Rehabilitative soft exoskeleton for rodents. IEEE Trans Neural Syst
Rehabil Eng 25(2):107–118. https://doi.org/10.1109/TNSRE.2016.2535352
11. Sun T et al (2017) Robotics-based micro-reeling of magnetic microfibers to fabricate helical
structure for smooth muscle cells culture. In: 2017 IEEE international conference on robotics
and automation (ICRA), Singapore, 2017, pp 5983–5988. https://doi.org/10.1109/ICRA.2017.
7989706
12. Takács Á, Jordán S, Nagy DÁ, Tar JK, Rudas IJ, Haidegger T (2015) Surgical robotics—
born in space. In: 2015 IEEE 10th Jubilee international symposium on applied computational
intelligence and informatics, Timisoara, pp 547–551. https://doi.org/10.1109/SACI.2015.720
8264
13. Backes P et al (2018) The intelligent robotics system architecture applied to robotics testbeds
and research platforms. In: 2018 IEEE aerospace conference, Big Sky, MT, 2018, pp 1–8.
https://doi.org/10.1109/AERO.2018.8396770
14. Tanaka Y, Lee H, Wallace D, Jun Y, Oh P, Inaba M (2017) Toward deep space humanoid
robotics inspired by the NASA space robotics challenge. In: 2017 14th international conference
on ubiquitous robots and ambient intelligence (URAI), Jeju, pp 14–19. https://doi.org/10.1109/
URAI.2017.7992877
15. Yoo J, Park I, To V, Lum JQH, Smith T (2015) Avionics and perching systems of free-flying
robots for the International Space Station. In: 2015 IEEE international symposium on systems
engineering (ISSE), Rome, pp 198–201. https://doi.org/10.1109/SysEng.2015.7302756
16. Schmaus P et al (2020) Knowledge driven orbit-to-ground teleoperation of a Robot coworker.
IEEE Robot Autom Lett 5(1):143–150. https://doi.org/10.1109/LRA.2019.2948128
17. Lampariello R, Mishra H, Oumer N, Schmidt P, De Stefano M, Albu-Schäffer A (2018)
Tracking control for the grasping of a tumbling satellite with a free-floating robot. IEEE Robot.
Autom. Lett. 3(4):3638–3645. https://doi.org/10.1109/LRA.2018.2855799
18. Haire M, Xu X, Alboul L, Penders J, Zhang H (2019) Ship hull inspection using a swarm of
autonomous underwater robots: a search algorithm. In: 2019 IEEE international symposium on
safety, security, and rescue robotics (SSRR), Würzburg, Germany, 2019, pp 114–115. https://
doi.org/10.1109/SSRR.2019.8848963
19. Yoo S et al (2015) Preliminary water tank test of a multi-legged underwater robot for seabed
explorations. In: OCEANS 2015—MTS/IEEE Washington, Washington, DC, 2015, pp 1–6.
https://doi.org/10.23919/OCEANS.2015.7404409
20. Omerdic E, Toal D, Dooly G (2015) Remote presence: powerful tool for promotion, education
and research in marine robotics. In: OCEANS 2015—Genova, Genoa, 2015, pp 1–7. https://
doi.org/10.1109/OCEANS-Genova.2015.7271467
21. Omerdic E, Toal D, Dooly G, Kaknjo A (2014) Remote presence: long endurance robotic
systems for routine inspection of offshore subsea oil & gas installations and marine renewable
energy devices. In: 2014 oceans—St. John’s, NL, 2014, pp 1–9. https://doi.org/10.1109/OCE
ANS.2014.7003054
22. Hirukawa H (2015) Robotics for innovation. In: 2015 symposium on VLSI circuits (VLSI
circuits), Kyoto, 2015, pp T2–T5. https://doi.org/10.1109/VLSIC.2015.7231379
23. Chong Z et al (2018) An innovative robotics stowing strategy for inventory replenishment
in automated storage and retrieval system. In: 2018 15th international conference on control,
automation, robotics and vision (ICARCV), Singapore, pp 305–310. https://doi.org/10.1109/
ICARCV.2018.8581338
24. Bourne D et al (2015) Mobile manufacturing of large structures. In: 2015 IEEE international
conference on robotics and automation (ICRA), Seattle, WA, pp 1565–1572. https://doi.org/
10.1109/ICRA.2015.7139397
25. Botta A, Gallo L, Ventre G (2019) Cloud, fog, and dew robotics: architectures for next gener-
ation applications. In: 2019 7th IEEE international conference on mobile cloud computing,
services, and engineering (MobileCloud), Newark, CA, USA, pp 16–23. https://doi.org/10.
1109/MobileCloud.2019.00010
26. Beigi NK, Partov B, Farokhi S (2017) Real-time cloud robotics in practical smart city appli-
cations. In: 2017 IEEE 28th annual international symposium on personal, indoor, and mobile
B. R. Arun Kumar
Abstract Artificial Intelligence (AI) techniques are applied to customer data, which
can be analyzed to anticipate customer behaviour. AI, big data and
advanced analytics techniques can handle both structured and unstructured data efficiently,
with greater speed and precision than regular computer technology, which elicits
Digital Marketing (DM). AI techniques make it possible to construe emotions and connect
like a human, which has made prospective AI-based DM firms think of AI as a 'business
advantage'. The situation of marketers being data rich but insight poor is no longer inevitable, thanks to AI
tools which optimize marketing operations and effectiveness. This paper highlights the
significance of applying AI strategies to effectively reach the customer in terms of
understanding their behaviour to find their expectations regarding product features, operations,
maintenance, delivery, etc., using machine learning techniques. It highlights
that such strategies steer digital marketing towards customer need-based business.
1 Introduction
Digital Marketing (DM) involves promotional efforts that use an electronic device or
the Internet, utilizing digital channels such as search engines, social media, email and websites. DM, which uses electronic devices and the Internet to connect
to current and prospective customers, can also be denoted as 'online marketing',
'Internet marketing' or 'web marketing'.
Online marketing strategies implemented using the Internet and its related
communicating hardware/software devices and technologies can be referred to as digital
marketing.
Redefining the strategy is essential to broaden the reach of the brand whenever a new product or service is introduced. Goals may be re-established to build brand awareness and goodwill among customers using digital tools. Changes in goals or strategies require corresponding changes in the action plan implemented on digital platforms (Fig. 2).
Reaching customers through the Internet, electronic gadgets such as smartphones, social media and search engines, and understanding customer behaviour and preferences by applying analytic tools and interpreting their results, make Digital Marketing (DM) a comprehensive, emerging and dynamic domain that is quite different from traditional marketing. Several studies have projected that 85% of customer-business relationships will be maintained using AI tools [2], and the AI market is estimated to be worth $9.88 billion by 2022.
Coviello, Milley and Marcolin define e-Marketing as ’Using the Internet and
other interactive technologies to create and mediate dialogue between the firm and
identified customers’. DM is the broad term that makes use of different marketing
strategies/tools, namely website, email, Internet, content, video, smartphone, PPC
advertising and SMS messaging. Along with these digital strategies and tools, the following basic guidelines, which form the core of DM, are worth recalling as essential starting points.
It is difficult and challenging for a particular business website to appear as a top-ranked result on the Search Engine Results Page (SERP) among nearly 14 billion searches per month worldwide; therefore, all DM strategies, including social media, should be optimized.
Despite enhanced digital marketing [5] strategies being in place, their efficiency can be further improved using contemporary technologies such as AI to understand emotions and behaviour and to respond to customers' queries. AI computing could optimize DM strategies at all cognitive levels.
Machine Learning (ML), that is, teaching the machine to learn, is a subset of AI that can offer customized inputs for marketing specialists. Deep Learning (DL) is a subclass of ML composed of very large neural networks and an extensive pool of algorithms that can replicate aspects of human intelligence.
Google's direct answers are driven by ML, while its 'people also ask' section is powered by DL. Google continuously learns and mirrors human intelligence without the need for humans to feed all the answers into its enormous database.
This paper analyses the role of DM and of AI-ML/DL in stimulating business by identifying and responding to customers' tastes, and it highlights the role of artificial intelligence, ML and DL tools in digital marketing. The paper is narrative in nature; the information and examples presented are based on the available references and on secondary sources. The study motivates business enterprises to adopt AI-ML/DL techniques to optimize their digital marketing strategies.
This research is carried out with the primary objective of exploring AI-based DM and demonstrating the significance of contemporary technologies such as AI, big data, data analytics and deep learning for marketing products and services.
2 Impact of AI and ML on DM
DM strategies and tools based on AI-ML can streamline the market, optimizing both business profit and user-experience satisfaction. The future of DM depends on the ability of DM professionals to apply AI-ML techniques to implement DM strategies effectively.
AI and ML are separate yet complementary to each other. As mentioned in [10], 'AI aims to harness certain aspects of the "thinking" mind, Machine Learning (ML) is helping humans solve problems in a more efficient way. As a subset of AI, ML uses data to teach itself how to complete a process with the help of AI [11] capabilities'. AI-ML tools [12] can extract hidden business intelligence from the given consumer data, which streamlines complex DM problems. Although it is difficult to draw definitive conclusions about the implications of ML techniques, ML has clearly started to create an impact on DM [3].
This is because ML tools can analyze extremely large datasets and present visualizations tailored to the requirements of the DM team, supporting decisions that streamline strategies. By applying ML tools, the resulting analytics enable marketers to understand their customers in depth. It may be noted that 75% of DM strategy development has already adopted AI functionality, and 85% of customer interactions can be managed effectively without human intervention [10]. This implies that ML tools can streamline DM strategies and that businesses can align themselves with future AI-ML trends [10]. Several research works and articles have upheld artificial intelligence, ML and DL-based approaches for digital marketing, including [2, 13]. It has also been found that 90% of sales professionals expect a substantial impact of AI on sales and marketing [14].
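As a minimal illustration of the kind of customer insight such tooling can surface, the sketch below clusters a small, invented customer dataset with scikit-learn's k-means; it is not taken from the reviewed works, and the feature names and figures are purely hypothetical.

# Hypothetical sketch: segmenting customers so a DM team can target each group.
# The data and feature names are invented; real pipelines would use CRM exports.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Each row: [annual spend, site visits per month, email click-through rate]
customers = np.array([
    [1200, 25, 0.40],
    [150, 2, 0.05],
    [900, 18, 0.35],
    [200, 3, 0.02],
    [1100, 30, 0.50],
    [180, 1, 0.04],
])

# Scale features so spend does not dominate the distance metric.
features = StandardScaler().fit_transform(customers)

# Group customers into two segments (e.g. "engaged" versus "dormant").
segments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

for row, seg in zip(customers, segments):
    print(f"spend={row[0]:.0f}, visits={row[1]:.0f}, ctr={row[2]:.2f} -> segment {seg}")

Each resulting segment can then be addressed with its own campaign, which is the kind of decision support the ML-driven analytics described above are meant to provide.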
from telecommunications to banking [17, 18]. Since customers enjoy interacting with digital humans, AI-based DHs can impact digital marketing: they work efficiently, keep learning from their experience and reduce costs as well. When digital services, especially AI-powered digital humans, are developed to meet customer expectations, customers prefer them too.
A 'Digital Human' is the embodiment of an underlying AI-based Chatbot or digital assistant with additional capabilities such as emotional intelligence. Like a natural human, it can connect with individual people, understand tone of expression and body language and give relevant, appropriate responses. For example, patients can take assistance from digital humans, with individual empathy, to understand their medical problems and how to follow their prescription and diet [19].
A visually realistic AI machine in a human avatar can blink, wink, move its lips, smile and treat customers with empathy; the abilities of an intelligent corporate digital human are highly convincing because of its modes of persuasion in handling customer-centric services. Compared with Chatbots, DHs can persuade through logos, ethos and pathos. Digital assistants work 24/7 and never get bored or tired. DHs combine multiple advanced technologies that can understand the user's emotion, mood and personality [19] (Fig. 6).
Table 3 Mistakes to be avoided during DM streamlining [3]

Sl. no.  Mistakes to be avoided
1        Affecting generic and broad customer characters
2        Working with inadequate customer data
3        Neglecting performance of previous marketing campaigns
4        Not addressing regular and returning customers
5        Generating and dissemination of irrelevant content
6        Too much dependence on gut feeling
The global machine learning market is expected to grow from $1.41 billion in
2017 to $8.81 billion by 2022, at a Compound Annual Growth Rate (CAGR) of
44.1% [3].
Forthcoming DM will include AI-ML-based smart automation solutions, the details of which are given in Fig. 7. AI-ML-based marketing strategies must avoid the mistakes shown in Table 3.
It can be concluded that ML/DL, based on extensive data processing, offers the information essential to the decision-making process of marketing specialists. The application of ML-driven tools to digital marketing [21, 22] introduces various new challenges and opportunities. Implementing ML in marketing analytical tools has no obvious disadvantages [9].
3 Conclusion
boost DM. They include AI-assisted professional website development, audience selection, content crafting services, creating and customizing content, Chatbots, customer service, email marketing, predictive analysis and marketing, and AI recommendations for engaging targeted customers. Future developments in AI, coupled with ML and DL tools, will address the concerns or limiting factors, if any, of the current tools.
References
1. https://www.deasra.in/msme-checklist/digital-marketing-checklist/?gclid=EAIaIQobChMI
rLf4mJzM6wIV3sEWBR1f0AXnEAAYASAAEgIENPD_BwE
2. https://www.toprankblog.com/2018/03/artificial-intelligence-marketing-tools/
3. https://www.grazitti.com/blog/the-impact-of-ai-ml-on-marketing/
4. https://www.analyticsvidhya.com/blog/2017/04/comparison-between-deep-learning-mac
hine-learning/
5. https://quanticmind.com/blog/predictive-advertising-future-digital-marketing/
6. Artificial Intelligence in Action: Digital Humans, Monica Collier Scott Manion Richard de
Boyett, May 2019. https://aiforum.org.nz/wp-content/uploads/2019/10/FaceMe-Case-Study.
pdf
7. Artificial intelligence. https://en.wikipedia.org/wiki/Artificial_intelligence
8. Talib MA, Majzoub S, Nasir Q et al (2020) A systematic literature review on hardware imple-
mentation of artificial intelligence algorithms. J Supercomput. https://doi.org/10.1007/s11227-
020-03325-8
9. Miklosik A, Kuchta M, Evans N, Zak S (2019) Towards the adoption of machine learning-based
analytical tools in digital marketing. https://doi.org/10.1109/ACCESS.2019.2924425
10. https://digitalmarketinginstitute.com/blog/how-to-apply-machine-learning-to-your-digital-
marketing-strategy
11. https://www.smartinsights.com/managing-digital-marketing/how-ai-is-transforming-the-fut
ure-of-digital-marketing/
12. https://www.superaitools.com/post/ai-tools-for-digital-marketing
13. https://www.researchgate.net/publication/330661483_Trends_in_Digital_Marketing_2019/
link/5c4d3d6f458515a4c743467e/download
14. Top Sales & Marketing Priorities for 2019: AI and Big Data, Revealed by Survey of 600+ Sales
Professionals Business Wire|https://www.businesswire.com/news/home/20190129005560/en/
Top-Sales-Marketing-Priorities-2019-AI-Big
15. https://www.educba.com/seo-in-digital-marketing/
16. https://www.prnewswire.com/news-releases/machine-learning-market-worth-881-billion-
usd-by-2022-644444253.html
17. Digital Humans; the rise of non-human interactions, Jody shares. https://www.marketing.org.
nz/Digital-Humans-DDO18
18. Customers’ lives are digital-but is your customer care still analog? Jorge Amar and Hyo Yeon,
June 2017. https://www.mckinsey.com/business-functions/operations/our-insights/customers-
lives-are-digital-but-is-your-customer-care-still-analog
19. In 5 years, a very large population of digital humans will have hundreds of millions of conversa-
tions every day by Cyril Fiévet. https://bonus.usbeketrica.com/article/in-5-years-a-very-large-
population-of-digital-humans-will-have-hundreds-of-millions-of-conversations-every-day
20. https://ieeexplore.ieee.org/stamp/stamp.?arnumber=8746184
21. https://cio.economictimes.indiatimes.com/news/strategy-and-management/ai-digital-market
ing-key-skills-to-boost-growth/71682736
22. https://www.singlegrain.com/seo/future-of-seo-how-ai-and-machine-learning-will-impact-
content/
NoRegINT—A Tool for Performing
OSINT and Analysis from Social Media
Abstract A variety of incidents occur, and the Open-Source Intelligence (OSINT) tools on the market are capable of collecting only specific target data, and even that only to a limited extent. Our tool NoRegINT has been developed specifically to collect a theoretically unlimited amount of data based on keyword terms and to draw a variety of inferences from it. The tool is used here to gather information in a structured format about the Pulwama attacks and to draw inferences such as the volume of data, the general sentiment of people about the incident and the impact of a particular hashtag.
1 Introduction
Open-Source Intelligence (OSINT) is the method of obtaining data and other relevant information about a specific target, usually, but not limited to, a person, an e-mail ID, phone numbers, IP addresses or a location. It makes use of openly available information, generally without the direct involvement of said target. OSINT is generally
achieved through many tools which automate certain processes, although a prelimi-
nary analysis can be performed through manual processes. In this paper, a new tool
is proposed to collect data from various sources such as Twitter, Reddit and Tumblr
and draw inferences from it [1].
It is a well-known fact that social media is now an integral part of everybody’s life
and more often than not, people tend to post a variety of things on their social media
accounts such as details about their personal lives, their opinions about a particular
entity or incident and also pictures of themselves. This makes data collection from
such sources an extremely important task and essentially forms the core of fields
such as Open-Source Intelligence (OSINT) [2, 3].
The motivation behind the proposed work is that, although a large number of tools have been developed in the field of OSINT and have been very useful, they have various shortcomings. A large number of them are simple wrappers built around the Application Programming Interface (API) provided by the particular social media website [4, 5]. There is always an upper limit on the number of requests and, therefore, by extension, on the amount of data that can be collected. Other web scraping tools also limit the amount of data collected because of the infinite scrolling used on these web pages. Furthermore, the data is often unstructured and is not followed by any analysis. Such tools leave a lot of work to the end user and therefore must be followed by extensive cleaning and analysis [6, 7].
In this paper, a tool built upon web scraping is proposed to collect publicly available data from social media websites without facing difficulties such as a cap on the amount of data or dependency on any API. This tool not only overcomes the problem posed by infinitely scrolling web pages, but also provides post-collection temporal analysis and sentiment analysis that can be used to study the social media response to incidents such as 9/11 or the Pulwama attack. The collected data can be of any type, textual, pictorial, etc. The social media websites targeted in this paper are Twitter, Reddit and Tumblr [8–10].
In the remainder of the paper, Sect. 2 elaborates on related work in this area, and Sect. 3 discusses the methodology. Section 4 analyses the results and compares them with other available APIs, and Sect. 5 draws conclusions and discusses the future scope of the proposed tool.
2 Related Works
This section discusses the various APIs and wrappers existing in the research area of OSINT.
The Twitter API is an interface provided by the company itself to support the integration of its service into other applications. It is not a data collection tool but rather an alternative way to access one's account and perform actions from there. Although one can post data and perform various actions such as following/unfollowing and posting content, it is not an effective data collection tool: the number of requests is limited, and its usage requires registration and proficient coding knowledge on the end user's part [2, 11].
This is an openly available tool built upon the Reddit developers’ API, which can
be used to perform various activities such as retrieving posts, metadata about posts,
ability to post content itself and also to upvote posts and follow other users [12].
However, the usage of this wrapper involves the hassle of registration as a developer
with Reddit and also requires the end user to be familiar with programming concepts
and the usage of OAuth. This prevents the wrapper from being a simple plug and
play OSINT tool [13].
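For illustration only, a minimal use of such a wrapper might look like the sketch below (the credential values are placeholders); the point is simply that developer registration and OAuth details are needed before any data can be fetched, and the API itself caps how much can be retrieved.

# Minimal PRAW sketch (placeholder credentials): even a simple keyword search
# requires registering an app with Reddit and supplying OAuth client details.
import praw

reddit = praw.Reddit(
    client_id="YOUR_CLIENT_ID",          # issued after developer registration
    client_secret="YOUR_CLIENT_SECRET",  # issued after developer registration
    user_agent="osint-demo/0.1",
)

# Search all of Reddit for a keyword; the API limits how much can be fetched.
for submission in reddit.subreddit("all").search("Pulwama", limit=25):
    print(submission.created_utc, submission.title)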
2.3 Spiderfoot
2.4 Maltego
Maltego is a commonly used data mining tool, mainly used to collect data about specific entities, which may be people, companies or websites. This data is visually represented as a set of connected graphs. Although it is one of the most effective OSINT tools on the market, it is still not the best choice for term-wise data collection or data collection about incidents. It can correlate and connect data, but it cannot draw conclusive inferences from the representations [14].
The proposed NoRegINT tool is designed to overcome the various existing API-based problems such as the limited date range, the amount of fetched content and the number of requests. Figure 1 presents the major modules involved in the framework of the proposed tool, namely the CLI module, the Twitter scraper, the Reddit scraper, the Tumblr image collector and the inference module.
The CLI module is the interface used by the end user. It provides a high level of abstraction and does not require any significant programming knowledge. It offers two main functionalities: the collection of data based upon a specified search term, and the inferences that can be drawn from the collected data.
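A command-line front end of this kind could be sketched as below; the option names are hypothetical and the actual NoRegINT interface may differ.

# Hypothetical CLI sketch: one flag to collect data for a keyword,
# one flag to run the inference step on previously collected data.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Keyword-based OSINT collector")
    parser.add_argument("keyword", help="search term, e.g. 'Pulwama'")
    parser.add_argument("--collect", action="store_true",
                        help="scrape Twitter/Reddit/Tumblr for the keyword")
    parser.add_argument("--infer", action="store_true",
                        help="report volume and sentiment of collected data")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.collect:
        print(f"collecting posts for '{args.keyword}' ...")   # scraper modules go here
    if args.infer:
        print(f"running inferences for '{args.keyword}' ...")  # inference module goes here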
The Twitter scraper is built upon the popular Python scraping package Beautiful Soup. However, using it together with the 'requests' library limited the amount of data that could be collected. To overcome this limitation, a browser instance is created in the background, and JavaScript code is executed to simulate scrolling movements, which theoretically provides an unlimited amount of data. Beautiful Soup uses the HTML DOM to access the entities present in the page, and this is used to collect the tweet metadata. The result is stored in JSON format and is then accessed by the inference module.
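A simplified sketch of this scroll-and-parse approach is given below; the URL, CSS selector and file name are placeholders, since the original implementation is not reproduced here and Twitter's markup changes frequently.

# Simplified scroll-and-scrape sketch (placeholder URL and selector): a background
# browser scrolls to load more posts, then Beautiful Soup parses the HTML.
import json
import time
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Firefox()                         # browser instance in the background
driver.get("https://example.com/search?q=Pulwama")   # placeholder search URL

for _ in range(3):                                   # "fast search": three levels of scrolling
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)                                    # wait for new posts to load

soup = BeautifulSoup(driver.page_source, "html.parser")
driver.quit()

# Placeholder selector: collect visible post text and store it as JSON.
posts = [p.get_text(strip=True) for p in soup.select("div.post-text")]
with open("tweets.json", "w", encoding="utf-8") as fh:
    json.dump(posts, fh, ensure_ascii=False, indent=2)

The Reddit scraper described next can reuse the same pattern with a different search URL and selectors.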
Similar to the Twitter scraper module, this module generates metadata from Reddit posts related to a particular keyword and stores it in the same format as the Twitter results. This is then accessed by the inference module.
The Tumblr image collector sends requests to download images from URLs collected from the web page using Beautiful Soup. These images are then indexed and stored locally for the user to access. This can be useful for accumulating a dataset for a given problem.
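A hedged sketch of this download-and-index step is shown below; the image URLs are placeholders standing in for links already extracted with Beautiful Soup.

# Sketch of the download-and-index step (placeholder URLs assumed to have been
# scraped already): each image is fetched and stored with a numeric index.
import os
import requests

image_urls = [
    "https://example.com/img/0001.jpg",   # placeholder URLs
    "https://example.com/img/0002.jpg",
]

os.makedirs("images", exist_ok=True)
for index, url in enumerate(image_urls):
    response = requests.get(url, timeout=10)
    if response.ok:
        with open(os.path.join("images", f"{index}.jpg"), "wb") as fh:
            fh.write(response.content)    # store locally, indexed by position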
1. Volume of data
(a) Gives insight into the popularity of a topic on different social media
(b) Gives the number of Tweets, Reddit posts and images scraped by the system
2. Sentiment analysis
(a) Gives the average sentiment value of all the Tweets and Reddit posts
(b) Gives information on the impact of a term on social media
The proposed tool NoRegINT was experimented with for the keyword 'Pulwama'. Figure 2 presents the CLI module used to obtain the keyword input from the user.
After obtaining the keywords from the user, the scraping process begins with a new instance of a Selenium browser object and the constructed URL. Figure 3 illustrates the automatic loading of posts for scraping, and Figs. 4 and 5 show the JSON files generated by the Reddit and Twitter scrapers, respectively.
Figure 6 describes the comprehensive results achieved using the tool. About 99 tweets, 100 Reddit posts and two Tumblr photos were collected in the fast search method (three levels of scrolling). The tool uses the VADER sentiment analysis package, in which the interval from −0.05 to +0.05 is considered neutral sentiment, the interval from +0.05 to +1 represents positive sentiment, and the interval from −1 to −0.05 is treated as negative sentiment. The tool obtained an average sentiment of −0.28 on the Twitter-scraped data and −0.16 on the Reddit data. This sentiment analysis, performed on the data collected by the NoRegINT tool, shows that recent posts about the term and its hashtag have been negative on average.
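The averaging step can be reproduced roughly as in the sketch below, using the same VADER package and thresholds; the sample posts are invented.

# Average-compound-sentiment sketch with VADER; the sample posts are invented.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

posts = [
    "Heartbroken by the news, thoughts with the families.",
    "Strongly condemn this cowardly attack.",
    "Relief efforts are being organised quickly.",
]

analyzer = SentimentIntensityAnalyzer()
scores = [analyzer.polarity_scores(p)["compound"] for p in posts]
average = sum(scores) / len(scores)

# Same thresholds as in the text: <= -0.05 negative, >= +0.05 positive, else neutral.
label = "negative" if average <= -0.05 else "positive" if average >= 0.05 else "neutral"
print(f"average compound = {average:.2f} ({label})")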
Figure 7 details the percentage of sentiment-bearing tweets in the scraped repository built by the NoRegINT tool.
The comparison has been carried out using standard features to determine the performance of the APIs and the OSINT tools. Posted content cannot be obtained in Maltego or Spiderfoot, whereas it is scraped and stored by NoRegINT in JSON format. The Reddit wrapper, the Twitter API and Maltego restrict the amount of data scraped (around 3200 tweets, for example), while NoRegINT does not restrict the amount of content retrieved from social media. The APIs also have a 7-day range limit, while NoRegINT can gather older information from social media posts. None of the tools/APIs in question performs sentiment analysis on the gathered data, while NoRegINT performs sentiment analysis, giving the average compound value and a graph depicting the percentage of sentiment-bearing tweets. None of the tools has built-in inferencing methods, while NoRegINT can report the number of tweets, posts and photos scraped from Twitter, Reddit and Tumblr, respectively (Fig. 8).
5 Conclusion
The authors of this paper have addressed problems such as the restriction on the volume of data, the 7-day limit on content fetching and the lack of built-in inferencing in the existing APIs and OSINT tools. The proposed tool, NoRegINT, overcomes these problems through features such as automatic scrolling and sentiment analysis. The scrolling facilitates unrestricted fetching of data, through which the authors were able to build a complete, restriction-free repository. The system is versatile in its keyword input, and the tool can automatically generate a sentiment review of the keyword entered. The tool can be further developed to provide more functionality, such as analysing streaming posts and photos, and can also be extended to other popular or growing social media websites.
References
1. Lee S, Shon T (2016) Open source intelligence base cyber threat inspection framework for
critical infrastructures. In: 2016 future technologies conference (FTC). IEEE, pp 1030–1033
2. Best C (2012) OSINT, the Internet and Privacy. In: EISIC, p 4
3. Noubours S, Pritzkau A, Schade U (2013) NLP as an essential ingredient of effective OSINT
frameworks. In: 2013 military communications and information systems conference. IEEE,
pp 1–7
4. Sir David Omand JB (2012) Introducing social media intelligence. Intell Natl Secur 801–823
5. Steele RD (2010) Human intelligence: all humans, all minds, all the time
6. Bacastow TS, Bellafiore D (2009) Redefining geospatial intelligence. Am Intell J 27(1):38–40.
Best C (n.d.) Open source intelligence. (T.R.I.O), T. R. (2017). Background/OSINT. Retrieved
23 Apr 2018, from https://www.trioinvestigations.ca/background-osint
7. Christopher Andrew RJ (2009) Secret intelligence: a reader. Routledge Taylor & Francis Group,
London
8. Garzia F, Cusani R, Borghini F, Saltini B, Lombardi M, Ramalingam S (2018) Perceived
risk assessment through open-source intelligent techniques for opinion mining and sentiment
analysis: the case study of the Papal Basilica and sacred convent of Saint Francis in Assisi,
Italy. In: 2018 International Carnahan conference on security technology (ICCST). IEEE, pp
1–5
9. Michael Glassman MJ (2012) Intelligence in the internet age: the emergence and evolution of
Open Source Intelligence (OSINT). Comput Hum Behav 28(2):673–682
10. Stottlemyre SA (2015) HUMINT, OSINT, or Something New? Defining crowdsourced
intelligence. Int J Intell Counter Intell 578–589
11. Gasper Hribar IP (2014) OSINT: a “Grey Zone”? Int J Intell Counter Intell 529–549
12. Intelligence Community Directive Number 301 (2006) National Open Source Enterprise, 11 July 2006
13. Neri F, Geraci P (2009) Mining textual data to boost information access in OSINT. In: 2009
13th International conference information visualisation, pp 427–432. IEEE
14. Pietro GD, Aliprandi C, De Luca AE, Raffaelli M, Soru T (2014) Semantic crawling: an
approach based on named entity recognition. In: 2014 IEEE/ACM International conference on
advances in social networks analysis and mining (ASONAM 2014). IEEE, pp 695–699
Author Index
A D
Abhinay, K., 529 Dalin, G., 15
Adhikari, Surabhi, 39 De, Debashis, 333
Agrawal, Jitendra, 157 Dilhani, M. H. M. R. S., 647
Ahuja, Sparsh, 751 Dilum Bandara, H. M. N., 567
Aiswaryadevi, V. J., 925 Dushyanth Reddy, B., 851
Aleksanyan, G. K., 729
Aravind, A., 529
Arif Hassan, Md, 869 E
Arun Kumar, B. R., 957 Eybers, Sunet, 379
Atul Shrinath, B., 271
Ayyasamy, A., 127
G
Gaba, Anubhav, 39
B Ganesh Babu, C., 445, 481
Bains, Inderpreet Singh, 113 Gautam, Shivani, 285
BalaSubramanya, K., 235 Ghosh, Atonu, 333
Bansal, Nayan, 39 Gokul Kumar, S., 445
Baranidharan, V., 365 Gorbatenko, N. I., 729
Basnet, Vishisth, 751 Gour, Avinash, 305
Bawm, Rose Mary, 883 Gouthaman, P., 763, 781
Behera, Anama Charan, 739 Graceline Jasmine, S., 189
Behera, Bibhu Santosh, 739 Gupta, Akshay Ramesh Bhai, 157
Behera, Rahul Dev, 739 Gupta, Anil, 203
Behera, Rudra Ashish, 739 Gupta, Anmol, 763
Bhalaji, N., 971 Gupta, Sachin, 315
Bhati, Amit, 53 Guttikonda, Geeta, 103
Bhattacharya, Debadyuti, 971
H
C Haldorai, Anandakumar, 851
Channabasamma, 395 Harish, Ratnala Venkata Siva, 349
Chhajer, Akshat, 781 Hettige, Budditha, 691
Chile, R. H., 703 Hettikankanama, H. K. S. K., 601
Chithra, S., 971 Hiremath, Iresh, 405
Chopade, Nilkanth B., 911 Hoang, Vinh Truong, 299
Chung, Yun Koo, 551 Hossain, Sohrab, 883
I M
Indirani, M., 831 Mahaveerakannan, R., 1, 813
Ishi, Manoj S., 143 Maheshwari, Vikas, 305
Majumder, Koushik, 333
Malipatil, Somashekhar, 305
J Manickavasagam, L., 271
Jagtap, Swati, 911 Manjunathan, A., 481
Jahnavi, Ambati, 851 Manusha Reddy, A., 395
Jain, Rachna, 677 Marathe, Amit, 221
Jani Anbarasi, L., 189 Maria Priscilla, G., 795
Jawahar, Malathy, 189 Maruthi Shankar, B., 445
Jeeva Padmini, K. V., 567 Mathankumar, M., 481
Jeyaboopathiraja, J., 795 Matta, Priya, 751
Jha, Aayush, 39 Mehta, Gaurav, 285
Jha, Avinash Kumar, 39 Mishra, Bharat, 421
Jinarajadasa, G. M., 719 Mittra, Tanni, 883
John Aravindhar, D., 463 Mohanrajan, S. R., 271
John Deva Prasanna, D. S., 463 Mohanty, Prarthana, 739
Joshi, Abhijit R., 221 Muqith, Munim Bin, 897
Joshi, Shashank Karthik D., 235
Jotheeswar Raghava, E., 529
Jude, Hemanth, 677 N
Nagalakshmi, Malathy, 859
Nagrath, Preeti, 677
Nair, Jayashree, 173
K
Narendra, Modigari, 189
Kailasam, Siddharth, 405
Nataraj, N., 925
Kamrul Hasan, Mohammad, 869
Naveen, K. M., 365
Karthika, S., 971
Nayar, Nandini, 285
Karthikeyan, M. M., 15
Niloy, Md. Dilshad Kabir, 897
Karthik, S., 247
Nithish Sriman, K. P., 365
Karthik, V., 189
Karunananda, Asoka S., 583, 691
Katamaneni, Madhavi, 103 O
Katsupeev, A. A., 729 Olana, Mosisa Dessalegn, 551
Kikkuri, Vamsi Krishna, 173
Kiruthika, S., 925
Kombarova, E. O., 729 P
Kommineni, Madhuri, 851 Pandala, Madhavi Latha, 103
Koppar, Anant, 405 Pandey, Sanidhya, 763
Kousik, N. V., 813 Pant, Bhasker, 751
Krishanth, N., 271 Parveen, Suraiya, 259
Krishna, Harsh, 763 Patidar, Sanjay, 113
Kumara, Kudabadu J. C., 647, 665 Patil, Ajay B., 703
Kumar, Anuj, 537 Patil, Annapurna P., 513
Kumar, N. S., 859 Patil, J. B., 143
Pavan Karthik, D. V. S., 945
Pavel, Monirul Islam, 897
L Perera, G. I. U. S., 567
Li, Hengjian, 633 Pramanik, Subham, 763
Lima, Farzana Firoz, 883 Pranavanand, S., 945
Lingaraj, N., 247 Prathap, R., 365
Litvyak, R. K., 729 Praveen Kumar, N., 365
Liyanage, S. R., 719 Premjith, B., 81
S U
Sabena, S., 127 Udhayanan, S., 481
Sachdeva, Ritu, 315 Uma, J., 1
Sai Aparna, T., 81
Saini, Dharmender, 677
Sai Ramesh, L., 127, 435 V
Saranya, M. D., 247 Varun, M., 405
Sarath Kumar, R., 445, 481 Vasantha, Bhavani, 851
Sarma, Dhiman, 883 Vasanthapriyan, Shanmuganathan, 601
Sarwar, Tawsif, 883 Vasundhara, 259
Satheesh Kumar, S., 247 Vemuri, Pavan, 173
Selvakumar, K., 435 Venba, R., 189
Sengupta, Katha, 897 Vidanagama, Dushyanthi, 583
Setsabi, Naomi, 379 Vidya, G., 63
Shalini, S., 513 Vikas, B., 235
Shankar, S., 831 Vivekanandan, P., 1
Shanthini, M., 63
Sharma, Nitika, 677
Sharma, Tanya, 859 W
Shrinivas, S., 235 Wagarachchi, N. M., 665
Shukur, Zarina, 869 Wang, Xiyu, 633
Shwetha, N., 497
Shyamali Dilhani, M. H. M. R., 665
Silva, R. K. Omega H., 567 Y
Silva, Thushari, 583 Yashodhara, P. H. A. H. K., 615
Simran, K., 81 Yuvaraj, N., 813
Singh, Ashutosh Kumar, 421
Singh, Bhavesh, 221 Z
Singh, Poonam, 285 Zhao, Baohua, 633
Sivaram, M., 813