95 Submission-2
Department of Artificial Intelligence and Data Science, Panimalar Engineering College, Chennai, Tamil Nadu, India
Abstract: The surge in criminal cases in India has led to a growing number of unresolved incidents, posing a significant challenge for law enforcement agencies. Understanding and categorizing criminal activity trends is crucial for effective prevention. Despite the prevalence of machine learning and deep learning techniques, law enforcement struggles to promptly address and prevent crimes. This paper explores the application of various machine learning methods in predicting and analyzing crime data. The primary objective is to underscore the efficacy of machine learning in forecasting violent crimes within specific regions, enabling law enforcement to proactively reduce crime rates. The study focuses on refining the accuracy of prediction models through data processing, utilizing algorithms such as Logistic Regression, Decision Tree classification, KNN and Random Forest classification. By training these models with a comprehensive dataset and validating them with test data, the research aims to provide law enforcement agencies with effective tools for crime prevention in Indian society. The goal of the project is to improve prediction model accuracy by means of data processing and techniques. In order to give law enforcement officials useful tools for preventing crime in Indian society, the research trains these models using an extensive dataset and validates them with test data.

Keywords: Machine Learning, Decision Trees, Logistic Regression, Random Forest

1. INTRODUCTION

In recent years, numerous crimes have been reported worldwide. Threatening to use force against the victim is a serious offense known as violence. It includes crimes in which the motive is a violent act, such as murder, rape, or robbery, as well as crimes in which force is used as a form of coercion. Crimes involving violence can vary from homicide to harassment, and they may not always be initiated with a weapon, depending on the jurisdiction. Violent crimes often include murders, robberies, rapes, attempted murders, kidnappings, thefts, riots, dowry deaths, dowry atrocities, etc. According to the actual statistics on the number of murders, 17 of the poorest counties were found to be in Maharashtra, and the next two poorest counties, Andhra Pradesh and Rajasthan, were found to have the largest concentrations of violent crime districts. Violent crimes comprise the bulk of crimes committed against women. Opposing public order as a crime was reported in 13 districts in Rajasthan, 12 districts in Bihar, and 11 districts in Tamil Nadu, out of a total of 100 districts [6].

Analyzing time-series data is the main objective of time series analysis. It is used mainly to forecast future values from previously observed data. The growing population and continuous urbanization are contributing to an increase in violent crime and accidents. Utilizing Big Data Analysis (BDA) to mine these massive data sets for insights and to examine hidden patterns and their interrelationships is one of the newest trends in data analysis [1], [2]. Since crime is one of the most prevalent and worrisome issues in our society, it is imperative that it be prevented [5]. Crime analysis and prediction, the scientific examination and recognition of correlations, patterns, and trends in crime, are methods used to combat disorder and crime [3]. Although crimes happen everywhere, most occur in cities. Crime affects basic human rights and leads to the collapse of society's social structure. It is not always possible for law enforcement officers to manually identify
areas with a high crime rate during a specific period because criminal motivations are dynamic and because current crime statistics are not being used properly.

Furthermore, there is ongoing debate on the need for software tools and systems to support criminal investigation and prevention efforts based on textual content shared online. This issue is intended to be handled by applications such as hate speech detection, crime pattern modeling, crime prediction, identification of topics relevant to crime, etc. [10], [11], [12]. To combat and prevent crimes, it is still crucial to utilize more efficient methods for analyzing the criminal content of computer-mediated communication (CMC). One of the hardest and most successful techniques for detecting and predicting criminal behavior is crime-related event extraction (CREE) [13], [14].

While structured data can be mined for patterns and trends using traditional methods, modern data mining can extract information from both structured and unstructured data because of the progress in technology. Data mining helps criminal investigators analyze the massive volume of data more precisely and efficiently [12]. The advent of big data has enabled social science research and development to better understand how social phenomena and human behavior are related.

The research suggests a smart police method that uses machine learning to predict the kinds of crimes and their risks in response to these difficulties. This method presents an ensemble model that combines bagging and stacking approaches, and it uses text-based crime summaries. By merging several classifiers, this ensemble method, known as the Supervised Based Crime Prediction Method (SBCPM), aims to improve prediction efficacy. The study's conclusion highlights the importance of this novel strategy by demonstrating the potential of ensemble learning approaches in crime prediction and prevention.

2. LITERATURE SURVEY

The first group makes use of techniques for pattern-based event extraction (EE). Riloff [15] initially suggested such a method in 1993 to extract terrorist occurrences from texts that were peculiar to the domain. There are currently quite a few pattern-based EE systems available for extracting different types of events, each of which is domain-specific [16], [18]. Our research is very interested in the most recent developments in EE studies related to terrorism and criminal activity.

Several techniques for forecasting crime data have been presented in recent research. After comparing linear regression, additive regression, and decision trees, Lawrence McClendon and Natarajan Meghan concluded that linear regression was the most accurate and effective method for predicting crime data based on the training set input for all three algorithms [7].

According to a survey [17], crime forecasting will be improved by employing effective data gathering and mining techniques to produce a better crime prediction and by applying knowledgeable learning to construct numerous models for a single problem. Using a single classification and prediction output, which also specifies the time of year and season, it is possible to forecast what the next crime in a certain neighborhood would be. There is a time dimension where crime happens more frequently.

Babakura et al. compared Naive Bayesian and Back Propagation to predict the crime data and classified it into low, medium, and high levels of crime rate. Precision, recall, and accuracy were also calculated. When WEKA was used to evaluate data categorization on a crime dataset, the Naive Bayesian classifier was found to perform better [9].

K-means clustering, hierarchical clustering, and DBSCAN clustering were used by Sivaranjani et al. to analyze crime data from cities in the Indian state of Tamil Nadu.
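As a rough illustration of the first of these clustering techniques, the sketch below runs plain K-means (Lloyd's algorithm) on a handful of made-up incident coordinates. The points, the choice of k = 2, and the function name `kmeans` are assumptions for illustration only; they are not taken from Sivaranjani et al. or from any real crime dataset.

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid closest to the given 2-D point."""
    return min(range(len(centroids)),
               key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    (sum(x for x, _ in members) / len(members),
                     sum(y for _, y in members) / len(members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # centroids stopped moving
        centroids = new_centroids
    # Final assignment so labels match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical incident locations (arbitrary units) in two loose groups.
incidents = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
             (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
centroids, labels = kmeans(incidents, k=2)
print(labels)
```

On real data the number of clusters would have to be chosen, for example by comparing results across several values of k, and density-based methods such as DBSCAN avoid fixing k in advance.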
Based on various geographic data, Liao et al. proposed a unique approach for Bayesian learning-based crime prediction [23].

In order to identify cyber security incidents from a brief, noisy text, Yagcioglu et al. [19] used a long short-term memory (LSTM) recurrent neural network in conjunction with CNN. For deep learning in non-Euclidean domains, graph neural networks (GNN) employ many neurons acting on a graph structure. As a result, in [21], the authors suggested using attention-based graph convolutional networks to jointly extract numerous event triggers and arguments. Liu et al. [22] used attention mechanisms to direct a neural model to treat each component of the input differently, according to its significance to the EE task.

In one borough of a city, Xiangyu et al. made use of dynamic trends in urban data, and they subsequently used transfer learning techniques to improve the crime prediction in other provinces. These variables are combined, and spatio-temporal patterns are modelled for crime prediction using a unique transfer learning framework [25]. A fuzzy-based technique applied to Indian number plates attained 95% accuracy [24]. Images that were blurred, rather than tilted and noisy, were used. The authors additionally advanced the idea of identifying cars involved in crime prediction by proposing a second model for Myanmar numbers and characters [26].

To undertake a crime analysis and use the data to construct a crime matching procedure, Keyvanpour et al. presented the SOM clustering approach [30]. Hosseinkhani et al. presented a review to identify crime hotspots, forecast crime patterns, and extract usable information through data mining [31]. According to a framework proposed by Khalid et al. [32], criminal cases are mapped to sociological theories based on certain features. In his discussion of the integration of big data and sociology for analysis, Christopher additionally highlighted several potential difficulties [33]. In [34], the author applied a crime forecasting model and used the data mining classification approach to carry out a comparative study to find the best method for predicting crime hotspots. Saltos et al. used the CRISP-DM data mining approach, which contains six steps. For the experimental purpose, three algorithms from the three phases' categories are used, and evaluation is done using evaluation metrics. Experiments are being conducted to examine and assess crime frequency prediction [35].

3. PROPOSED SYSTEM

The proposed crime data analysis system integrates information from law enforcement databases, emergency calls, surveillance, and community reports. It makes use of text mining, machine learning algorithms, and geographic analysis to anticipate and classify crimes, spot trends, and keep an eye on social media for possible criminal activity. Platforms that promote citizen reporting help to foster community involvement. Based on anticipated crime hotspots, dynamic resource allocation maximizes law enforcement efforts. Tools for visualization support interpretation by highlighting ethical and privacy issues. Ongoing improvements are driven by feedback loops and continuous review, which ensures flexibility to changing crime trends and technical breakthroughs. The system is a flexible, technologically advanced strategy for improving law enforcement and preventing crime.

It is impossible to give an exact accuracy rate without knowing specifics about the machine learning algorithms, training set, and assessment metrics utilized in the suggested crime data analysis system. Thorough testing and validation would be required to ascertain the accuracy of the system; this usually involves using previous data that was not utilized for training. The effectiveness of the system in predicting different kinds of crimes would be evaluated using evaluation measures like precision, recall, and F1 score. The efficacy of these elements and how well they fit the ever-changing landscape of criminal activity will determine the actual accuracy rate.

elements, particularly for models with linear foundations. When highly correlated features are present, a relatively tiny value for the coefficient of some significant features will be assigned.
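The evaluation measures named for the proposed system (precision, recall, and F1 score, computed on data held out from training) can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual pipeline: the crime categories, the true and predicted labels, and the function name `precision_recall_f1` are all hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: the class was predicted and was correct.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: the class was predicted but something else happened.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: the class happened but something else was predicted.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical held-out labels: recorded crime type vs. a model's prediction.
y_true = ["theft", "assault", "theft", "robbery", "theft", "assault"]
y_pred = ["theft", "theft", "theft", "robbery", "assault", "assault"]

for crime in ("theft", "assault", "robbery"):
    p, r, f = precision_recall_f1(y_true, y_pred, positive=crime)
    print(f"{crime}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Reporting the three measures per crime category, rather than a single accuracy figure, shows whether a model is missing a rare category even when its overall accuracy looks high.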
c. Data Mining (WEKA)