CRIME ANALYSIS USING MACHINE LEARNING

A Term paper report submitted in partial fulfillment of the requirement for the Award of degree

BACHELOR OF TECHNOLOGY
in
CSE-ARTIFICIAL INTELLIGENCE & MACHINE LEARNING
Submitted
By

G. SASI VARDHAN
21341A4216
Under the esteemed guidance of

Ms. N. Krishnaveni
Assistant Professor,
Dept. of CSE-Artificial Intelligence & Machine Learning

GMR Institute of Technology


An Autonomous Institute Affiliated to JNTU-GV, Vizianagaram
(Accredited by NBA, NAAC with ‘A’ Grade & ISO 9001:2015 Certified Institution)
GMR Nagar, Rajam – 532127
Andhra Pradesh, India
December 2023
DEPARTMENT OF CSE-ARTIFICIAL INTELLIGENCE & MACHINE LEARNING

CERTIFICATE
This is to certify that the term paper report titled “CRIME ANALYSIS USING MACHINE LEARNING”,
submitted by G. Sasi Vardhan bearing Reg. No: 21341A4216 in partial fulfillment of the requirements
for the award of Bachelor of Technology in CSE-Artificial Intelligence & Machine Learning of
GMRIT, Rajam, affiliated to JNTU-GV, Vizianagaram, is a record of bonafide work carried out by the candidate
under my guidance & supervision. The results embodied in this report have not been submitted to any other
University or Institute for the award of any degree.

Signature of the Supervisor Signature of the H.O.D

Ms. N. Krishnaveni Dr. K Srividya,


Assistant Professor Associate Professor and Head
Department of CSE-AI&ML Department of CSE-AI&ML
GMRIT, Rajam. GMRIT, Rajam.

The report is submitted for the viva-voce examination held on ………………..

Signature of Internal Examiner Signature of External Examiner

ACKNOWLEDGEMENT

It gives me immense pleasure to express my deep sense of gratitude to my guide, Ms. N.
Krishnaveni, Assistant Professor, Department of CSE-Artificial Intelligence & Machine Learning, for her
wholehearted and invaluable guidance throughout the report. Without her sustained and sincere effort,
this report would not have taken this shape. She encouraged and helped me to overcome various difficulties
that I faced at various stages of my report.

I would like to sincerely thank Dr. K Srividya, Associate Professor & HOD, Department of
CSE-Artificial Intelligence & Machine Learning, for providing all the necessary facilities that led to the
successful completion of my report.

I take the privilege to thank our Principal, Dr. C.L.V.R.S.V. Prasad, who has made the
atmosphere so easy to work in. I shall always be indebted to them.

I would like to thank all the faculty members of the Department of CSE-Artificial Intelligence
& Machine Learning for their direct or indirect support, and also all the lab technicians for their valuable
suggestions and for providing excellent opportunities for the completion of this report.

G. Sasi Vardhan
21341A4216

ABSTRACT

Crime and violation are major threats to justice, and it is important to have effective ways to control them.
An advanced crime prediction model is required to know when and where crimes might happen. This helps
the police department to prevent crimes early and to find the areas where crimes occur frequently.
Accurate crime prediction is essential for improving the safety and security of the public. Estimating crime
rates, types of crime, and hotspots from historical patterns presents various computational challenges and
opportunities. Data processing methods and access to large databases make crime
analysis easier and more effective. Ensemble learning methods, such as stacked generalization, have been
shown to be more reliable than single classifiers in crime prediction. Crime prediction can be achieved using
various machine learning algorithms, including logistic regression, support vector machine (SVM),
k-nearest neighbors (KNN), k-means clustering, decision tree, random forest, and eXtreme Gradient Boosting
(XGBoost), along with time series analysis using Long Short-Term Memory (LSTM) networks, which operate
on sequences of data.

Keywords: Crime prediction, machine learning, support vector machine (SVM), k-nearest neighbor
(KNN), decision tree.

INDEX
CONTENTS

CERTIFICATE

ACKNOWLEDGEMENT

ABSTRACT

Chapter 1. INTRODUCTION

1.1 Problem Statement

1.2 Motivation

1.3 Challenges

1.4 Advantages

Chapter 2. LITERATURE SURVEY

2.1 Comparison Table

2.2 Diagrammatical Representation

Chapter 3. DESIGN

Chapter 4. METHODOLOGY

4.1 Stacking-Based Crime Prediction Method (SBCPM)

4.2 K-Means Clustering

4.3 Naïve Bayes

4.4 Support Vector Machine (SVM)

4.5 Linear Regression

Chapter 5. CASE STUDIES

Chapter 6. RESULTS & DISCUSSION

Chapter 7. CONCLUSION

REFERENCES
CRIME ANALYSIS USING MACHINE LEARNING 2023

1 INTRODUCTION
Crime prediction is an important area of research that aims to make our communities safer and
prevent crimes from happening. It uses new and smart methods to understand and deal with
crimes. Instead of doing everything by hand, which can be very hard, we use computers and large databases
of information about crimes. This helps us to predict two things: where crimes are likely to happen and
when they are likely to happen most often. Data processing methods and access to large databases
make crime analysis easier and more effective.

Crime analysis with machine learning is a high-tech way for police to tackle crime. Computers learn
from past crime data to spot patterns and predict future incidents. This helps law enforcement be proactive
and prevent crimes before they happen. It's like a digital detective that assists police in understanding and
addressing criminal trends. By harnessing the power of technology, we aim to enhance public safety and
make communities more secure. Machine learning algorithms analyze vast amounts of data to identify
potential hotspots and criminal behavior. This approach empowers law enforcement with valuable insights,
optimizing their efforts. Ultimately, it's about using advanced tools to stay one step ahead in the ongoing
battle against crime.

Ensemble learning methods, such as stacking-based crime prediction, have been shown to be more
reliable and accurate than individual classifiers. Ensemble methods aggregate the predictions of multiple
classifiers, which helps in reducing bias and variance in crime prediction models. The collaborative
decision-making mechanism of ensemble learning allows for the identification of the most appropriate
predictions of crime by combining the strengths of different machine learning algorithms.
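
The aggregation step described above can be illustrated with the simplest possible combiner, a majority vote. This is only a minimal sketch: the three prediction lists and their names (svm, knn, tree) are made-up placeholders, and a plain vote is a simpler stand-in for the stacking approach discussed later in this report.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists (all the same length) by majority vote.

    Ties are resolved in favour of the label that appears first among the votes.
    """
    combined = []
    for labels in zip(*predictions):  # one tuple of votes per sample
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined

# Hypothetical outputs of three classifiers for three incidents.
svm = ["theft", "assault", "theft"]
knn = ["theft", "theft", "burglary"]
tree = ["assault", "assault", "theft"]

ensemble = majority_vote([svm, knn, tree])
print(ensemble)
```

Each individual classifier disagrees with the others on at least one incident, while the combined prediction follows whichever label most of them agree on.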

Ensemble learning helps in handling the dynamic nature of crimes by considering multiple
perspectives and capturing diverse patterns in crime data. The stacking ensemble model, used in the
proposed SBCPM method, has been found to have higher prediction accuracy compared to individual
classifiers, indicating the significance of ensemble learning in crime prediction.

Crime predictions are generally made with machine learning techniques that estimate how likely
future violence is in crimes. Such research has been carried out for many years, but
with a limited set of algorithms and small datasets. This research claims its novelty through an empirical
analysis of machine learning algorithms and the other contributions described in this section. Although machine
learning models are widely used in crime prediction, and despite their expanding application and considerable
potential, there are still many areas in which newer artificial intelligence techniques have not
been completely explored, and existing approaches have major drawbacks. The approaches most commonly
reported to achieve good accuracy are the Random Tree algorithm, K-Nearest
Neighbor (KNN), Bayesian models, Support Vector Machine (SVM), and Neural Networks. Building on these
algorithms, a crime prediction ensemble model is proposed that integrates several of them
using bagging and stacked ensemble techniques. An ensemble model is a method for constructing a
predictive model by combining multiple models to solve a single problem and improve predictive efficiency.

Ensemble learning methods, such as stacked generalization, have been shown to be more reliable
than single classifiers in crime prediction. The method discussed in this paper is an ensemble-based crime
prediction method, which uses Support Vector Machine (SVM) algorithms, stacked generalization, and
various other algorithms to achieve accurate predictions. Such models can reveal, for example, that certain
types of crime tend to occur in specific areas at particular times of the day or week. By understanding these
patterns, police officers can work smarter. They can send more officers to places where crimes are more likely to
happen, or they can be extra careful during the times when crimes usually occur. This approach can lead to
a reduction in crime rates and improved public safety.

2 LITERATURE SURVEY
Kshatri, S. S., Singh, D., Narain, B., Bhatia, S., Quasim, M. T., & Sinha, G. R. (2021). An empirical
analysis of machine learning algorithms for crime prediction using stacked generalization: an
ensemble approach. IEEE Access, 9, 67488-67500.
The research paper discusses the use of ensemble learning methods for crime prediction using
stacked generalization. It highlights that ensemble classifiers are more reliable than single classifiers. The
paper proposes an efficient crime prediction method called the Stacking Based Crime Prediction
Method (SBCPM), which is based on SVM algorithms and stacked generalization. It compares the
performance of the SBCPM model with other machine learning models. The results show that the SBCPM
model achieved a classification accuracy of 99.5% on testing data, outperforming the other models.

Sravani, T., & Suguna, M. R. (2022, February). Comparative Analysis of Crime Hotspot Detection
And Prediction Using Convolutional Neural Network Over Support Vector Machine with Engineered
Spatial Features Towards Increase in Classifier Accuracy. In 2022 International Conference on
Business Analytics for Technology and Security (ICBATS) (pp. 1-5). IEEE.
This paper compares two classification algorithms, Support Vector Machine (SVM) and
Convolutional Neural Network (CNN), for crime prediction.
The SVM algorithm achieves an accuracy of 94.01% for predicting the type of crime, while the CNN
algorithm achieves an accuracy of 79.98%. The paper also reports mean squared error (MSE) values:
0.0823 for SVM and 0.4770 for CNN. The crime dataset is collected from the Montgomery Police
Department's database, which contains data on crime incidents from 2016 to 2019.

Bonam, J., Burra, L. R., Susheel, G. S. V. N. S., Narendra, K., Sandeep, M., & Nagamani, G. (2023,
July). Crime Hotspot Detection using Optimized K-means Clustering and Machine Learning
Techniques. In 2023 4th International Conference on Electronics and Sustainable Communication
Systems (ICESC) (pp. 787-792). IEEE.

The paper focuses on crime hotspot detection using optimized K-means clustering and machine
learning techniques. The dataset used for analysis is the UCI Crime and Communities
Dataset, obtained from Kaggle. The system uses optimized K-means clustering to explore
the dataset and analyze crimes. Three classification algorithms, namely Decision Tree, Support
Vector Machine, and Random Forest, are included in the model, and the paper compares their
crime prediction accuracy. The Decision Tree classifier achieves an accuracy of 85.55%, the
Support Vector Machine 84.06%, and the Random Forest 88.08%.
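
To make the clustering step concrete, the following is a minimal sketch of plain K-means (not the optimized variant used in the cited paper). The incident coordinates and the starting centroids are invented for illustration, and straight-line distance on raw coordinates is an assumption that is only reasonable over small areas.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain K-means: assign each point to its nearest centroid, then recompute centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids

# Hypothetical incident locations forming two groups.
incidents = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(incidents, centroids=[(0, 0), (10, 10)])
print(labels, centers)
```

The resulting centroids can be read as candidate hotspot centres, and the labels can be passed to a classifier as an extra feature.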

Vinothkumar, K., Ranjith, K. S., Vikram, R. R., Mekala, N., Reshma, R., & Sasirekha, S. P. (2023,
March). Crime Hotspot Identification using SVM in Machine Learning. In 2023 International
Conference on Sustainable Computing and Data Communication Systems (ICSCDS) (pp. 366-369).
IEEE.

The research aimed to improve the accuracy of crime prediction and recognition, ultimately
lowering the crime rate in Chicago. The paper discusses the use of Support Vector Machine (SVM)
algorithm in machine learning for crime hotspot identification. This paper mentions the comparison of
different machine learning methods, including KNN and SVM, for crime prediction on the Chicago dataset.
Data visualization techniques, including bar charts, line charts, and heat maps, are used to analyze and
understand the crime dataset. The SVM algorithm achieved the highest accuracy of 97.64% in crime hotspot
prediction, with a precision of 98.4%, a recall of 96.35%, and an F1-score of 97.39%.

Kanimozhi, N., Keerthana, N. V., Pavithra, G. S., Ranjitha, G., & Yuvarani, S. (2021, March).
CRIME type and occurrence prediction using machine learning algorithm. In 2021 International
conference on artificial intelligence and smart systems (ICAIS) (pp. 266-273). IEEE.

The paper focuses on crime pattern analysis using machine learning algorithms, specifically Naïve
Bayes, to classify different crime patterns and predict the most recently occurring crimes. The study utilizes
crime data obtained from Kaggle open source to estimate the type of crime, time period, and location where
it has occurred. The data is pre-processed before applying the Naive Bayes algorithm to analyze the
independent feature effects for crime prediction. The paper reports that the accuracy achieved in classifying
different crime patterns using the Naïve Bayes algorithm is high. The study emphasizes the importance of
understanding crime patterns in order to analyze and respond to criminal activities.

Chahal, J. K., & Sharma, A. (2021, December). Improving Accuracy of crime data using K-Means
and Decision Tree Techniques. In 2021 IEEE International Conference on Technology, Research,
and Innovation for Betterment of Society (TRIBES) (pp. 1-4). IEEE.

The paper compares the performance of different classification and clustering techniques,
specifically K-means clustering and Decision tree classification, on crime data to achieve higher accuracy.
The combination of K-means clustering and Decision tree classification provides more accurate results
compared to using K-means clustering alone. The dataset used in the paper includes crimes from 2006 to
2012, such as Domestic Violence, Murder attempt, Child molestation, and car accidents due to rash and
careless driving. The paper highlights the importance of accuracy in crime data analysis and the need for
effective data mining techniques to handle the increasing volume of crime data.

Khatun, S., Banoth, K., Dilli, A., Kakarlapudi, S., Karrola, S. V., & Babu, G. C. (2023, March).
Machine Learning based Advanced Crime Prediction and Analysis. In 2023 International Conference
on Sustainable Computing and Data Communication Systems (ICSCDS) (pp. 90-96). IEEE.

The paper discusses the use of machine learning algorithms for crime prediction and analysis in
India, including Naive Bayes, Support Vector Machine, Linear Regression, Decision Tree,
Bagging Regression, and Random Forest Regression. It uses the Crime Analysis and Warning (CAW) dataset,
which consists of 13 columns and 18 rows, with each column representing a different type of crime. The
authors report that the Naive Bayes algorithm achieves a classification accuracy of 99.4%
on the test data.

Muñoz, V., Vallejo, M., & Aedo, J. E. (2021, August). Machine learning models for predicting crime
hotspots in medellin city. In 2021 2nd Sustainable Cities Latin America Conference (SCLA) (pp. 1-
6). IEEE.

The paper uses the crime dataset of the city of Medellin. Machine learning models, specifically
Decision Trees and Logistic Regression, were developed and evaluated for crime hotspot prediction.
The study provides essential information for authorities to plan logistical activities related to
surveillance, patrolling, and resource allocation in critical areas of the city. The paper concludes
that these models can be used to predict crime hotspots in Medellin City with high accuracy and
performance. The Decision Trees algorithm was found to be the most
appropriate model, achieving an F1-Score of 88.2%, an Accuracy of 87.6%, and a Recall of 90%.

Akil, R. M., Sarathambekai, S., Vairam, T., Krishnan, R. S., Dharaneesh, G. S., & Janarthanan, D.
(2023, March). Crime Data Analysis and Safety Recommendation System Using Machine Learning.
In 2023 9th International Conference on Advanced Computing and Communication Systems
(ICACCS) (Vol. 1, pp. 183-188). IEEE.

The paper proposes a crime data analysis and safety recommendation system using machine learning
techniques. The paper focuses on preprocessing the data and creating a separate data frame with state,
district, and cases columns. The system utilizes real-time news data to classify crimes into categories such
as drug-related crimes, violent crimes, commercial crimes, and property crimes. The data is extracted from
the NCRB (National Crime Records Bureau).

Darshan, M. S., & Shankaraiah, S. (2022, October). Crime Analysis and Prediction using Machine
Learning Algorithms. In 2022 IEEE 2nd Mysore Sub Section International Conference (MysuruCon)
(pp. 1-7). IEEE.

The paper discusses the use of machine learning algorithms for crime analysis and prediction.
Various machine learning algorithms such as logistic regression, support vector machine, Naive Bayes, and
decision tree were applied to analyze crime data. The dataset used in the study includes crime types and
occurrences from 2013 to 2021. The Naive Bayes algorithm was found to be the most appropriate model,
achieving an accuracy of 90%.

Shukla, A., Katal, A., Raghuvanshi, S., & Sharma, S. (2021, June). Criminal Combat: Crime Analysis
and Prediction Using Machine Learning. In 2021 International Conference on Intelligent
Technologies (CONIT) (pp. 1-5). IEEE.

The paper mentions the use of different classification models such as AdaBoost, Random Forest, and
K-nearest-neighbour for crime prediction based on location and time data. The authors test the model using
Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE).
Mathematical and statistical models, such as the J48 algorithm, can aid in crime analysis and
prediction, achieving a high accuracy rate of 94.25287%. The paper utilizes crime datasets from the State
of North Carolina for the purpose of crime analysis and prediction.
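
For reference, the three error measures named above can be computed as in the sketch below. The numbers are invented weekly crime counts used only to show the arithmetic; they are not taken from the cited paper.

```python
import math

def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of numbers."""
    diffs = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(d) for d in diffs) / len(diffs)   # mean absolute error
    mse = sum(d * d for d in diffs) / len(diffs)    # mean squared error
    rmse = math.sqrt(mse)                           # root mean squared error
    return mae, mse, rmse

# Hypothetical actual vs. predicted weekly incident counts.
actual = [3, 5, 2]
predicted = [2, 5, 4]
mae, mse, rmse = regression_errors(actual, predicted)
print(mae, mse, rmse)
```

MSE and RMSE penalise large errors more heavily than MAE, which is why the single two-count miss dominates them in this example.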

Rao, P. V., Sunkari, S., Raghumandala, T., Koka, G., & Rayankula, D. C. (2023, March). Prediction
of Crime Data using Machine Learning Techniques. In 2023 International Conference on Sustainable
Computing and Data Communication Systems (ICSCDS) (pp. 276-280). IEEE.

The paper explains why it's really important to have correct and current records of crimes. It shows
that sometimes, not everyone keeps these records the same way, and there might not be enough resources
to do it well. The paper utilizes different pre-processing techniques to process the datasets containing crime
data. Machine learning algorithms such as Support Vector Machine, Random Forest Classifier, Decision
Tree, and K-Means are used to process the crime data and make predictions. These algorithms are used to
determine the likelihood of a crime occurring and the nature of the crime, such as whether it will be violent
or non-violent.

Bolkiah, A. H. A. A., Hamzah, H. H., Ibrahim, Z., Diah, N. M., Sapawi, A. M., & Hanum, H. M. (2022,
December). Crime Scene Prediction Using the Integration of K-Means Clustering and Support Vector
Machine. In 2022 IEEE 10th Conference on Systems, Process & Control (ICSPC) (pp. 242-246).
IEEE.

Multiple algorithms have been used in the clustering and classification process for crime prediction,
with the Support Vector Machine algorithm and K-means clustering being recommended for classification
and clustering, respectively. A predictive model was developed using various classification techniques, such
as KNN Classification, Logistic Regression, Decision Tree, Random Forest, Support Vector Machine
(SVM), and Bayesian method, with the KNN model showing promising results in predicting crime types.
The paper used a publicly available dataset from Kaggle.com, consisting of 500 records of information on
crime incidents in San Francisco. The integration of K-Means Clustering and Support Vector Machine
(SVM) was used to predict potential crime locations, with an accuracy of 0.65.

Mandalapu, V., Elluri, L., Vyas, P., & Roy, N. (2023). Crime Prediction Using Machine Learning and
Deep Learning: A Systematic Review and Future Directions. IEEE Access.

The paper discusses the use of datasets for crime prediction, including the Chicago Crime Dataset,
London Crime Dataset, Los Angeles Crime Dataset, New York City (NYC) Crime Dataset, and Philadelphia
Crime Dataset. The paper mentions the use of machine learning algorithms to analyze crime data and
identify crime hotspots, such as convolutional neural networks (CNNs) and recurrent neural networks
(RNNs). The Stacked generalization approach achieved the highest accuracy of 99.5%. The stacked
generalization is an ensemble of decision tree (DT), random forest (RF), support vector machine (SVM),
and k-nearest neighbors (KNN) algorithms.

Yao, S., Wei, M., Yan, L., Wang, C., Dong, X., Liu, F., & Xiong, Y. (2020, August). Prediction of crime
hotspots based on spatial factors of random forest. In 2020 15th International Conference on
Computer Science & Education (ICCSE) (pp. 811-815). IEEE.

The study aims to improve crime prediction. Instead of using only past crime data, it adds
spatial information about where incidents happen to make the predictions better. The paper compares
three prediction models, namely naive Bayes, logistic regression, and random forest, for crime hotspot
prediction. The paper uses the crime dataset from San Francisco, obtained from the San Francisco open
data platform. The dataset includes 878,049 examples, each containing information such as timestamp,
crime category, longitude, latitude, and address. The random forest model predicts
crime hotspots more accurately than the other models.

2.1 Comparison Table

Reference 1 (2021)
Title: An empirical analysis of machine learning algorithms for crime prediction using stacked
generalization: an ensemble approach.
Objectives: To enhance the efficiency of crime prediction.
Limitations: The proposed method is based on SVM algorithms, which may not be the most suitable choice
for all types of crime prediction problems.
Advantages: Ensemble classifiers are more reliable than single classifiers.
Performance metrics: Accuracy: 99.5%.

Reference 2 (2022)
Title: Comparative Analysis Of Crime Hotspot Detection And Prediction Using Convolutional Neural Network
Over Support Vector Machine with Engineered Spatial Features Towards Increase in Classifier Accuracy.
Objectives: Increase accuracy of crime prediction using SVM and CNN.
Limitations: Trained with crime data having only two categories of crime hotspot detection and
prediction of crime type.
Advantages: SVM can detect the crime hotspots more accurately.
Performance metrics: SVM accuracy: 94.01%; CNN accuracy: 79.98%.

Reference 3 (2023)
Title: Crime Hotspot Detection using Optimized K-means Clustering and Machine Learning Techniques.
Objectives: To develop a more accurate and efficient model for crime prediction using K-Means
Clustering and classification algorithms.
Limitations: Cannot develop a more accurate model.
Advantages: Model works on unbalanced datasets. Requires less computation time.
Performance metrics: RF: 88.08%; SVM: 84.06%; DT: 85.55%.

Reference 4 (2023)
Title: Crime Hotspot Identification using SVM in Machine Learning.
Objectives: Develop a machine learning-based crime hotspot prediction technique.
Limitations: The proposed method is based on SVM algorithms, which may not be the most suitable choice
for all types of crime prediction problems.
Advantages: The paper utilizes the Support Vector Machine (SVM) algorithm in machine learning to
identify crime hotspots, which can be beneficial for crime analysis and prevention.
Performance metrics: SVM accuracy: 97.64%; precision: 98.4%; recall: 96.35%; F1-score: 97.39%.

Reference 5 (2021)
Title: CRIME type and occurrence prediction using machine learning algorithm.
Objectives: Analyze crime patterns using machine learning algorithms.
Limitations: In the absence of class labels, the estimated probability will be zero.
Advantages: The paper utilizes machine learning algorithms, specifically Naïve Bayes, to classify and
predict different crime patterns with high accuracy.
Performance metrics: Accuracy: 93.07%; Precision: 92.53%; Recall: 85.76%; F1 score: 92.12%.

Reference 6 (2021)
Title: Improving Accuracy of crime data using K-Means and Decision Tree Techniques.
Objectives: Compare performance of different classification and clustering techniques. Improve accuracy
of crime data using combined algorithms.
Limitations: The paper does not provide a detailed discussion of the specific datasets used for the
analysis.
Advantages: The paper offers advantages such as improved accuracy, better results, and the potential
for identifying high crime rate years.
Performance metrics: The combination of K-means clustering and Decision tree classification provides
more accurate results.

Reference 7 (2023)
Title: Machine Learning based Advanced Crime Prediction and Analysis.
Objectives: Use machine learning to predict and analyze crime. Lower crime rates and decrease criminal
activity.
Limitations: -
Advantages: The report is based on real-time news data, providing limited state-wise or district-wise
analysis.
Performance metrics: Naive Bayes accuracy: 99.4%.

Reference 8 (2021)
Title: Machine learning models for predicting crime hotspots in medellin city.
Objectives: Develop and evaluate Machine Learning models for crime hotspot prediction in Medellin.
Limitations: Cannot develop a more accurate model.
Advantages: Machine learning models, such as Decision Trees, Logistic Regression, and MLP, offer a
valuable tool for predicting crime hotspots in Medellin City.
Performance metrics: Decision Tree F1 score: 88.2%.

Reference 9 (2023)
Title: Crime Data Analysis and Safety Recommendation System Using Machine Learning.
Objectives: Identify and focus on the highest committed crime at the location.
Limitations: The report is based on real-time news data, providing limited state-wise or district-wise
analysis.
Advantages: Identifies and focuses on the highest committed crime areas. Automatically notifies user of
crime history during travel.
Performance metrics: Identifies the highest committed crime at the location. Automatically notifies user
of crime history during travel.

Reference 10 (2022)
Title: Crime Analysis and Prediction using Machine Learning Algorithms.
Objectives: Predict and analyze crime rates using machine learning algorithms. Determine the most
accurate algorithm for crime prediction and analysis.
Limitations: -
Advantages: By predicting crime rates, the paper can help to recognize crime-hotspot areas and enable
the public to find safer routes to their destinations.
Performance metrics: Support vector machine accuracy: 95%.

Reference 11 (2021)
Title: Criminal Combat: Crime Analysis and Prediction Using Machine Learning.
Objectives: Identify crime patterns and factors affecting crime. Develop predictive models for crime
analysis and prediction.
Limitations: The use of large datasets may lead to overfitting, which could affect the accuracy.
Advantages: The paper utilizes machine learning techniques to analyze and predict crime patterns, which
can help in minimizing crime rates and protecting communities.
Performance metrics: The paper concludes that crime predictability can be useful in eliminating crime
from society and can help in creating a safer environment.

Reference 12 (2023)
Title: Prediction of Crime Data using Machine Learning Techniques.
Objectives: Categorize algorithms based on accuracy.
Limitations: Does not discuss the accuracy of the proposed method.
Advantages: Enables the identification of crime-prevention hotspots, allowing law enforcement to
allocate resources where they are most needed.
Performance metrics: The algorithms are used to determine the likelihood of a crime occurring and the
nature of the crime, such as whether it will be violent or non-violent.

Reference 13 (2022)
Title: Crime Scene Prediction Using the Integration of K-Means Clustering and Support Vector Machine.
Objectives: Predicting crime locations for police officers.
Limitations: Integration of K-Means and SVM only shows 0.65 accuracy.
Advantages: SVM algorithm provides accurate crime predictions. K-means clustering helps in grouping
crime data.
Performance metrics: Accuracy of the integration of K-Means and SVM is 65%.

Reference 14 (2023)
Title: Crime Prediction Using Machine Learning and Deep Learning: A Systematic Review and Future
Directions.
Objectives: Analyze approaches in machine learning and deep learning algorithms for crime prediction.
Limitations: The paper does not provide a critical analysis of the accuracy of the different machine
learning and deep learning approaches used in crime prediction.
Advantages: Development of real-time prediction models that can analyze crime data in real time and
predict future crime incidents, enabling police officers to act quickly.
Performance metrics: The stacked generalization approach achieved the highest accuracy of 99.5%.

Reference 15 (2020)
Title: Prediction of crime hotspots based on spatial factors of random forest.
Objectives: Improve prediction accuracy of random forest model.
Limitations: Only considers impact of spatial factors. Does not consider impact of temporal factors.
Advantages: The random forest algorithm overcomes the limitations of single decision tree
classification, effectively avoids overfitting, and is a reliable prediction model.
Performance metrics: The random forest model outperforms other models, such as logistic regression, in
terms of prediction accuracy.
Table 1: Comparison table for different reference papers

The above table consists of 15 different reference papers with published year, objective, advantages,
limitations and performance metrics.

2.2 Diagrammatical representation

Fig 1: Diagrammatical Representation of different authors and algorithms

3 DESIGN
The design of a crime prediction system using machine learning involves several key components and
considerations. Here's a structured design outline:

1. Data Collection and Preparation:

Gather historical crime data from reliable sources, including details such as time, location, type of crime,
and other relevant attributes. Ensure data quality by cleaning and preprocessing. Handle missing values,
outliers, and standardize data formats.

2. Feature Selection and Engineering:

Identify relevant features that can contribute to crime prediction (e.g., time of day, location, weather,
socioeconomic factors). Use domain knowledge and statistical techniques to engineer new features that may
enhance predictive power.

3. Machine Learning Models:

Select appropriate machine learning algorithms for crime prediction (e.g., Support Vector Machines,
Random Forest, Neural Networks). Experiment with ensemble learning methods, such as stacking, bagging,
or boosting, to improve model performance.

4. Training and Validation:

Split the dataset into training and validation sets to train and evaluate the models. Use cross-validation
techniques to ensure robustness and generalizability of the models.

5. Ensemble Learning:

Implement stacking-based ensemble methods to combine the predictions of multiple models. Fine-tune
hyperparameters and assess the performance gains achieved through ensemble learning.

6. Model Evaluation:

Evaluate the performance of individual models and the ensemble model using appropriate metrics (e.g.,
accuracy, precision, recall, F1 score). Consider the interpretability of the models to ensure they can be
effectively communicated to law enforcement.

7. Dynamic Adaptation:

Implement mechanisms for the model to adapt to changing crime patterns over time. Regularly update the
model with new data to ensure it remains effective in predicting emerging trends.

8. User Interface and Accessibility:

Develop a user-friendly interface for law enforcement to interact with the system. Provide visualization
tools to help users interpret predictions and understand crime patterns.

9. Ethical Considerations:

Address ethical concerns related to bias in the data or models and ensure fairness in predictions. Establish
guidelines for responsible use and potential consequences of relying on machine learning predictions.

10. Security and Privacy:

Implement robust security measures to protect sensitive crime data. Ensure compliance with privacy
regulations and consider anonymization techniques if necessary.

11. Deployment and Integration:

Deploy the crime prediction system in collaboration with law enforcement agencies. Integrate the system
with existing crime prevention workflows and technologies.

12. Monitoring and Maintenance:

Set up monitoring tools to track the system's performance over time. Establish a maintenance schedule for
updating models, addressing issues, and incorporating feedback from users.

13. Documentation and Training:

Document the system architecture, algorithms used, and any relevant decision-making processes. Provide
training to law enforcement personnel on using and interpreting the predictions.
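
As an illustration of the data preparation step in the outline above, the following is a minimal sketch that fills missing values with the median and standardizes a numeric feature. The feature name and the sample values are hypothetical, and median imputation is only one of several reasonable choices; it is not taken from any of the surveyed papers.

```python
from statistics import mean, median, pstdev


def impute_median(column):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def standardize(column):
    """Rescale a numeric column to zero mean and unit variance (z-scores)."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


# Hypothetical feature: reported incidents per week in five districts,
# with one missing record.
weekly_counts = [12, None, 7, 30, 9]
clean_counts = impute_median(weekly_counts)
scaled_counts = standardize(clean_counts)
```

Scaling matters most for distance-based methods such as K-means and SVM, where a feature with a large numeric range would otherwise dominate the others.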

4 METHODOLOGY

4.1 Stacking-Based Crime Prediction Method (SBCPM):

Kshatri, S. S., Singh, D., Narain, B., Bhatia, S., Quasim, M. T., & Sinha, G. R. (2021). An empirical
analysis of machine learning algorithms for crime prediction using stacked generalization: an
ensemble approach. IEEE Access, 9, 67488-67500.

Fig 2: Stacking-Based Crime Prediction Method

Stacking-Based Crime Prediction Method (SBCPM):

SBCPM begins by gathering historical crime data, including details such as time, location, and crime type.
Before model training, the data is preprocessed to handle missing values and outliers so that the dataset is
reliable. A diverse set of machine learning algorithms is then selected, such as Support Vector Machines,
Random Forests, and Neural Networks, each of which captures different aspects of the data. Each chosen
algorithm is trained individually on the crime dataset to learn its historical patterns. Stacking, the key
component of SBCPM, combines the predictions of these base models: a meta-model is trained on the base
models' outputs, which aims to reduce bias and variance and improve overall prediction accuracy. The
resulting ensemble model is used for crime prediction, giving a more accurate view of potential crime
hotspots and trends.

SBCPM is designed to adapt to evolving crime patterns over time so that it stays relevant as new trends
emerge. Cross-validation is used during model training and evaluation to improve the robustness of the
predictive models and their ability to generalize to new data. Hyperparameters are fine-tuned to optimize
the performance of both the individual models and the ensemble. Evaluation metrics, including accuracy,
precision, recall, and F1 score, are applied to both the individual models and the stacked ensemble. SBCPM
emphasizes the interpretability of its models so that law enforcement can understand and act upon the
predictions in real-world scenarios. The method addresses ethical concerns related to bias and fairness in
the data and models. Security measures are implemented to safeguard sensitive crime data so that the
predictions generated by SBCPM are used responsibly and securely. Continuous monitoring tracks the
system's performance, allowing timely updates and adjustments to maintain predictive accuracy. Successful
deployment of SBCPM involves close collaboration with law enforcement agencies, integrating the system
into their workflow to support proactive crime prevention strategies and enhance public safety.
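
The cross-validation step mentioned above can be sketched as follows. This is a minimal illustration of k-fold cross-validation written from scratch; the helper names (`k_fold_indices`, `cross_val_accuracy`) and the majority-class baseline are assumptions made for the example, not part of SBCPM as published.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size


def cross_val_accuracy(fit, predict, X, y, k=5):
    """Average validation accuracy over k folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in val_idx)
        scores.append(correct / len(val_idx))
    return sum(scores) / len(scores)


# Hypothetical baseline "model": always predict the most frequent training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


X = list(range(10))                  # placeholder feature values
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]   # placeholder labels (1 = incident)
score = cross_val_accuracy(fit_majority, predict_majority, X, y, k=5)
```

In practice the records would usually be shuffled before the folds are formed; the folds are kept contiguous here only to make the example easy to follow.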

4.2 K-MEANS CLUSTERING:

Chahal, J. K., & Sharma, A. (2021, December). Improving Accuracy of crime data using K-Means
and Decision Tree Techniques. In 2021 IEEE International Conference on Technology, Research,
and Innovation for Betterment of Society (TRIBES) (pp. 1-4). IEEE.

Fig 3: K-Means Clustering Work Flow Diagram

K-MEANS CLUSTERING ALGORITHM:


K-means clustering is a widely used unsupervised machine learning algorithm that uncovers patterns and
structure in data by grouping similar data points into clusters. The methodology begins with data
preparation, which involves cleaning the data and scaling the features, since K-means relies on distances
between points. Feature selection shapes the clustering outcome and influences the algorithm's ability to
identify meaningful patterns. Choosing the number of clusters (K) is a pivotal step that determines the
granularity of the clusters; it is typically guided by methods such as the elbow method. The core of K-means
lies in iteratively assigning data points to the nearest centroid and updating the centroids until convergence
is reached. This iterative refinement reduces the within-cluster sum of squares, so that data points within
each group are as close to their centroid as possible. Beyond descriptive analysis, K-means serves as a
foundation for predictive insights: new data points can be assigned to existing clusters for predictive
modeling and decision-making. Interpreting the clusters is crucial for practical application and requires
examining the characteristics and patterns within each group. The methodology also involves cluster
validation techniques and considerations for scalability, robustness to outliers, centroid initialization, and
handling categorical data. K-means is also applied beyond clustering, including in image compression,
anomaly detection, and recommendation systems. Human-in-the-loop exploration allows stakeholders to
interact with the clusters, validate results, and contribute domain-specific insights, which makes the
analysis process more collaborative and effective.
K-means partitions data points into K clusters based on their similarity, where K is a user-defined
parameter, and does not need labeled examples. Each cluster is represented by its center, called a centroid.
The algorithm iteratively assigns data points to the nearest centroid and updates the centroids until the
cluster assignments no longer change. The goal is to make the data points in each cluster as close to each
other as possible. The key steps in applying K-means clustering are:
1) Data Preparation
2) Feature Selection
3) Define the Number of Clusters (K)
4) Apply K-means Clustering
5) Understanding Clusters
6) Predictive Analysis
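
The assignment and update steps described above can be sketched in a few lines. This is a minimal illustration of the standard K-means iteration (Lloyd's algorithm) in plain Python; the coordinates and the choice of initial centroids are hypothetical, and the `wcss` helper is included only to show the quantity that the elbow method compares across values of K.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Run K-means from the given initial centroids; return (centroids, labels)."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


def wcss(points, centroids, labels):
    """Within-cluster sum of squares, the quantity K-means tries to reduce."""
    return sum(math.dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))


# Hypothetical incident coordinates forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centroids_k2, labels_k2 = kmeans(points, [points[0], points[3]])
centroids_k1, labels_k1 = kmeans(points, [points[0]])

wcss_k2 = wcss(points, centroids_k2, labels_k2)
wcss_k1 = wcss(points, centroids_k1, labels_k1)
```

For the elbow method, the same computation would be repeated for a range of K values and the K after which the within-cluster sum of squares stops dropping sharply would be chosen.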

4.3 Naive Bayes:


Khatun, S., Banoth, K., Dilli, A., Kakarlapudi, S., Karrola, S. V., & Babu, G. C. (2023, March).
Machine Learning based Advanced Crime Prediction and Analysis. In 2023 International
Conference on Sustainable Computing and Data Communication Systems (ICSCDS) (pp. 90-96).
IEEE.

Fig 4: Naive Bayes Flow Diagram

Naive Bayes:
Crime prediction using the Naive Bayes algorithm follows a systematic methodology that applies
probabilistic principles to historical crime data. After data collection, a preprocessing step handles missing
values and outliers to ensure the dataset's quality. Feature selection identifies the key variables influencing
crime prediction, such as time of day and geographical location. The dataset is then split into training and
testing sets, with the former used to train the Naive Bayes model. Evaluation metrics such as accuracy,
precision, recall, and F1 score assess the model's performance on the testing set. Hyperparameter tuning may
involve the Laplace smoothing parameter or adjustments based on domain knowledge. The probabilistic
predictions generated by Naive Bayes are interpreted, and a suitable threshold is selected for classification.
Implementation considerations include choosing the Naive Bayes variant that suits the dataset, such as
Gaussian for continuous features or Multinomial for count features. Dynamic updating mechanisms allow
the model to adapt to evolving crime patterns over time. Visualization and interpretability enhancements aid
in understanding how specific features contribute to crime likelihood. Ethical considerations address
potential biases and ensure responsible model use in law enforcement. User feedback is actively sought and
used for iterative improvements, refining the model's practical utility. Finally, deployment involves
collaborating with law enforcement agencies, integrating the model into existing workflows, and providing
the necessary training and support for users. This methodology supports the effective use of the Naive Bayes
algorithm for crime prediction, balancing accuracy, interpretability, and ethical considerations.
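
To make the role of Laplace smoothing concrete, the following is a minimal sketch of a categorical Naive Bayes classifier in plain Python. The feature values, crime labels, and function names are hypothetical and chosen for illustration; this is not the implementation used in the referenced paper.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace (add-alpha) smoothing."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            vocab[j].add(value)
    return class_counts, value_counts, vocab, alpha


def predict_nb(model, row):
    """Return the label with the highest log-posterior for the given row."""
    class_counts, value_counts, vocab, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for j, value in enumerate(row):
            # Smoothed conditional probability: never zero, even for unseen values.
            numerator = value_counts[(j, label)][value] + alpha
            denominator = count + alpha * len(vocab[j])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical records: (time of day, area type) -> crime type.
rows = [
    ("night", "downtown"), ("night", "downtown"), ("day", "suburb"),
    ("day", "downtown"), ("night", "suburb"), ("day", "suburb"),
]
labels = ["theft", "theft", "fraud", "fraud", "theft", "fraud"]

model = train_nb(rows, labels)
night_downtown = predict_nb(model, ("night", "downtown"))
day_suburb = predict_nb(model, ("day", "suburb"))
```

Without smoothing, a feature value that never occurred with a class in the training data would force that class's probability to zero; the added `alpha` keeps every conditional probability positive.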

4.4 Support Vector Machine (SVM):

Darshan, M. S., & Shankaraiah, S. (2022, October). Crime Analysis and Prediction using Machine
Learning Algorithms. In 2022 IEEE 2nd Mysore Sub Section International Conference (MysuruCon)
(pp. 1-7). IEEE.

Fig 5: Support Vector Machine (SVM) Flow Diagram

Support Vector Machine (SVM):


The Support Vector Machine (SVM) algorithm is a powerful tool in machine learning, particularly
for classification tasks, including crime prediction. SVM excels at finding the optimal hyperplane that
maximally separates data points of different classes, making it particularly effective in scenarios with
complex decision boundaries. The SVM methodology begins with data collection, gathering historical crime
data with relevant features. During data preprocessing, careful attention is given to handling missing values
and scaling features to ensure the algorithm's robustness. SVM's strength lies in its ability to handle both
linear and non-linear relationships, and the choice of appropriate kernels, such as linear, polynomial, or
radial basis function (RBF), plays a crucial role in capturing the underlying patterns in the crime data.
The SVM model is trained on the labeled dataset, determining the hyperplane that best separates
crime instances into distinct classes. The algorithm's versatility allows it to handle high-dimensional data
efficiently, making it suitable for crime prediction scenarios with numerous features. Cross-validation
techniques are often employed to assess the model's generalization performance and fine-tune
hyperparameters for optimal results.
SVMs can also be adapted to imbalanced datasets, which are common in crime prediction where certain
types of crimes may be infrequent, for example by weighting the minority classes more heavily during
training. The margin maximization strategy helps the SVM generalize to new, unseen data, enhancing its
predictive accuracy. SVM's effectiveness is not limited to binary classification; it can be extended to
multi-class problems through techniques like one-vs-one or one-vs-all.
Interpreting SVM predictions involves understanding the decision boundary and support vectors,
which are crucial data points influencing the position of the hyperplane. The SVM's kernel trick allows it to
implicitly map data into higher-dimensional spaces, capturing complex relationships that may not be
apparent in the original feature space.
Moreover, SVMs offer a regularization parameter that aids in controlling overfitting, contributing to
the model's generalization capabilities. While SVMs are known for their accuracy, they may be
computationally intensive, particularly with large datasets. Nevertheless, advancements in hardware and
optimization techniques have mitigated these challenges to a great extent.
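
The margin and regularization ideas above can be illustrated with a small linear SVM trained by subgradient descent on the regularized hinge loss. This is a simplified sketch under stated assumptions: it covers only the linear kernel, the data points are hypothetical, and the parameter names (`lam`, `lr`, `epochs`) are illustrative. Practical work would normally rely on an established SVM library rather than this loop.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50):
    """Soft-margin linear SVM via subgradient descent on the regularized hinge loss.

    Labels must be +1 or -1. `lam` is the regularization strength.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Point is misclassified or inside the margin: hinge term is active.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Point is safely classified: only the regularizer shrinks w.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict_svm(model, x):
    """Classify by the side of the hyperplane on which x falls."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable two-feature data (+1 = high risk, -1 = low risk).
X = [(2, 2), (3, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3)]
y = [1, 1, 1, -1, -1, -1]

svm_model = train_linear_svm(X, y)
train_predictions = [predict_svm(svm_model, x) for x in X]
```

A larger `lam` favors a wider margin at the cost of more training errors, which is the trade-off the regularization parameter controls.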

4.5 Linear Regression:


Shukla, A., Katal, A., Raghuvanshi, S., & Sharma, S. (2021, June). Criminal Combat: Crime
Analysis and Prediction Using Machine Learning. In 2021 International Conference on Intelligent
Technologies (CONIT) (pp. 1-5). IEEE.

Fig 6: Linear Regression Work Flow Diagram

Linear Regression:
Linear regression is a fundamental and widely used statistical and machine learning technique that
serves as a powerful tool for predicting numerical outcomes. In the context of crime prediction, linear
regression allows us to model the relationship between independent variables and the continuous target
variable, such as the frequency of a particular type of crime. The methodology begins with the collection of
relevant historical crime data and the identification of features that may influence crime rates, such as time
of day, socioeconomic factors, and geographic location.
During data preprocessing, careful attention is given to handling missing values, outliers, and scaling
features to ensure the reliability and accuracy of the model. Linear regression aims to fit a linear equation
to the data, representing the relationship between the independent variables and the dependent variable. The
coefficients of this equation provide insights into the strength and direction of the impact each feature has
on the predicted outcome.

The model is trained using the collected and preprocessed data, and the training process involves
adjusting the coefficients to minimize the sum of squared differences between the predicted and actual
values. Evaluation metrics such as mean squared error or R-squared are commonly employed to assess the
model's goodness of fit and predictive performance. Cross-validation techniques may be applied to check
the model's robustness and generalization to new, unseen data.
Linear regression's simplicity and interpretability make it particularly valuable for understanding the
linear relationships within crime data. It allows law enforcement agencies to gain insights into the factors
influencing crime rates and make informed decisions based on these insights. Additionally, linear regression
models are computationally efficient and can be easily implemented and interpreted, making them
accessible for practical use.
However, it's important to note that linear regression assumes a linear relationship between variables,
and its performance may be limited when dealing with highly complex or non-linear patterns in crime data.
In such cases, more sophisticated machine learning techniques, such as ensemble methods or support vector
machines, may be considered. Nonetheless, linear regression remains a foundational and effective approach
for crime prediction, providing valuable insights into the quantitative aspects of criminal activities.
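
For a single feature, the least-squares fit described above has a closed-form solution. The sketch below shows it in plain Python; the feature (an unemployment rate) and the target values are hypothetical numbers chosen so that the fitted line is easy to verify by hand.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: unemployment rate (%) vs. incidents per 1,000 residents.
unemployment = [1, 2, 3, 4, 5]
incidents = [3, 5, 7, 9, 11]

slope, intercept = fit_simple_linear_regression(unemployment, incidents)
predicted_at_6 = slope * 6 + intercept
```

The slope is the coefficient discussed above: it estimates how much the target changes for a one-unit change in the feature, which is what makes the model easy to interpret.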

5 CASE STUDY
5.1 Case Study-1
Kshatri, S. S., Singh, D., Narain, B., Bhatia, S., Quasim, M. T., & Sinha, G. R. (2021). An empirical
analysis of machine learning algorithms for crime prediction using stacked generalization: an
ensemble approach. IEEE Access, 9, 67488-67500.

Stacking-Based Crime Prediction Method (SBCPM):

Fig 7: Stacking-Based Crime Prediction Method (SBCPM)

The method first splits the dataset into training and testing sets and then trains multiple base models on the
training set. The trained base models are used to make predictions on the testing set. The predictions from
the base models are combined and used as input to train a meta-model, which learns to make the final
predictions on test data based on the predictions of the base models. The resulting machine learning model
can be applied to real-time crime prediction: by utilizing previous crime data, it identifies crime patterns.
It is reported to be more accurate on the violence data.
The model's reported accuracy in determining crime patterns and forecasting is 99.5%, which makes it a
valuable tool for real-time crime prediction.
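
The stacking procedure described above can be sketched as follows. This is a toy illustration under several assumptions: the base models are one-feature threshold rules and the meta-model is a simple lookup table, neither of which is claimed to match the learners used in the paper; the data values are hypothetical; and the meta-model is trained on a separate held-out split so that it is evaluated on records it has not seen.

```python
from collections import Counter, defaultdict


def fit_stump(X, y, feature):
    """Base model: a one-feature threshold rule chosen to maximize training accuracy."""
    best = None
    for t in sorted({row[feature] for row in X}):
        for above in (0, 1):  # label predicted when the value is >= t
            preds = [above if row[feature] >= t else 1 - above for row in X]
            acc = sum(p == yi for p, yi in zip(preds, y))
            if best is None or acc > best[0]:
                best = (acc, t, above)
    _, t, above = best
    return lambda row: above if row[feature] >= t else 1 - above


def fit_meta(base_preds, y):
    """Meta-model: for each combination of base predictions, keep the majority label."""
    table = defaultdict(Counter)
    for combo, yi in zip(base_preds, y):
        table[combo][yi] += 1
    default = Counter(y).most_common(1)[0][0]
    lookup = {combo: counts.most_common(1)[0][0] for combo, counts in table.items()}
    return lambda combo: lookup.get(combo, default)


def fit_stacking(X_base, y_base, X_meta, y_meta):
    """Train base models on one split, then the meta-model on their held-out predictions."""
    bases = [fit_stump(X_base, y_base, f) for f in range(len(X_base[0]))]
    meta_inputs = [tuple(m(row) for m in bases) for row in X_meta]
    meta = fit_meta(meta_inputs, y_meta)
    return lambda row: meta(tuple(m(row) for m in bases))


# Hypothetical data: two scaled features per area; label 1 marks a hotspot.
X_base = [(0.1, 0.2), (0.2, 0.8), (0.8, 0.1), (0.9, 0.9),
          (0.7, 0.8), (0.3, 0.3), (0.4, 0.3), (0.2, 0.7)]
y_base = [0, 0, 0, 1, 1, 0, 0, 0]
X_meta = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.9), (0.2, 0.2), (0.75, 0.85), (0.85, 0.2)]
y_meta = [0, 0, 1, 0, 1, 0]
X_test = [(0.95, 0.95), (0.9, 0.3), (0.1, 0.85), (0.3, 0.1)]
y_test = [1, 0, 0, 0]

stacked = fit_stacking(X_base, y_base, X_meta, y_meta)
stacked_predictions = [stacked(row) for row in X_test]
```

In this toy data each threshold rule alone mislabels some areas, while the meta-model learns that an area is a hotspot only when both rules agree, which is the kind of gain stacking is meant to provide.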

6 RESULTS & DISCUSSION


Reference no | Title | Method | Metrics
1 | An empirical analysis of machine learning algorithms for crime prediction using stacked generalization: an ensemble approach. | Stacking-Based Crime Prediction Method (SBCPM). | Accuracy: 99.5%
6 | Improving Accuracy of crime data using K-Means and Decision Tree Techniques. | K-means clustering and Decision Tree. | Accuracy: 65.019%
7 | Machine Learning based Advanced Crime Prediction and Analysis. | Naive Bayes. | Accuracy: 99.4%
10 | Crime Analysis and Prediction using Machine Learning Algorithms. | Support Vector Machine. | Accuracy: 95%
11 | Criminal Combat: Crime Analysis and Prediction Using Machine Learning. | Linear regression. | RMSE: 0.011, MSE: 0.001, MAE: 0.008

Table 2: Results of 5 reference papers.

The empirical analysis of machine learning algorithms for crime prediction has emerged as a critical
area of research, as reflected in the studies presented. The "Stacking-Based Crime Prediction Method
(SBCPM)" reported an accuracy of 99.5%, emphasizing the effectiveness of ensemble approaches like
stacked generalization. This method leverages the strengths of multiple algorithms, contributing to its
high predictive accuracy and reliability in crime prediction scenarios.
In contrast, the study on "Improving Accuracy of crime data using K-Means and Decision Tree
Techniques" employed K-means clustering and Decision Tree algorithms, achieving an accuracy of
65.019%. While not as high as the SBCPM, this combination suggests the significance of utilizing diverse
techniques for crime prediction, with K-means identifying data patterns and Decision Trees providing
decision rules.
The study on "Machine Learning based Advanced Crime Prediction and Analysis" focused on Naive
Bayes and attained an accuracy of 99.4%. The high accuracy underscores Naive Bayes' suitability for crime
prediction, particularly in scenarios where certain features may exhibit conditional independence.
The application of Support Vector Machine (SVM) in "Crime Analysis and Prediction using
Machine Learning Algorithms" yielded an accuracy of 95%. SVMs are known for handling complex
decision boundaries, making them robust for crime prediction tasks where the relationships between features
are intricate.
In "Criminal Combat: Crime Analysis and Prediction Using Machine Learning," Linear Regression
was employed, and the model reported Root Mean Square Error (RMSE) of 0.011, Mean Squared Error
(MSE) of 0.001, and Mean Absolute Error (MAE) of 0.008. These metrics indicate the model's precision in
estimating crime-related variables, showcasing the efficacy of linear regression in certain crime prediction
scenarios.
In summary, these studies collectively highlight the versatility of machine learning algorithms in
crime prediction, each excelling in specific contexts. Ensemble methods like SBCPM offer high accuracy
through model combination, while techniques such as K-means clustering and Decision Trees contribute to
accuracy improvements through data pattern recognition and rule-based decision-making. Naive Bayes and
SVM demonstrate their effectiveness in handling complex relationships, and Linear Regression proves
valuable for precise estimation tasks. The variety of approaches underscores the importance of selecting the
right algorithm based on the characteristics of crime data and the specific goals of prediction models.

Metrics:
1. Root Mean Square Error (RMSE):
Root Mean Square Error (RMSE) measures the typical magnitude of the errors between predicted
values and actual values. It is the square root of the mean of the squared errors and is often used
in regression analysis and machine learning to evaluate the performance of a predictive model.
2. Mean Absolute Error (MAE):
Mean Absolute Error (MAE) is another metric used to measure the average magnitude of errors
between predicted values and actual values. Like RMSE, MAE is often used in regression
analysis and machine learning to assess the performance of a predictive model. However, MAE
averages the absolute errors, whereas RMSE squares the errors before averaging and therefore
penalizes large errors more heavily.
3. R-squared (R²):
It is a statistical measure that represents the proportion of the variance in the dependent variable
that is predictable from the independent variables in a regression model. In other words, R-
squared provides an indication of how well the independent variables explain the variability of
the dependent variable.
4. Accuracy:
Accuracy is a common metric used to evaluate the performance of classification models. It
represents the ratio of correctly predicted instances to the total number of instances in the dataset.
5. F1-Score:
The F1 score is a metric commonly used in binary classification to balance precision and recall.
It is the harmonic mean of the two, combining them into a single value.
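
The metrics listed above can be computed directly from their definitions. The sketch below uses plain Python with small hypothetical inputs; the function names are illustrative, and the sample values are unrelated to the results reported in Table 2.

```python
import math


def rmse(y_true, y_pred):
    """Square root of the mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Proportion of the variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def accuracy(y_true, y_pred):
    """Fraction of instances predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical regression targets and predictions.
reg_true, reg_pred = [3, 5, 7], [2, 5, 9]
# Hypothetical binary labels and predictions.
cls_true, cls_pred = [1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]

regression_report = (rmse(reg_true, reg_pred), mae(reg_true, reg_pred),
                     r_squared(reg_true, reg_pred))
classification_report = (accuracy(cls_true, cls_pred), f1_score(cls_true, cls_pred))
```

On the regression example the single error of 2 raises RMSE above MAE, which shows how RMSE weights large errors more heavily.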

7 CONCLUSION

In the realm of crime prediction using machine learning, the Stacking-Based Crime Prediction Method
(SBCPM) stands out among the surveyed approaches, reporting the highest accuracy. SBCPM identifies
crime patterns from crime data. By leveraging the principle of stacking, wherein predictions from multiple
classifiers are combined, SBCPM achieves a reported classification accuracy of 99.5%, offering a robust
solution for crime analysis and forecasting.
The significance of SBCPM lies in its ability to handle large datasets efficiently, providing enhanced
predictive performance. The ensemble nature of stacking allows it to capitalize on the strengths of various
algorithms, resulting in a comprehensive and accurate crime prediction model. The reported accuracy of
99.5% on testing data underscores the efficacy of SBCPM in achieving reliable results, showcasing its
potential for real-world crime analysis applications. This method's success suggests that ensemble
approaches, particularly stacking, play a pivotal role in addressing the complexities inherent in crime data.
While individual algorithms such as K-means clustering, Decision Trees, Naive Bayes, Support Vector
Machine, and Linear Regression contribute significantly to crime prediction, SBCPM emerges as a
comprehensive solution that excels in accuracy and predictive power. In conclusion, the landscape of crime
prediction using machine learning benefits greatly from the advancements brought forth by the Stacking-
Based Crime Prediction Method. Its ability to amalgamate diverse classifiers and yield a classification
accuracy of 99.5% positions it as a promising and powerful tool for crime analysis and forecasting. As
technology evolves, the integration of ensemble methods like SBCPM continues to shape the future of crime
prediction, offering a nuanced and effective approach to enhancing public safety and law enforcement
efforts.

REFERENCES

1. Kshatri, S. S., Singh, D., Narain, B., Bhatia, S., Quasim, M. T., & Sinha, G. R. (2021). An empirical
analysis of machine learning algorithms for crime prediction using stacked generalization: an ensemble
approach. IEEE Access, 9, 67488-67500.
2. Sravani, T., & Suguna, M. R. (2022, February). Comparative Analysis Of Crime Hotspot Detection And
Prediction Using Convolutional Neural Network Over Support Vector Machine with Engineered Spatial
Features Towards Increase in Classifier Accuracy. In 2022 International Conference on Business
Analytics for Technology and Security (ICBATS) (pp. 1-5). IEEE.
3. Akil, R. M., Sarathambekai, S., Vairam, T., Krishnan, R. S., Dharaneesh, G. S., & Janarthanan, D. (2023,
March). Crime Data Analysis and Safety Recommendation System Using Machine Learning. In 2023
9th International Conference on Advanced Computing and Communication Systems (ICACCS) (Vol.
1, pp. 183-188). IEEE.
4. Bonam, J., Burra, L. R., Susheel, G. S. V. N. S., Narendra, K., Sandeep, M., & Nagamani, G. (2023,
July). Crime Hotspot Detection using Optimized K-means Clustering and Machine Learning
Techniques. In 2023 4th International Conference on Electronics and Sustainable Communication
Systems (ICESC) (pp. 787-792). IEEE.
5. Vinothkumar, K., Ranjith, K. S., Vikram, R. R., Mekala, N., Reshma, R., & Sasirekha, S. P. (2023,
March). Crime Hotspot Identification using SVM in Machine Learning. In 2023 International
Conference on Sustainable Computing and Data Communication Systems (ICSCDS) (pp. 366-369).
IEEE.
6. Kanimozhi, N., Keerthana, N. V., Pavithra, G. S., Ranjitha, G., & Yuvarani, S. (2021, March). CRIME
type and occurrence prediction using machine learning algorithm. In 2021 International conference on
artificial intelligence and smart systems (ICAIS) (pp. 266-273). IEEE.
7. Mandalapu, V., Elluri, L., Vyas, P., & Roy, N. (2023). Crime Prediction Using Machine Learning and
Deep Learning: A Systematic Review and Future Directions. IEEE Access.
8. Chahal, J. K., & Sharma, A. (2021, December). Improving Accuracy of crime data using K-Means and
Decision Tree Techniques. In 2021 IEEE International Conference on Technology, Research, and
Innovation for Betterment of Society (TRIBES) (pp. 1-4). IEEE.

9. Khatun, S., Banoth, K., Dilli, A., Kakarlapudi, S., Karrola, S. V., & Babu, G. C. (2023, March). Machine
Learning based Advanced Crime Prediction and Analysis. In 2023 International Conference on
Sustainable Computing and Data Communication Systems (ICSCDS) (pp. 90-96). IEEE.

10. Muñoz, V., Vallejo, M., & Aedo, J. E. (2021, August). Machine learning models for predicting crime
hotspots in medellin city. In 2021 2nd Sustainable Cities Latin America Conference (SCLA) (pp. 1-6).
IEEE.
11. Darshan, M. S., & Shankaraiah, S. (2022, October). Crime Analysis and Prediction using Machine
Learning Algorithms. In 2022 IEEE 2nd Mysore Sub Section International Conference (MysuruCon)
(pp. 1-7). IEEE.
12. Shukla, A., Katal, A., Raghuvanshi, S., & Sharma, S. (2021, June). Criminal Combat: Crime Analysis
and Prediction Using Machine Learning. In 2021 International Conference on Intelligent Technologies
(CONIT) (pp. 1-5). IEEE.
13. Rao, P. V., Sunkari, S., Raghumandala, T., Koka, G., & Rayankula, D. C. (2023, March). Prediction of
Crime Data using Machine Learning Techniques. In 2023 International Conference on Sustainable
Computing and Data Communication Systems (ICSCDS) (pp. 276-280). IEEE.
14. Bolkiah, A. H. A. A., Hamzah, H. H., Ibrahim, Z., Diah, N. M., Sapawi, A. M., & Hanum, H. M. (2022,
December). Crime Scene Prediction Using the Integration of K-Means Clustering and Support Vector
Machine. In 2022 IEEE 10th Conference on Systems, Process & Control (ICSPC) (pp. 242-246). IEEE.
15. Yao, S., Wei, M., Yan, L., Wang, C., Dong, X., Liu, F., & Xiong, Y. (2020, August). Prediction of crime
hotspots based on spatial factors of random forest. In 2020 15th International Conference on Computer
Science & Education (ICCSE) (pp. 811-815). IEEE.
