
95 Submission-2

FYP project ideas for electrical engineering students

Uploaded by

Luna Tic

Crime Data Analysis using Machine Learning

Divyaa Shri Bhuvaneshwaran (b.divyaashri2003@gmail.com), Meera Rajagopal (meerarajagopal2630@gmail.com)

Department of Artificial Intelligence and Data Science, Panimalar Engineering College, Chennai,
Tamil Nadu, India

Abstract: The surge in criminal cases in India has led to a growing number of
unresolved incidents, posing a significant challenge for law enforcement
agencies. Understanding and categorizing criminal activity trends is crucial
for effective prevention. Despite the prevalence of machine learning and deep
learning techniques, law enforcement struggles to promptly address and prevent
crimes. This paper explores the application of various machine learning
methods in predicting and analyzing crime data. The primary objective is to
underscore the efficacy of machine learning in forecasting violent crimes
within specific regions, enabling law enforcement to proactively reduce crime
rates. The study focuses on refining the accuracy of prediction models through
data processing, utilizing algorithms such as Logistic Regression, Decision
Tree classification, KNN and Random Forest classification. By training these
models with a comprehensive dataset and validating them with test data, the
research aims to provide law enforcement agencies with effective tools for
crime prevention in Indian society.

Keywords: Machine Learning, Decision Trees, Logistic Regression, Random Forest

1. INTRODUCTION

In recent years, numerous crimes have been reported worldwide. Violence, which
includes threatening to use force against a victim, is a serious offense. It
covers crimes whose motive is a violent act such as murder, rape, or robbery,
as well as crimes in which force is used as a kind of coercion. Crimes
involving violence can vary from homicide to harassment and, depending on the
jurisdiction, may not always be initiated with a weapon. Violent crimes often
include murders, robberies, rapes, attempted murders, kidnappings, thefts,
riots, dowry deaths, dowry atrocities, etc. According to the actual statistics
on the number of murders, 17 of the poorest counties were found to be in
Maharashtra, and the next two poorest counties, Andhra Pradesh and Rajasthan,
were found to have the largest concentrations of violent crime districts.
Violent crimes comprise the bulk of crimes committed against women. Offences
against public order were reported in 13 districts in Rajasthan, 12 districts
in Bihar, and 11 districts in Tamil Nadu, out of a total of 100 districts [6].

The main objective of time series analysis is to analyze time-series data, and
it is used heavily to forecast future values based on previously seen data.
The growing population and continuous urbanization are contributing to an
increase in violent crime and accidents. Utilizing Big Data Analysis (BDA) to
mine these massive data sets for insights and to examine hidden patterns and
their interrelationships is one of the newest trends in data analysis [1],
[2]. Since crime is one of the most prevalent and worrisome issues in our
society, it is imperative that it be prevented [5]. Crime analysis and
prediction, meaning the scientific examination and recognition of numerous
correlations, patterns, and trends in crime, are methods used to combat
disorder and crime [3]. Though crimes happen everywhere, cities are the scene
of most of them. Crime affects basic human rights and leads to the collapse of
society's social structure. It is not always possible for law enforcement
officers to manually identify
areas with a high crime rate during a specific period, because criminal
motivations are dynamic and because current crime statistics are not being
used properly.

Furthermore, there is ongoing debate on the need for software tools and
systems to support criminal investigation and prevention efforts based on
textual content shared online. Applications such as hate speech detection,
crime pattern modeling, crime prediction, and identification of topics
relevant to crime are intended to handle this issue [10], [11], [12]. To
combat and prevent crimes, it is still crucial to utilize more efficient
methods for analyzing the criminal content of computer-mediated communication
(CMC). One of the hardest and most successful techniques for detecting and
predicting criminal behavior is crime-related event extraction (CREE) [13],
[14].

While structured data can be mined for patterns and trends using traditional
methods, modern data mining can extract information from both structured and
unstructured data because of the progress in technology. Data mining helps
criminal investigators analyze the massive volume of data more precisely and
efficiently [12]. The advent of big data has enabled social science research
and development to better understand how social phenomena and human behavior
are related.

In response to these difficulties, the research suggests a smart policing
method that uses machine learning to predict the kinds of crimes and their
risks. This method presents an ensemble model that combines bagging and
stacking approaches, and it uses text-based crime summaries. By merging
several classifiers, this ensemble method, known as the Supervised Based Crime
Prediction Method (SBCPM), aims to improve prediction efficacy. The study's
conclusion highlights the importance of this novel strategy by demonstrating
the potential of ensemble learning approaches in crime prediction and
prevention.

2. LITERATURE SURVEY

The first group makes use of techniques for pattern-based event extraction
(EE). Riloff [15] initially suggested such a method in 1993 to extract
terrorist occurrences from domain-specific texts. There are currently quite a
few pattern-based EE systems available for extracting different types of
events, each of which is domain-specific [16], [18]. Our research is
particularly interested in the most recent developments in EE studies related
to terrorism and criminal activity.

Several techniques for forecasting crime data have been presented in recent
research. After comparing linear regression, additive regression, and decision
trees, Lawrence McClendon and Natarajan Meghan concluded that linear
regression was the most accurate and effective method for predicting crime
data based on the training set input for all three algorithms [7].

According to a survey [17], crime forecasting will be improved by employing
effective data gathering and mining techniques to produce a better crime
prediction and by applying knowledgeable learning to construct numerous models
for a single problem. Using a single classification and prediction output,
which also specifies the time of year and season, it is possible to forecast
what the next crime in a certain neighborhood would be. There is a time
dimension where crime happens more frequently.

Babakura et al. compared Naive Bayesian and Back Propagation to predict the
crime data and classified it into low, medium, and high levels of crime rate.
Precision, recall, and accuracy were also calculated. When WEKA was used to
evaluate data categorization on a crime dataset, Naive Bayesian was found to
perform better [9].

K-means clustering, hierarchical clustering, and DBSCAN clustering were used
by Sivaranjani et al. to analyze crime data from cities in the Indian state of
Tamilnadu.
Based on various geographic data, Liao et al. proposed a unique approach for
Bayesian learning-based crime prediction [23].

In order to identify cyber security incidents from brief, noisy text,
Yagcioglu et al. [19] used a long short-term memory (LSTM) recurrent neural
network in conjunction with a CNN. For deep learning in non-Euclidean domains,
graph neural networks (GNN) employ many neurons acting on a graph structure.
Accordingly, the authors of [21] suggested using attention-based graph
convolutional networks to jointly extract multiple event triggers and
arguments. Liu et al. [22] used attention mechanisms to direct a neural model
to treat each component of the input differently, according to the
significance of that component to the EE task.

In one borough of a city, Xiangyu et al. made use of dynamic trends in urban
data, and they subsequently used transfer learning techniques to improve the
crime prediction in other provinces. These variables are combined, and
spatio-temporal patterns are modelled for crime prediction using a unique
transfer learning framework [25]. An accuracy of 95% was attained on Indian
number plates [24] using a fuzzy-based technique. The images used were blurred
rather than tilted and noisy. The authors additionally advanced the idea of
identifying cars involved in crime prediction by proposing a second model for
Myanmar numbers and characters [26].

To undertake a crime analysis and use the data to construct a crime matching
procedure, Keyvanpour et al. presented the SOM clustering approach [30].
Hosseinkhani et al. presented a review to identify crime hotspots, forecast
crime patterns, and extract usable information through data mining [31].
According to a framework proposed by Khalid et al. [32], criminal cases are
mapped to sociological theories based on certain features. In his discussion
of the integration of big data and sociology for analysis, Christopher
additionally highlighted several potential difficulties [33]. [34] describes
how the author applied the crime forecasting model and used the data mining
classification approach to perform a comparative study to find the best method
for predicting crime hotspots. The CRISP-DM data mining approach, which
contains six steps, was utilized by Saltos et al. For the experiments, three
algorithms from the three phases' categories are used, and evaluation is done
using evaluation metrics. Experiments are conducted to examine and assess
crime frequency prediction [35].

3. PROPOSED SYSTEM

The proposed crime data analysis system integrates information from law
enforcement databases, emergency calls, surveillance, and community reports.
It makes use of text mining, machine learning algorithms, and geographic
analysis to anticipate and classify crimes, spot trends, and keep an eye on
social media for possible criminal activity. Platforms that promote citizen
reporting help to foster community involvement. Based on anticipated crime
hotspots, dynamic resource allocation maximizes law enforcement efforts. Tools
for visualization support interpretation by highlighting ethical and privacy
issues. Ongoing improvements are driven by feedback loops and continuous
review, which ensures flexibility to changing crime trends and technical
breakthroughs. The system is a flexible, technologically advanced strategy for
improving law enforcement and preventing crime.

It is impossible to give an exact accuracy rate without knowing specifics
about the machine learning algorithms, training set, and assessment metrics
utilized in the suggested crime data analysis system. Thorough testing and
validation would be required to ascertain the accuracy of the system; this
usually involves using previous data that was not utilized for training. The
effectiveness of the system in predicting
different kinds of crimes would be evaluated using evaluation measures like
precision, recall, and F1 score. The efficacy of these elements and how well
they fit the ever-changing landscape of criminal activity will determine the
actual accuracy rate.

Figure 1: The proposed diagram

a. Data Analysis and Collection

The dataset was created by compiling crime statistics from every Indian state.
Factual reports on murder, rape, and theft are included in the data collection
and are thought to be the key components. Information is acquired from a
variety of sources during the data collection process and is then utilized to
create machine learning models. The information needs to be kept in a fashion
that makes sense for the issue.

b. Data Pre-processing

Raw datasets typically have some problems, like redundant data, missing
values, and highly correlated data. In order to increase modeling
effectiveness and decrease computing costs, redundant data should be
eliminated. Missing values happen when a variable in an observation has no
stored data value. Missing entries are typically filled with the mean or
median value or a zero value, or they are simply removed. High correlation
indicates that a feature and the study target, or other features, have similar
meanings. Such features obstruct the examination of influential elements,
particularly for models with linear foundations. When highly correlated
features are present, a relatively small coefficient will be assigned to some
significant features.

c. Data Mining (WEKA)

Data mining involves gathering raw data and transforming it, via the processes
of inference and analysis, into knowledge that can be applied to real-world
scenarios like the stock market or tracking customer spending patterns at the
neighbourhood Wal-Mart. Machine learning methods are used to analyze data sets
using data mining software packages, such as the Waikato Environment for
Knowledge Analysis (WEKA), the data mining software programme utilized in this
project.

4. MACHINE LEARNING MODELS

a. Logistic Regression Algorithm

Logistic regression is a supervised learning algorithm and one of the most
widely used machine learning techniques. It is applied to classification tasks
whose objective is to estimate the likelihood that an instance will fall into
a certain class or not. This type of statistical technique examines the
relationship between a set of independent variables and a binary dependent
variable.

Figure 2: Basic Logistic Regression Model
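
The logistic regression model of Section 4a can be illustrated with a minimal
sketch. It assumes plain batch gradient descent on the log-loss and a tiny
hypothetical dataset; the study itself used WEKA, so the function names, the
learning rate, the epoch count, and the data below are illustrative
assumptions and not the authors' implementation or results.

```python
import math


def sigmoid(z):
    """Logistic function: maps a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Estimated probability that instance x belongs to the positive class."""
    score = sum(w * xj for w, xj in zip(weights, x)) + bias
    return sigmoid(score)


def train_logistic_regression(X, y, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log-loss.

    lr and epochs are assumed values chosen for this toy example.
    """
    n_samples = len(X)
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = predict_proba(weights, bias, xi) - yi
            for j in range(n_features):
                grad_w[j] += error * xi[j]
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


# Hypothetical toy data: two scaled features per region, label 1 = high crime.
# These values are made up for illustration and are NOT from the paper.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.7, 0.8], [0.8, 0.6], [0.9, 0.9]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(X, y)
predictions = [1 if predict_proba(weights, bias, x) >= 0.5 else 0 for x in X]

# Accuracy: share of correct predictions. Recall: share of positives found.
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
true_positives = sum(1 for p, t in zip(predictions, y) if p == 1 and t == 1)
recall = true_positives / sum(y)

print("accuracy:", accuracy, "recall:", recall)
```

Thresholding the predicted probability at 0.5 turns the likelihood estimate
into a class label; accuracy and recall are then computed the same way as the
measures reported in Section 5, though on training data only in this sketch.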
b. Decision Tree Algorithm

Decision trees are a supervised learning approach that is mostly employed to
solve classification problems, although they may also be used to solve
regression problems. This classifier is tree-structured, with internal nodes
standing in for dataset attributes, branches for decision rules, and leaf
nodes for each outcome. The Decision Node and the Leaf Node are the two kinds
of node that make up a decision tree. Decision nodes are used to make a
decision and have multiple branches, while leaf nodes represent the result of
those decisions and have no further branches. It is a graphical tool that
shows all of the options for solving a problem or making a decision given
certain parameters.

c. Random Forest Algorithm

Random Forest is a well-known supervised machine learning algorithm. It may be
applied to ML problems involving both classification and regression. Its
foundation is the idea of ensemble learning, which is the act of merging
several classifiers to solve a challenging problem and enhance the model's
performance. A Random Forest classifier uses many decision trees on different
subsets of the dataset and combines them to increase predictive accuracy.
Rather than depending on a single decision tree, the random forest forecasts
the outcome based on the majority vote of the predictions from each tree.

Figure 3: Basic Random Forest Diagram

d. K-Nearest Neighbour Algorithm (KNN)

Regression and classification problems may be resolved with the help of the
reliable and understandable K-Nearest Neighbors (KNN) method in machine
learning. By considering its K nearest neighbors within the training dataset,
KNN makes use of the concept of similarity to predict the label or value of a
new data point.

Figure 4: Basic KNN Model

5. RESULTS AND DISCUSSION

Machine Learning Algorithms   Accuracy Rate   Recall   AUC      Time taken in seconds
Logistic Regression           0.9216          0.78     0.9365   0.085092
Decision Tree                 0.8423          0.7      0.8466   0.096534
Random Forest                 0.8419          0.64     0.8524   481.526764
KNN                           0.7826          0.62     0.7954   6.784368

6. CONCLUSION

In the context of law enforcement, using machine learning algorithms to
compare crime data has shown to be an effective strategy. Crime analysis can
benefit greatly from utilizing machine learning's adaptability in processing a
variety of data sources, including spatial, temporal, and socioeconomic data.
Using machine learning techniques such as Random Forests, Decision Trees, and
Support Vector Machines (SVM), law enforcement agencies can gain an extensive
knowledge of crime trends and patterns. These algorithms can efficiently
integrate different data sources to improve resource allocation, strategic
planning, and predictive policing. Algorithms like KNN, Naive Bayes, and
Logistic Regression were used in a particular case study to analyze crime data
pertaining to a certain area. Extensive research and alignment with particular
objectives are key factors in selecting the best machine learning model for
crime prediction. With an accuracy rate of 98.1%, the Logistic Regression
model outperformed the others based on metrics like precision, recall, and
support. While logistic regression has the potential to predict crime with a
high degree of accuracy, it is essential to acknowledge that the best model
will vary depending on the particulars of the crime dataset and the study
goals. Thorough testing is used to support the outcome, including statistical
significance checks, hyperparameter tuning, and overfitting considerations. In
order to improve prediction models and aid in law enforcement decision-making,
research on crime data analysis may go in new directions in the future. These
may include focusing on certain crime types, expanding to bigger datasets, and
adding new characteristics.

7. REFERENCES

[1] Nowshin Tasnim, Iftekher Toufique Imam, et al., "A Novel Multi-Module
Approach to Predict Crime Based on Multivariate Spatio-Temporal Data Using
Attention and Sequential Fusion Model", IEEE Access, vol. 10, 2022.
[2] Umair Muneer Butt, Sukumar Letchmunan, et al., "Spatio-Temporal Crime
HotSpot Detection and Prediction: A Systematic Literature Review", IEEE
Access, vol. 8, 2020.
[3] M. Feng, J. Zheng, J. Ren, A. Hussain, X. Li, Y. Xi, et al., "Big data
analytics and mining for effective visualization and trends forecasting of
crime data", IEEE Access, vol. 7, pp. 106111-106123, 2019.
[4] M. Feng, J. Zheng, Y. Han, J. Ren and Q. Liu, "Big data analytics and
mining for crime data analysis visualization and prediction", Proc. Int. Conf.
Brain Inspired Cognit. Syst., pp. 605-614, 2018.
[5] H. J. Eysenck, "Crime and personality", Medico-Legal J., vol. 47, no. 1,
pp. 18-32, 1979.
[6] V. K. Borooah and N. Ireland, "Deprivation violence and conflict: An
analysis of Naxalite activity in the districts of India", Int. J. Conf.
Violence, vol. 2, no. 2, pp. 317-333, 2008.
[7] R. Connelly, C. J. Playford, V. Gayle and C. Dibben, "The role of
administrative data in the big data revolution in social science research",
Social Sci. Res., vol. 59, pp. 1-12, Sep. 2016.
[8] H. Chen, W. Chung, J. J. Xu, G. Wang, Y. Qin and M. Chau, "Crime data
mining: A general framework and some examples", Computer, vol. 37, no. 4, pp.
50-56, Apr. 2004.
[9] P. Pławiak, M. Abdar and U. R. Acharya, "Application of new deep genetic
cascade ensemble of SVM classifiers to predict the Australian credit scoring",
Appl. Soft Comput., vol. 84, Nov. 2019.
[10] H. Hassani, X. Huang, E. S. Silva and M. Ghodsi, "A review of data mining
applications in crime", Stat. Anal. Data Mining ASA Data Sci. J., vol. 9, no.
3, pp. 139-154, Jun. 2016.
[11] J. T. Nockleby, "Hate speech" in Encyclopedia of the American
Constitution, New York, NY, USA: Macmillan, pp. 1277-1279, 2000.
[12] A. H. Salas, J. Morzan-Samam and M. Nunez-del-Prado, "Crime alert! crime
typification in news based on text mining", Proc. Future Inf. Commun. Conf.,
vol. 69, pp. 725-741, 2020.
[13] S. Yagcioglu, M. S. Seyfioglu, B. Citamak, B. Bardak, S. Guldamlasioglu,
A. Yuksel, et al., "Detecting cybersecurity events from noisy short text" in
arXiv:1904.05054, 2019.
[14] F. Rahma and A. Romadhony, "Rule-based crime information extraction on
Indonesian digital news", Proc. Int. Conf. Data Sci. Its Appl. (ICoDSA), pp.
10-15, Oct. 2021.
[15] E. Riloff, "Automatically constructing a dictionary for information
extraction tasks", Proc. 11th Nat. Conf. Artif. Intell., pp. 811-816, 1993.
[16] F. Hogenboom, "Automated detection of financial events in news text".
[17] A. Almaw and K. Kadam, "Survey paper on crime prediction using ensemble
approach", Int. J. Pure Appl. Math., vol. 118, no. 8, pp. 133-139, 2018.
[18] J. A. Reyes-Ortiz, "Criminal event ontology population and enrichment using
patterns recognition from text", Int. J. Pattern Recognit. Artif. Intell.,
vol. 33, no. 11, Oct. 2019.
[19] S. Yagcioglu, M. S. Seyfioglu, B. Citamak, B. Bardak, S. Guldamlasioglu,
A. Yuksel, et al., "Detecting cybersecurity events from noisy short text" in
arXiv:1904.05054, 2019.
[20] T. B. Hyde, H. Dentz, S. A. Wang, H. E. Burchett, S. Mounier-Jack and C.
F. Mantel, "The impact of new vaccine introduction on immunization and health
systems: A review of the published literature", Vaccine, vol. 30, no. 45, pp.
6347-6358, 2015.
[21] X. Liu, Z. Luo and H. Huang, "Jointly multiple events extraction via
attention-based graph information aggregation", Proc. Conf. Empirical Methods
Natural Lang. Process., pp. 1247-1256, 2018.
[22] J. Liu, Y. Chen, K. Liu and J. Zhao, "Event detection via gated
multilingual attention mechanism", Proc. 32nd AAAI Conf. Artif. Intell., pp.
4865-4872, 2018.
[23] Z. Li, T. Zhang, Z. Yuan, Z. Wu and Z. Du, "Spatio-temporal pattern
analysis and prediction for urban crime", Proc. 6th Int. Conf. Adv. Cloud Big
Data (CBD), pp. 177-182, Aug. 2018.
[24] L. Acion, D. Kelmansky, M. D. Van Laan, E. Sahker, D. S. Jones and S.
Arndt, "Use of a machine learning framework to predict substance use disorder
treatment success", PLoS ONE, vol. 12, no. 4, pp. 1-14, 2017.
[25] V. Ingilevich and S. Ivanov, "Crime rate prediction in the urban
environment using social factors", Procedia Comput. Sci., vol. 136, pp.
472-478, Jan. 2018.
[26] G. R. Sinha, K. S. Raju, R. K. Patra, D. W. Aye and D. T. Khin, "Research
studies on human cognitive ability", Int. J. Intell. Defence Support Syst.,
vol. 5, no. 4, pp. 298-304, 2018.
[30] M. R. Keyvanpour, M. Javideh and M. R. Ebrahimi, "Detecting and
investigating crime by means of data mining: A general crime matching
framework", Procedia Comput. Sci., vol. 3, pp. 872-880, Jan. 2011.
[31] J. Hosseinkhani, M. Koochakzaei, S. Keikhaee and J. H. Naniz, "Detecting
suspicion information on the Web using crime data mining techniques", Int. J.
Adv. Comput. Sci. Inf. Technol., vol. 3, no. 1, pp. 32-41, 2014.
[32] S. Khalid and S. A. Khan, "A framework for mapping crime data on
sociological hypothesis", Proc. IEEE 16th Int. Conf. Smart Cities Improving
Qual. Life Using ICT IoT AI (HONET-ICT), pp. 135-139, Oct. 2019.
[33] C. A. Bail, "The cultural environment: Measuring culture with big data",
Theory Soc., vol. 43, no. 3, pp. 465-482, Jul. 2014.
[34] C.-H. Yu, M. W. Ward, M. Morabito and W. Ding, "Crime forecasting using
data mining techniques", Proc. IEEE 11th Int. Conf. Data Mining Workshops, pp.
779-786, Dec. 2011.
[35] G. Saltos and M. Cocea, "An exploration of crime prediction using data
mining on open data", Int. J. Inf. Technol. Decis. Making, vol. 16, no. 5, pp.
1155-1181, Sep. 2017.
