Crime Analysis and Prediction Using Data
Agaba Joab Ezra, Erio Crucecia, Binga Ivan, Amoo Brenda Ayoo, Alaroker
Juliet Olwedo.
College of Computing and Informatics Sciences, Makerere University, Kampala, Uganda.
Department of Computer Science
Email:
joabagaba@cit.mak.ac.ug, eriocrucial.ec@gmail.com, ivanbinga@gmail.com,
amoobrenda@gmail.com, julietalaroker@gmail.com
ABSTRACT
Crime analysis and prediction is a systematic approach to identifying and analyzing patterns and
trends in crime. Crime analysis is of vital importance to the police department: studying crime
data helps the police analyze crime patterns, inter-related clues and important hidden relations
between crimes. Data mining can therefore be of great aid in analyzing, visualizing and predicting
crime from crime data sets [1]. Classification and correlation of data sets make it easy to
understand similarities and differences among the data objects. The focus is on the criminality of
places rather than on tracing individual criminals. The main users of the system will be the police
force, who will be able to predict the likelihood of crime occurring in the near future as well as
the particular time at which it is likely to occur. In this paper we look at the k-means clustering
algorithm for data mining and use it to generate hotspots of criminal activity and to predict the
chance of crime occurring in the near future. This analysis may help the country's law enforcement
make better-informed decisions, for example allocating more resources such as police officers to
crime-prone areas [2].
Key words: crime analysis, data mining, classification, correlation, visualization, prediction,
k-means algorithm, modus operandi.
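The hotspot generation mentioned in the abstract can be illustrated with a minimal sketch. It
assumes that each crime record has already been reduced to a (latitude, longitude) pair, which the
paper does not specify. The function names `nearest`, `kmeans` and `hotspots`, the sample
coordinates and the choice of k = 2 are hypothetical and serve only as an illustration; this is
not the system's actual implementation, and it covers hotspot ranking only, not prediction.

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's k-means on (latitude, longitude) tuples.

    Assumption: Euclidean distance on raw coordinates, which is only
    reasonable over a small area such as a single city.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen incidents
    for _ in range(iterations):
        # Assignment step: attach every incident to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its incidents.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster's centre in place
        if updated == centroids:  # no centroid moved, so the clustering is stable
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def hotspots(points, k, seed=0):
    """Cluster centres paired with incident counts, busiest cluster first."""
    centroids, labels = kmeans(points, k, seed=seed)
    counts = [labels.count(c) for c in range(k)]
    return sorted(zip(centroids, counts), key=lambda pair: pair[1], reverse=True)


# Hypothetical incident locations (made-up coordinates, not real crime data).
incidents = [
    (0.310, 32.580), (0.312, 32.582), (0.311, 32.579), (0.313, 32.581),
    (0.350, 32.650), (0.352, 32.652),
]

if __name__ == "__main__":
    for centre, count in hotspots(incidents, k=2):
        print(f"hotspot near ({centre[0]:.4f}, {centre[1]:.4f}): {count} incidents")
```

Ranking clusters by incident count is one simple way to read a clustering as "hotspots"; a real
deployment would also need to choose k and to account for time of day and crime category.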
1. INTRODUCTION
For a long time, research in the area of crime analysis has been used to mitigate crime and ensure
public safety. Recent improvements in technology have made it possible to mine large volumes of
data. Data mining helps in processing large amounts of data and discovering hidden information [1].
Crime is a social nuisance and has a direct effect on society. Governments spend large sums of
money through law enforcement agencies to try to stop crimes from taking place. Today, many law
enforcement bodies hold large volumes of crime-related data, which need to be processed to turn
them into useful information. Crime data are complex because they have many dimensions and come in
different formats; for example, most of them contain string records and narrative records. This
diversity makes them difficult to mine.
The research problem that this project addresses is the development of a software platform for
descriptive and predictive analysis of diverse crime data. Descriptive analysis focuses on
identifying spatio-temporal relationships in crime data. Predictive analytics methods are mainly
used to predict the category of crime that may occur at a given place and time. Crime cannot be
predicted with certainty, since it is neither systematic nor random [3]. Even though we cannot
predict who the victims of a crime may be, we can predict the places where it is likely to occur.
Our goal is to design an effective system that achieves an accuracy of at least 70%. To build such
a crime analytics tool, we have to collect crime records and evaluate them. The main challenge
before us is developing a better, more efficient crime pattern detection tool. The challenges we
faced included:
• Analysis of the data is difficult because the data are incomplete and inconsistent.
• Limited access to crime data from law enforcement bodies.
• Accuracy of the data, which depends heavily on the accuracy of the training set.
2. OVERVIEW OF CRIME
According to the Uganda Police annual crime report of 2014 [11], a crime is an act committed, or
omitted, in violation of the law either forbidding or commanding it. Crime can also be regarded as
a comprehensive concept that is defined in both a legal and a non-legal sense [7]. From the legal
point of view, crime is the breaking or breaching of the criminal law (penal code) that governs a
geographical area (jurisdiction) and is aimed at protecting the lives, property and rights of the
citizens belonging to that jurisdiction.
Crime occurs in a variety of forms, which police informally categorize as major or volume crimes.
Major crimes consist of high-profile crimes such as murder and armed robbery. These crimes can be
either one-offs or serial. Serial crimes are relatively easy to link together owing to clear
similarities in modus operandi or in descriptions of the offenders. This linking is possible
because of the comparatively low volume of such crimes. Major crimes usually have a team of
detectives allocated to conduct the investigation. In contrast, volume crimes such as burglary and
shoplifting are far more prevalent. They are usually serial in nature, as offenders go on to commit
many such crimes. Property crimes such as domestic burglary offences committed by individuals are
highly similar, and it is rare to have a description of the offenders.
7. REFERENCES
[1] F. U. M, "Knowledge Discovery and Data Mining: Towards a unifying framework," in Proc. 2nd Int.
Conf. on Knowledge Discovery and Data Mining, Portland, Oregon, AAAI Press, Menlo Park, California.
journal on systems and man, vol. 3, no. 3, pp. 2848-2853, 2005.
[9] W. C. H. Chen, "A general framework and some examples," in IEEE international conference, 2004.