
Data - Analysis On Rape Victims

This data analysis summarizes rape case data from 2000-2010 across different Indian states and age groups. The analysis includes grouping the data by state, calculating statistics, visualizing trends through histograms and line graphs, and examining correlations between states and age groups using a matrix. The goal is to spread awareness about the crime and ensure women's safety.

Uploaded by

Harish B

ASSIGNMENT

DATA ANALYSIS
Name: Vaishnavi. K
Roll. No: 20BIT060
Department: IT

HTML link for the data analysis: rapecases_dataanalysis


.ipynb link for the data analysis: rapecases_dataanalysis

OBJECTIVE:

• The objective of the data analysis is to present the records of rape cases
grouped by state as well as by age.
• The dataset contains 1051 records under 11 attributes: Area name, year,
subgroup, Total Rape cases, victims above 50 years, victims between the age of
10-14 years, 14-18 years, 18-30 years, 30-50 years, total cases of rape and victims
below the age of 10.
• The data analysis gives a glimpse of the rape cases reported area wise and age
wise, and presents the data statistically with the help of graphs.
• The main purpose of this analysis is to spread awareness about the crime and to
encourage better steps to ensure the safety of women.

ABOUT THE DATASET:


Rape accounts for about 12% of all crimes against women. The distribution of reported
cases is quite uneven across the nation.
India’s average rate of reported rape cases is about 6.3 per 100,000 of the population.
However, this masks vast geographical differences with places like Sikkim and Delhi
having rates of 30.3 and 22.5, respectively, while Tamil Nadu has a rate of less than one.
Of course, one must be careful in interpreting these state-wise differences as these are
‘reported’ cases and could suffer from under-reporting.
The data analysis below covers only some of the many cases that have taken place in
the 10 years from 2000 to 2010.
UNDERSTANDING THE DATA:
Link for the dataset: rapevictims.xlsx
PREPARING THE DATA:
1. The basics about the dataset

Reading the csv file and displaying the result shows the data set as it is present
in the excel sheet.
Here I have given head(51), so it displays the first 51 rows of the excel data
sheet. Vertically downwards it displays the records state wise, and horizontally
it displays the data according to the area name, year and age group, as well as
the cases in total and those that have been reported.
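
The loading step described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the three rows are made-up placeholders rather than real records, an in-memory text buffer stands in for the data file, and every column name other than Area_Name and Subgroup is an assumption.

```python
import io

import pandas as pd

# Stand-in for the real file: three made-up rows, NOT actual records.
# Only Area_Name and Subgroup are column names taken from the report;
# Year, Cases_Reported and Victims_Total are assumed names.
sample_csv = io.StringIO(
    "Area_Name,Year,Subgroup,Cases_Reported,Victims_Total\n"
    "Area A,2001,Group X,10,10\n"
    "Area B,2001,Group X,20,21\n"
    "Area C,2001,Group X,30,30\n"
)

df = pd.read_csv(sample_csv)

# head(n) returns the first n rows, or every row if the table has fewer than n.
first_two = df.head(2)
print(first_two)
print(len(df.head(51)))  # the sample has only 3 rows, so all 3 are returned
```

With the real file, the buffer would be replaced by the file path; if the data is kept as an Excel sheet, pd.read_excel would be the corresponding reader.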
2. Getting to know about the data, its basic statistics and shape

The output above shows the count, mean and other summary statistics of the data
set. The mean of the rape cases reported is around 361.920 and that of the total
cases is around 362.198.
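
A small sketch of how such statistics and the shape could be obtained, assuming the data is held in a pandas DataFrame. The values and the column names Cases_Reported and Victims_Total are placeholders for illustration only.

```python
import pandas as pd

# Made-up placeholder values, NOT actual records; column names are assumed.
df = pd.DataFrame({
    "Area_Name": ["Area A", "Area B", "Area C", "Area D"],
    "Cases_Reported": [10.0, 20.0, 30.0, 40.0],
    "Victims_Total": [10.0, 22.0, 30.0, 42.0],
})

# shape is a (rows, columns) tuple.
print(df.shape)

# describe() summarises the numeric columns: count, mean, std, min,
# quartiles and max. Text columns such as Area_Name are left out by default.
stats = df.describe()
print(stats.loc[["count", "mean"]])
```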
GROUPING THE DATA

1. Next was to gather the information about the datatype of the dataset

i. .info() is used to display the data type of each column in the dataset
ii. As shown above, except for Area_Name and Subgroup, which are of string data
type, the rest are of float data type
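
A brief sketch of this check on a placeholder table. The values are made up and the numeric column names are assumed; only Area_Name and Subgroup come from the report.

```python
import pandas as pd

# Made-up placeholder values, NOT actual records.
df = pd.DataFrame({
    "Area_Name": ["Area A", "Area B"],
    "Subgroup": ["Group X", "Group X"],
    "Cases_Reported": [10.0, 20.0],   # assumed column name
    "Victims_Total": [10.0, 21.0],    # assumed column name
})

# info() prints each column with its non-null count and data type.
df.info()

# The same split can be made programmatically.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
text_cols = df.select_dtypes(exclude="number").columns.tolist()
print(numeric_cols, text_cols)
```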

2. The next step was to group the data as well as display the cases area wise as a
sum

i. .groupby() groups the data according to the attribute given. Since I have
given the attribute as Area_Name, the data is grouped area wise
ii. The next step was to display this, which was done by calling head().
Here I have given the parameter as 10, so it displays the first 10 rows
of the grouped data
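
The grouping step might look like the sketch below. It is an illustration under assumptions: the rows are placeholders, and Year and Cases_Reported are assumed column names.

```python
import pandas as pd

# Made-up placeholder values, NOT actual records.
df = pd.DataFrame({
    "Area_Name": ["Area A", "Area B", "Area A", "Area B"],
    "Year": [2001, 2001, 2002, 2002],        # assumed column name
    "Cases_Reported": [10, 20, 30, 40],      # assumed column name
})

# Group the rows by area and add up the cases of every year in each group.
by_area = df.groupby("Area_Name")[["Cases_Reported"]].sum()

# head(10) returns the first 10 rows of the grouped table
# (here only 2 areas exist, so both are shown).
print(by_area.head(10))
```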
3. The next step was to display the maximum cases from the data set

i. First, I converted the data into a list datatype
ii. Then I traversed the list for each area name and each age group to find
the maximum cases and displayed it horizontally for each accordingly
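
One possible reading of this step is sketched below: the table is turned into a plain list of rows, and the list is traversed once per age column to find the row with the largest value. The data is made up and the age column names are hypothetical.

```python
import pandas as pd

# Made-up placeholder values, NOT actual records; age column names are assumed.
df = pd.DataFrame({
    "Area_Name": ["Area A", "Area B", "Area C"],
    "Victims_10_14": [3, 5, 4],
    "Victims_14_18": [8, 6, 9],
})

columns = df.columns.tolist()
rows = df.values.tolist()          # list of row lists
name_idx = columns.index("Area_Name")

max_by_column = {}
for col in ["Victims_10_14", "Victims_14_18"]:
    idx = columns.index(col)
    # Traverse the rows and keep the one with the largest value in this column.
    best_row = max(rows, key=lambda row: row[idx])
    max_by_column[col] = (best_row[name_idx], best_row[idx])

for col, (area, value) in max_by_column.items():
    print(col, area, value)
```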

4. Next, I wanted to show the maximum cases that were reported area wise

i. pd.DataFrame, which belongs to pandas, is a two-dimensional, size-mutable,
potentially heterogeneous tabular data structure with labelled axes
ii. As shown above, by using a DataFrame we get the maximum cases reported
from each state
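
A sketch of how the per-area maximum could be collected into a new DataFrame. The exact construction used in the notebook is not shown in the report, so this is only one way to do it; the values are placeholders and Year, Cases_Reported and Max_Cases_Reported are assumed names.

```python
import pandas as pd

# Made-up placeholder values, NOT actual records.
df = pd.DataFrame({
    "Area_Name": ["Area A", "Area A", "Area B", "Area B"],
    "Year": [2001, 2002, 2001, 2002],
    "Cases_Reported": [10, 30, 40, 20],
})

# Largest yearly value for each area.
max_per_area = df.groupby("Area_Name")["Cases_Reported"].max()

# Build a two-column table from the result with pd.DataFrame.
summary = pd.DataFrame({
    "Area_Name": max_per_area.index,
    "Max_Cases_Reported": max_per_area.values,
})
print(summary)
```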
ANALYSING AND VISUALISING THE DATA:
1. A histogram is a diagram consisting of rectangles whose area is proportional to the
frequency of a variable and whose width is equal to the class interval.
2. A histogram is used to summarize discrete or continuous data. In other words, it
provides a visual interpretation of the data by showing how many values fall in
each interval.
3. Here, histograms are used to depict the rape cases age wise.
Histograms that depict the rape cases age wise and in total are:
Total rape cases
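
The frequencies that a histogram draws can be sketched without any plotting library by binning a column into class intervals. The values, the bin edges and the column name below are placeholders chosen for illustration.

```python
import pandas as pd

# Made-up placeholder values, NOT actual records; the column name is assumed.
cases = pd.Series([5, 12, 18, 25, 33, 47, 8, 15], name="Cases_Reported")

# Class intervals of width 10: [0, 10), [10, 20), ..., [40, 50).
bin_edges = [0, 10, 20, 30, 40, 50]
binned = pd.cut(cases, bins=bin_edges, right=False)

# Number of values in each interval; this is the height of each rectangle.
frequencies = binned.value_counts().sort_index()
print(frequencies)
```

If matplotlib is installed, cases.plot(kind="hist", bins=bin_edges) draws the corresponding chart from the same values.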

LINE GRAPH FOR COMPARISON


1. A line graph is a type of chart used to show information that changes over time.
We plot line graphs using several points connected by straight lines.
2. Below we have used a line graph to compare the cases that have been reported in
different areas (here I have chosen three states, namely Bihar, Assam and Delhi).

As shown above, Assam and Bihar show higher reported cases than Delhi. However,
these are only the reported cases; incidents that are not reported lower the
apparent crime rate of an area.
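
A comparison like this needs one series per area with the year on the horizontal axis. The sketch below prepares that table and, only if matplotlib is available, draws it; the areas and values are placeholders, and Year and Cases_Reported are assumed column names.

```python
import pandas as pd

# Made-up placeholder values, NOT actual records.
df = pd.DataFrame({
    "Area_Name": ["Area A", "Area A", "Area B", "Area B", "Area C", "Area C"],
    "Year": [2001, 2002, 2001, 2002, 2001, 2002],
    "Cases_Reported": [10, 12, 20, 25, 5, 6],
})

selected = ["Area A", "Area B"]  # the areas to compare

# One row per year, one column per selected area.
trend = (
    df[df["Area_Name"].isin(selected)]
    .pivot_table(index="Year", columns="Area_Name",
                 values="Cases_Reported", aggfunc="sum")
)
print(trend)

# Plotting is optional and skipped if matplotlib is not installed.
try:
    import matplotlib
    matplotlib.use("Agg")  # render off-screen so no display is needed
    ax = trend.plot(marker="o")
    ax.set_xlabel("Year")
    ax.set_ylabel("Cases reported")
except ImportError:
    pass
```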

There are also individual line graphs of certain states to show the cases reported
age wise.
FOR DELHI:

FOR KARNATAKA:

FOR BIHAR:
CORRELATION MATRIX:
1. A correlation matrix is a table showing the correlations between pairs
of variables
2. The correlation matrix is an important data analysis tool that is computed to
summarise data, understand the relationship between various variables and
make decisions accordingly.
Correlation matrix in the form of figure:
Each row and column represents a variable, and each value in this matrix is the correlation
coefficient between the variables represented by the corresponding row and column.
For the figure below, we have computed the correlation between the cases that have been
reported age wise for a particular area name (state / district), so that it is
easier to combine the data, come up with the final result and draw awareness
accordingly.
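
A minimal sketch of computing such a matrix with pandas. The values are deliberately simple placeholders, so the coefficients come out as exactly +1 or -1, and the age column names are hypothetical.

```python
import pandas as pd

# Made-up placeholder values, NOT actual records; column names are assumed.
df = pd.DataFrame({
    "Area_Name": ["Area A", "Area B", "Area C", "Area D"],
    "Victims_10_14": [1, 2, 3, 4],
    "Victims_14_18": [2, 4, 6, 8],      # rises with Victims_10_14
    "Victims_Above_50": [4, 3, 2, 1],   # falls as Victims_10_14 rises
})

# Pearson correlation between every pair of numeric columns.
corr = df.select_dtypes(include="number").corr()
print(corr.round(2))
```

Each diagonal entry is 1 because a variable is perfectly correlated with itself, and the matrix is symmetric because the correlation of X with Y equals that of Y with X.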

Through the analysis done so far, we have drawn attention to the fact that between
2000 and 2010 there has been a visible increase in the rape cases reported.
The highest numbers were reported from Sikkim, Delhi and Bihar, in the 14-30 years
age group, whereas the least was reported from the state of Tamil Nadu. We must
also keep in mind that the analysis above shows only the cases that have been
reported; there have been crimes that took place but were not reported.
Through this analysis I hope to draw attention to the hideous crime that has been
happening through the ages, and to the fact that we as citizens must take the
necessary steps to reduce it and eventually bring it to an end.
