Real Time Crime Dashboard

The document outlines a project for real-time crime data analytics using Apache Kafka and Spark Streaming to process and visualize crime data. It details the architecture, data pipeline, and the use of Power BI for dynamic visualizations, enabling authorities to make informed decisions based on real-time insights. Key benefits include early crime trend detection, improved resource allocation, and enhanced public safety.

Real-Time Crime Data Analytics Using Spark Streaming

1. Introduction
Project Title: Real-Time Crime Data Analytics Using Spark Streaming
Overview: This project demonstrates the development and deployment of a real-time big data
analytics system focused on crime data, which has become an increasingly important tool in
modern policing and urban safety initiatives. By leveraging the capabilities of Apache Kafka for
real-time data ingestion and Apache Spark Streaming for distributed processing, we can provide
instant insights into crime activities across multiple locations. This enables authorities to make
rapid decisions and adjust strategies accordingly.
Real-time crime analytics involves collecting incident reports from sources such as police control
centers, emergency call data, and field units. These raw data streams are parsed, structured, and
filtered to remove duplicate and malformed records. After enrichment and classification, the results
are stored as Parquet files or in Hive tables. Power BI acts as the front-end analytics platform,
allowing users to interact with dynamic visuals and respond to changes in the data as they arrive.
Key Objectives:

• Capture and process crime events in real time using Apache Kafka.
• Use Spark Streaming to cleanse, transform, and aggregate data in micro-batches.
• Categorize crimes based on type, severity, timestamp, and geolocation.
• Store the processed output in Hive or export as Parquet/CSV for long-term storage.
• Visualize patterns and KPIs through interactive Power BI dashboards.
Expected Benefits:

• Early detection of crime trends and hotspots.
• Improved allocation of law enforcement resources.
• Enhanced public safety and emergency response.
• Data-driven support for policymaking and urban planning.

2. Architecture and System Components


Technology Stack and Tools Used:

• Apache Kafka: Acts as a real-time distributed event broker. It captures raw crime event
messages and ensures fast delivery to consumers.
• Apache Spark Streaming (PySpark): Performs stream processing by reading
messages from Kafka topics. Applies transformation logic such as deduplication,
classification, and geolocation enrichment.
• Hive/HDFS/Parquet/CSV: Storage layer where the structured and cleaned crime data is
saved for querying and visualization.
• Power BI: Business Intelligence tool used to create dynamic visualizations and KPIs for
decision-makers.
Data Pipeline Overview:
Crime Incident Data → Kafka Topics → Spark Streaming → Hive/Parquet/CSV Output → Power BI Dashboard

Process Explanation:

• Data Ingestion: Kafka Producers send live crime records, structured in JSON or Avro
format.
• Stream Processing: Spark Streaming jobs run in near real time, grouping incoming records
into micro-batches and processing them every few seconds.
• Transformation: Includes parsing timestamps, classifying records by city, mapping GPS
coordinates to zones, and tagging severity based on crime type.
• Storage: Data is pushed into Hive for analytical queries or saved in efficient Parquet format
for reporting tools.
• Visualization: Power BI fetches and presents KPIs, trends, and maps.
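The transformation step described above can be sketched in plain Python to make the per-record logic concrete. This is a minimal sketch, not the project's Spark job: the field names (`event_id`, `crime_type`, `lat`, `lon`, `ts`), the severity lookup, the default severity, and the zone bounding boxes are all illustrative assumptions. In the pipeline itself this logic would run inside the Spark Streaming job on each micro-batch.

```python
import json
from datetime import datetime

# Hypothetical severity lookup; the report does not list the actual rules.
SEVERITY_BY_TYPE = {"Theft": "Low", "Assault": "Medium", "Burglary": "High"}

# Hypothetical zones as (name, min_lat, max_lat, min_lon, max_lon) bounding boxes.
ZONES = [
    ("Delhi-Central", 28.5, 28.8, 77.1, 77.3),
    ("Mumbai-South", 18.9, 19.1, 72.8, 72.9),
]


def zone_for(lat, lon):
    """Return the first zone whose bounding box contains the point."""
    for name, min_lat, max_lat, min_lon, max_lon in ZONES:
        if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
            return name
    return "Unknown"


def transform_batch(raw_messages, seen_ids):
    """Parse, deduplicate, and enrich one micro-batch of JSON messages."""
    enriched = []
    for raw in raw_messages:
        try:
            rec = json.loads(raw)
            event_id = rec["event_id"]
            crime_type = rec["crime_type"]
            ts = datetime.strptime(rec["ts"], "%Y-%m-%d %H:%M:%S")
            lat, lon = float(rec["lat"]), float(rec["lon"])
        except (ValueError, KeyError, TypeError):
            continue  # skip malformed records
        if event_id in seen_ids:
            continue  # skip duplicates already processed
        seen_ids.add(event_id)
        enriched.append({
            "event_id": event_id,
            "crime_type": crime_type,
            "hour": ts.hour,
            "zone": zone_for(lat, lon),
            # Unknown crime types default to "Medium" (an assumption).
            "severity": SEVERITY_BY_TYPE.get(crime_type, "Medium"),
        })
    return enriched


seen = set()
batch = [
    '{"event_id": "e1", "crime_type": "Theft", "lat": 28.6, "lon": 77.2, "ts": "2025-06-17 09:15:00"}',
    '{"event_id": "e1", "crime_type": "Theft", "lat": 28.6, "lon": 77.2, "ts": "2025-06-17 09:15:00"}',
    "not valid json",
]
rows = transform_batch(batch, seen)
```

Passing `seen_ids` in from the caller lets the same set be reused across micro-batches, so a duplicate that arrives in a later batch is still dropped.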

Advantages of the Architecture:

• Scalable and fault-tolerant processing
• Real-time visualization and insights
• Modular and easy to maintain
• Supports structured and semi-structured data

3. Dashboard Visualizations in Power BI


3.1 Crime Type Distribution – Donut Chart

• Offers visual segmentation of crime types reported within the last 24 hours.
o Theft: 35%
o Assault: 25%
o Burglary: 20%
o Robbery: 15%
o Others: 5%
• Helps identify dominant criminal activities in a given time window.
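The percentages behind a donut chart like this one are simple shares of the total count. The sketch below shows the calculation in plain Python; the function name `type_distribution` is hypothetical, and the input list is constructed only to reproduce the example shares listed above.

```python
from collections import Counter


def type_distribution(crime_types):
    """Return each crime type's share of the total, in percent."""
    counts = Counter(crime_types)
    total = sum(counts.values())
    return {t: round(100 * n / total, 1) for t, n in counts.most_common()}


# Synthetic input built to match the example shares in the list above.
sample = (["Theft"] * 35 + ["Assault"] * 25 + ["Burglary"] * 20
          + ["Robbery"] * 15 + ["Others"] * 5)
shares = type_distribution(sample)
dominant = max(shares, key=shares.get)
```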

3.2 Crime Hotspots – Bar Chart

• Bar chart showing top five cities with the most crime cases:
o Delhi – 420 cases
o Mumbai – 390 cases
o Bengaluru – 310 cases
o Hyderabad – 290 cases
o Chennai – 250 cases
• Allows city administrators to focus on targeted regions.
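A top-five ranking of this kind reduces to sorting per-city totals. The sketch below uses the five example figures above plus one hypothetical extra city ("Pune", 180 cases, not from the report) to show that lower-ranked cities drop out.

```python
from collections import Counter


def top_cities(case_counts, n=5):
    """Return the n cities with the most cases, highest first."""
    return Counter(case_counts).most_common(n)


case_counts = {
    "Delhi": 420, "Mumbai": 390, "Bengaluru": 310,
    "Hyderabad": 290, "Chennai": 250,
    "Pune": 180,  # hypothetical value, added only for illustration
}
top_five = top_cities(case_counts)
```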

3.3 Hourly Crime Trends – Line Chart

• X-axis: Hour of the Day
• Y-axis: Crime Count
• Displays spikes during nighttime (e.g., 9 PM to 11 PM)
• Useful for shift management and preventive measures
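The series behind an hourly line chart is a 24-slot count of incidents by hour of day. The following sketch assumes timestamps in the `YYYY-MM-DD HH:MM:SS` form used elsewhere in this report; the sample timestamps and function names are hypothetical.

```python
from datetime import datetime


def hourly_counts(timestamps):
    """Count incidents per hour of day (index 0 = midnight to 1 AM)."""
    counts = [0] * 24
    for ts in timestamps:
        counts[datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour] += 1
    return counts


def peak_hour(counts):
    """Return the hour with the most incidents (earliest hour on ties)."""
    return max(range(24), key=lambda h: counts[h])


# Hypothetical incident timestamps.
incidents = [
    "2025-06-17 09:05:00",
    "2025-06-17 21:10:00",
    "2025-06-17 21:45:00",
    "2025-06-17 22:30:00",
]
counts = hourly_counts(incidents)
busiest = peak_hour(counts)
```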

3.4 Crime Heatmap – Geographical Visualization

• Uses latitude and longitude coordinates to show crime density.
• Heat zones:
o Red = High activity
o Yellow = Medium activity
o Green = Low activity
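One way to derive heat zones from raw coordinates is to bin points into a latitude/longitude grid and classify each cell by its count. This is only a sketch of that idea: the grid size (`cell_deg`) and the thresholds separating Green, Yellow, and Red are assumptions, since the report does not state how the zones are computed.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.5):
    """Map a coordinate to an integer grid cell of cell_deg degrees."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def heat_level(count, medium_at=5, high_at=20):
    """Classify a cell count; both thresholds are hypothetical."""
    if count >= high_at:
        return "Red"
    if count >= medium_at:
        return "Yellow"
    return "Green"


def heat_zones(points, cell_deg=0.5):
    """Return {grid cell: heat level} for a list of (lat, lon) points."""
    cells = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)
    return {cell: heat_level(n) for cell, n in cells.items()}


# Hypothetical points: 20 in one cell, 5 in another, 1 in a third.
points = [(28.6, 77.2)] * 20 + [(19.1, 72.8)] * 5 + [(13.1, 80.3)]
zones = heat_zones(points)
```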

3.5 Severity Breakdown – Stacked Column Chart

• Displays High, Medium, and Low severity crimes by city.
• Helps law enforcement focus on high-risk zones.
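A stacked column chart is fed by a city-by-severity pivot of the counts. The sketch below builds that pivot in plain Python; the function name is hypothetical, and the last input row is an invented value added to show two severities stacking for one city.

```python
def severity_by_city(records):
    """Pivot (city, severity, count) rows into {city: {severity: total}}."""
    table = {}
    for city, severity, count in records:
        row = table.setdefault(city, {"High": 0, "Medium": 0, "Low": 0})
        row[severity] = row.get(severity, 0) + count
    return table


records = [
    ("Delhi", "Low", 50),
    ("Mumbai", "High", 30),
    ("Bengaluru", "Medium", 40),
    ("Delhi", "High", 20),  # hypothetical row, for illustration only
]
pivot = severity_by_city(records)
```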

3.6 Slicers and Filters:

• By location, severity, and date range
• Enhances user interactivity

4. Key Performance Indicators (KPIs) and Insights


The Power BI dashboard surfaces the following KPIs, which update as new data arrives and
support operational decision-making.

Total Crimes Today

• Tracks crime volume in a 24-hour cycle.
• KPI Card Example: 1,450 cases

High Severity Crime Count

• Shows the count of severe crimes such as armed assault and murder.
• Example: 320 incidents

Average Crimes per Hour

• Calculation: Total crimes / 24 hours
• Example: 60 per hour (1,450 / 24 ≈ 60.4)
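The arithmetic behind these KPI cards can be checked with a few lines of Python. The sketch uses the example figures from this section (1,450 total cases, 320 high-severity incidents); the function name and the dictionary keys are hypothetical.

```python
def kpi_cards(total_crimes, high_severity, hours=24):
    """Compute the headline KPI values from daily totals."""
    share = 100 * high_severity / total_crimes if total_crimes else 0.0
    return {
        "total": total_crimes,
        "high_severity": high_severity,
        "high_severity_share_pct": round(share, 1),
        "avg_per_hour": round(total_crimes / hours, 1),
    }


kpis = kpi_cards(1450, 320)
```

With these inputs the average works out to about 60.4 crimes per hour, and high-severity incidents make up about 22.1% of the total.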

Peak Crime Time Interval

• Determines the hour with the most incidents
• E.g., 9 PM – 10 PM

Most Frequent Crime Type

• Updated dynamically as new records arrive
• Example: Theft

Severity Distribution

• Proportional representation of severity:
o High: 22%
o Medium: 50%
o Low: 28%

Top Crime-Prone Locations

• Based on continuous crime reporting
• Delhi and Mumbai typically lead

Real-Time Alert Triggers

• Alerts when hourly crime count exceeds set threshold
• Used for auto-notification and fast response
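A threshold alert of this kind amounts to scanning the hourly counts for values above a limit. The sketch below uses a default of 100 crimes per hour, the example threshold given in the deployment section of this report; the function name and the sample counts are hypothetical, and the notification itself is left out.

```python
def hours_over_threshold(hourly_counts, threshold=100):
    """Return (hour, count) pairs where the count exceeds the threshold."""
    return [(hour, n) for hour, n in enumerate(hourly_counts) if n > threshold]


# Hypothetical hourly counts: quiet except for a late-evening spike.
counts = [40] * 24
counts[21] = 130
counts[22] = 101
alerts = hours_over_threshold(counts)
```

A count exactly equal to the threshold does not trigger an alert here, matching the "exceeds" wording above.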

Officer Resource Load Index

• Calculated from crime rate vs available officers
• Helps adjust force deployment dynamically
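The report does not give a formula for this index. One plausible reading, shown here purely as an assumption, is the ratio of crimes per hour to available officers, so a higher value indicates a more stretched force. The city figures below are invented for illustration.

```python
def resource_load_index(crimes_per_hour, available_officers):
    """Assumed definition: crimes per hour divided by available officers."""
    if available_officers <= 0:
        return float("inf")  # no officers available: treat load as unbounded
    return crimes_per_hour / available_officers


# Hypothetical (crimes per hour, available officers) per city.
city_load = {"Delhi": (42, 120), "Mumbai": (39, 60)}
ranked = sorted(
    city_load,
    key=lambda city: resource_load_index(*city_load[city]),
    reverse=True,
)
```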

5. Sample Dataset and Format


Processed sample from Spark Streaming:

Timestamp            Crime Type  Location   Severity  Count
2025-06-17 09:00:00  Theft       Delhi      Low       50
2025-06-17 10:00:00  Burglary    Mumbai     High      30
2025-06-17 11:00:00  Assault     Bengaluru  Medium    40

• Records like these are stored in Hive or CSV files.
• They serve as input for Power BI dashboards.
• Can be queried using Spark SQL or HiveQL.
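To illustrate how records like these can be loaded and queried, the sketch below parses the three sample rows from a CSV string using only the Python standard library. The comma-separated layout and header names are assumptions about the export format; in the project itself the equivalent filters and sums would be written in Spark SQL or HiveQL.

```python
import csv
import io

# The three sample rows above, in an assumed CSV layout.
SAMPLE_CSV = """Timestamp,Crime Type,Location,Severity,Count
2025-06-17 09:00:00,Theft,Delhi,Low,50
2025-06-17 10:00:00,Burglary,Mumbai,High,30
2025-06-17 11:00:00,Assault,Bengaluru,Medium,40
"""


def load_rows(text):
    """Parse CSV text into dicts, converting Count to an integer."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["Count"] = int(row["Count"])
    return rows


rows = load_rows(SAMPLE_CSV)
high_locations = [r["Location"] for r in rows if r["Severity"] == "High"]
total_count = sum(r["Count"] for r in rows)
```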

Features of the Data:

• Timestamp normalized to hourly granularity
• Severity is pre-classified using logic in Spark
• Location data supports geospatial analysis

6. Power BI Implementation and Deployment


Step-by-Step Deployment Process:
Step 1: Connect to Data Source

• Use Get Data → choose Folder/File (CSV or Parquet)
• Or use a Hive ODBC connection with direct query support

Step 2: Clean and Prepare the Data

• Remove null or duplicate entries
• Add derived columns:
o Hour from timestamp
o DayOfWeek, Severity_Label
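The derived columns can be sketched in plain Python to show what each one holds. In Power BI these would normally be added in the query editor or as calculated columns; the code below only illustrates the logic. The format of `Severity_Label` is an assumption (a rank prefix so labels sort from High to Low), as the report does not define it.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical label format: a rank prefix keeps High/Medium/Low in order.
SEVERITY_LABELS = {"High": "1 - High", "Medium": "2 - Medium", "Low": "3 - Low"}


def derive_columns(timestamp, severity):
    """Derive Hour, DayOfWeek, and Severity_Label for one record."""
    ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return {
        "Hour": ts.hour,
        "DayOfWeek": DAY_NAMES[ts.weekday()],  # weekday(): Monday is 0
        "Severity_Label": SEVERITY_LABELS.get(severity, "Unknown"),
    }


derived = derive_columns("2025-06-17 09:00:00", "Low")
```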

Step 3: Add DAX Measures
TotalCrimes = SUM('CrimeData'[Count])
HighSeverityCrimes = CALCULATE(SUM('CrimeData'[Count]), 'CrimeData'[Severity] = "High")
AvgCrimesPerHour = AVERAGEX(SUMMARIZE('CrimeData', 'CrimeData'[Hour], "CrimeCount", SUM('CrimeData'[Count])), [CrimeCount])
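To sanity-check what these measures compute, the sketch below reproduces the same aggregations in plain Python over the three sample rows from Section 5. The tuple layout `(Hour, Severity, Count)` and the function names are assumptions made for illustration; they are not part of the Power BI model.

```python
from collections import defaultdict

# (Hour, Severity, Count) taken from the sample rows in Section 5.
rows = [(9, "Low", 50), (10, "High", 30), (11, "Medium", 40)]


def total_crimes(rows):
    """Mirror of TotalCrimes: sum of the Count column."""
    return sum(count for _, _, count in rows)


def high_severity_crimes(rows):
    """Mirror of HighSeverityCrimes: sum of Count where Severity is High."""
    return sum(count for _, severity, count in rows if severity == "High")


def avg_crimes_per_hour(rows):
    """Mirror of AvgCrimesPerHour: mean of per-hour totals over hours present."""
    per_hour = defaultdict(int)
    for hour, _, count in rows:
        per_hour[hour] += count
    return sum(per_hour.values()) / len(per_hour) if per_hour else 0.0


total = total_crimes(rows)
high = high_severity_crimes(rows)
avg = avg_crimes_per_hour(rows)
```

Note that this average divides by the number of hours that appear in the data, as the AVERAGEX over SUMMARIZE expression does, so it can differ from the "Total crimes / 24 hours" calculation in Section 4 when some hours have no records.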

Step 4: Build the Visual Interface

• Add visual tiles:
o Line Chart
o Map
o Stacked Column
o KPI cards
o Donut and bar charts
• Add slicers: Time range, City, Severity
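
Step 5: Configure Auto-Refresh

• In Power BI Service:
o Schedule a dataset refresh (for example, every 30 minutes, subject to the refresh limits of the
workspace's license) or use DirectQuery so that visuals query the source directly
o Set alerts based on thresholds (e.g., more than 100 crimes/hour)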

Advanced Tips:

• Integrate ML models to forecast crime trends
• Use row-level security to provide restricted views
• Create bookmarks for time-based comparisons

End of Project Report
