SQL Project File

This document contains the SQL code and analysis for an auto insurance risk dataset. Some key findings include: 1) 5.02% of customers made a claim in the current exposure period. 2) Customers who claimed had a higher average exposure than those who did not. 3) Exposure buckets E1 and E4 had the highest claim shares, together almost 2/3 of total claims. 4) Area C had the highest number of average claims as a percentage of total policies. 5) Average vehicle age was lower for those who claimed than for those who didn't, and BonusMalus decreases in older driver age groups. 6) Vehicle brand B12 with regular gas had the highest average claims.

Uploaded by

Harsh Singh

SQL Graded Project

Utkarsh Atri – Edart C-III

Dataset - Auto_Insurance_Risk

Use auto_insurance_risk;

1. Write a query to calculate what % of the customers have made a claim in the current exposure period [i.e. in the given dataset]? (2 marks)

select count(*) from Auto_insurance_risk
where ClaimNb > 0;

(34060/678013)*100 = 5.02%
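
The percentage can also be computed in a single query instead of dividing the two counts by hand. The sketch below is only an illustration: it runs the SQL through Python's built-in sqlite3 module rather than MySQL, and the `policy_id` column and the eight toy rows are hypothetical, not taken from the dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Auto_insurance_risk (policy_id INTEGER, ClaimNb INTEGER)")
# Hypothetical toy rows: 2 of the 8 policies have at least one claim.
rows = [(1, 0), (2, 1), (3, 0), (4, 0), (5, 2), (6, 0), (7, 0), (8, 0)]
conn.executemany("INSERT INTO Auto_insurance_risk VALUES (?, ?)", rows)

# SUM over a 0/1 CASE counts the claimants; 100.0 forces non-integer division.
pct_claimed = conn.execute(
    """
    SELECT 100.0 * SUM(CASE WHEN ClaimNb > 0 THEN 1 ELSE 0 END) / COUNT(*)
    FROM Auto_insurance_risk
    """
).fetchone()[0]
print(pct_claimed)  # 25.0 on the toy rows

# The same arithmetic applied to the counts reported above.
reported_pct = round(100 * 34060 / 678013, 2)
print(reported_pct)  # 5.02
```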

2.1. Create a new column as 'claim_flag' in the table 'auto_insurance_risk' as integer datatype. (1.5 marks)

alter table Auto_insurance_risk
add column claim_flag int;

2.2 Set the value to 1 when ClaimNb is greater than 0 and set the value to 0 otherwise. (1.5 marks)

UPDATE Auto_insurance_risk
SET claim_flag =
CASE WHEN ClaimNb > 0 THEN 1
ELSE 0
END;

3.1. What is the average exposure period for those who have claimed? (1 mark)

select avg(Exposure) from Auto_insurance_risk
where claim_flag = 1;

3.2 What do you infer from the result? (1 mark)

Use claim_flag variable to group the data.
select claim_flag, avg(Exposure) from Auto_insurance_risk
group by claim_flag;

Customers who claimed have a higher average Exposure than those who did not, so a longer exposure period goes with a higher chance of a claim.

4.1. If we create an exposure bucket where buckets are like below, what is the % of total claims by these buckets? (2 marks)

-- ebucket must already exist as a text column in Auto_insurance_risk
UPDATE Auto_insurance_risk
SET ebucket =
CASE
WHEN Exposure >= 0 and Exposure <= 0.25 THEN 'E1'
WHEN Exposure > 0.25 and Exposure <= 0.50 THEN 'E2'
WHEN Exposure > 0.50 and Exposure <= 0.75 THEN 'E3'
WHEN Exposure > 0.75 THEN 'E4'
END;

This study source was downloaded by 100000826983498 from CourseHero.com on 06-07-2022 06:46:43 GMT -05:00

https://www.coursehero.com/file/106702000/sql-project-filedocx/

For percent claims per bucket

select ebucket, sum(ClaimNb) as claims,
100 * sum(ClaimNb) / (select sum(ClaimNb) from Auto_insurance_risk) as pct_of_total_claims
from Auto_insurance_risk
group by ebucket;

4.2 What do you infer from the summary? (1 mark)

We can conclude that E1 and E4 have the highest claim shares; together they make up almost 2/3 of total claims. These two buckets need a closer look, since a policy in E1 or E4 has a high chance of a claim.
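
As a rough sketch of the bucket summary, the share of total claims per bucket can be computed by summing ClaimNb within each bucket and dividing by the overall sum. The example below is an illustration only: it uses Python's built-in sqlite3 module instead of MySQL, derives the bucket inline with CASE rather than from a stored ebucket column, and the seven rows are hypothetical. The contiguous boundaries mean a value such as 0.505 still lands in E3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Auto_insurance_risk (Exposure REAL, ClaimNb INTEGER)")
# Hypothetical toy rows (Exposure, ClaimNb); total claims = 4.
rows = [(0.10, 1), (0.20, 0), (0.40, 0), (0.505, 1), (0.70, 0), (0.90, 2), (1.00, 0)]
conn.executemany("INSERT INTO Auto_insurance_risk VALUES (?, ?)", rows)

bucket_sql = """
SELECT CASE
         WHEN Exposure <= 0.25 THEN 'E1'
         WHEN Exposure <= 0.50 THEN 'E2'
         WHEN Exposure <= 0.75 THEN 'E3'
         ELSE 'E4'
       END AS ebucket,
       SUM(ClaimNb) AS claims,
       100.0 * SUM(ClaimNb)
             / (SELECT SUM(ClaimNb) FROM Auto_insurance_risk) AS pct_of_total_claims
FROM Auto_insurance_risk
GROUP BY ebucket
ORDER BY ebucket
"""
# Map each bucket to its percentage of all claims.
shares = {bucket: pct for bucket, _claims, pct in conn.execute(bucket_sql)}
print(shares)
```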

5. Which area has the highest number of average claims? Show the data in percentage w.r.t. the number of policies in corresponding Area. (2 marks)

select Area, 100 * sum(ClaimNb) / count(*) as claims_pct_of_policies
from Auto_insurance_risk
group by Area
order by claims_pct_of_policies desc
limit 1;

Reported result: Area C

6. If we use these exposure buckets along with Area i.e. group Area and Exposure Buckets together and look at the claim rate, an interesting pattern could be seen in the data. What is that? (3 marks)

select Area, ebucket, 100 * sum(claim_flag) / count(*) as claim_rate, sum(ClaimNb)
from Auto_insurance_risk
group by Area, ebucket
order by sum(ClaimNb) desc;

As mentioned earlier, E4 and E1 have the higher claims. As an average trend, the buckets rank E4, E1, E2, E3 from most to fewest claims.

7.1. If we look at average Vehicle Age for those who claimed vs those who didn't claim, what do you see in the summary? 1.5 Marks for SQL and 1 for inference.

select claim_flag, avg(VehAge) from Auto_insurance_risk
group by claim_flag;

Those who did not claim have a higher average vehicle age than those who claimed.

7.2. Now if we calculate the average Vehicle Age for those who claimed and group them by Area, what do you see in the summary? Any particular pattern you see in the data? (2.5 marks)

select Area, avg(VehAge)
from Auto_insurance_risk
where claim_flag = 1
group by Area;

8. If we calculate the average vehicle age by exposure bucket (as mentioned above), we see an interesting trend between those who claimed vs those who didn't. What is that? (3 marks)

select ebucket, claim_flag, avg(VehAge)
from Auto_insurance_risk
group by ebucket, claim_flag
order by ebucket, claim_flag;

9.1. Create a Claim_Ct flag on the ClaimNb field as below, and take average of the BonusMalus by Claim_Ct. (2 marks)

-- Claim_Ct must already exist as a text column in Auto_insurance_risk
UPDATE Auto_insurance_risk
SET Claim_Ct =
CASE WHEN ClaimNb = 1 THEN '1 Claim'
WHEN ClaimNb > 1 THEN 'MT 1 Claims'
WHEN ClaimNb = 0 THEN 'No Claims'
END;

select Claim_Ct, avg(BonusMalus) from Auto_insurance_risk
group by Claim_Ct;

9.2 What is the inference from the summary? (1 mark)

The average BonusMalus is almost the same across the categories, and slightly higher for those who have claimed more than once.

10. Using the same Claim_Ct logic created above, if we aggregate the Density column (take average) by Claim_Ct, what inference can we make from the summary data? (4 marks)

select Claim_Ct, avg(Density) from Auto_insurance_risk
group by Claim_Ct;

Average Density is highest for those with more than one claim. It increases with the number of claims, so it is higher for those who have claimed than for those who have not.

11. Which Vehicle Brand & Vehicle Gas combination have the highest number of Average Claims (use ClaimNb field for aggregation)? (2 marks)

select VehBrand, VehGas, avg(ClaimNb)
from Auto_insurance_risk
group by VehBrand, VehGas
order by avg(ClaimNb) desc;

Thus Vehicle Brand B12 with Regular Vehicle Gas has the highest average claims.

12. List the Top 5 Regions & Exposure [use the buckets created above] Combination from Claim Rate's perspective. Use claim_flag to calculate the claim rate. (3 marks)

select Region, ebucket, 100 * sum(claim_flag) / count(*) as claim_rate
from Auto_insurance_risk
group by Region, ebucket
order by claim_rate desc
limit 5;
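
A claim rate for a group is the number of policies with claim_flag = 1 divided by the number of policies in that same group, not by the size of the whole table. The following sketch illustrates this on made-up data, using Python's built-in sqlite3 module in place of MySQL. The region codes and rows are hypothetical, and the limit is 2 only because the toy table has four groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Auto_insurance_risk (Region TEXT, ebucket TEXT, claim_flag INTEGER)"
)
# Hypothetical toy rows (Region, ebucket, claim_flag).
rows = [
    ("R11", "E1", 1), ("R11", "E1", 0),
    ("R11", "E4", 1), ("R11", "E4", 1),
    ("R24", "E1", 0), ("R24", "E1", 0),
    ("R24", "E4", 1), ("R24", "E4", 0), ("R24", "E4", 0),
]
conn.executemany("INSERT INTO Auto_insurance_risk VALUES (?, ?, ?)", rows)

# Claim rate per (Region, ebucket): claimants in the group / policies in the group.
top_groups = conn.execute(
    """
    SELECT Region, ebucket, 100.0 * SUM(claim_flag) / COUNT(*) AS claim_rate
    FROM Auto_insurance_risk
    GROUP BY Region, ebucket
    ORDER BY claim_rate DESC
    LIMIT 2
    """
).fetchall()
print(top_groups)  # [('R11', 'E4', 100.0), ('R11', 'E1', 50.0)]
```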

13.1. Are there any cases of illegal driving i.e. underaged folks driving and committing accidents? (1 mark)

-- uses the age column populated in 13.2
select claim_flag, count(claim_flag)
from Auto_insurance_risk
where age = '1 - Beginner'
group by claim_flag;

Yes, there are a total of 61 cases of illegal driving.

13.2 Create a bucket on DrivAge and then take average of BonusMalus by this Age Group Category. What do you infer from the summary?
DrivAge=18 then 1-Beginner, DrivAge<=30 then 2-Junior, DrivAge<=45 then 3-Middle Age, DrivAge<=60 then 4-Mid-Senior, DrivAge>60 then 5-Senior. 2.5 Marks for SQL and 1.5 for inference.

-- age must already exist as a text column in Auto_insurance_risk
UPDATE Auto_insurance_risk
SET age =
CASE WHEN DrivAge = 18 THEN '1 - Beginner'
WHEN DrivAge > 18 and DrivAge <= 30 THEN '2 - Junior'
WHEN DrivAge > 30 and DrivAge <= 45 THEN '3 - Middle Age'
WHEN DrivAge > 45 and DrivAge <= 60 THEN '4 - Mid Senior'
WHEN DrivAge > 60 THEN '5 - Senior'
END;

select age as Age_Category, avg(BonusMalus)
from Auto_insurance_risk
group by age;

BonusMalus, which penalises drivers for making claims, decreases with age. This may be because older drivers have more driving experience than younger ones and so tend to drive more cautiously.
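
The age grouping can also be tried out without adding a column, by putting the CASE expression directly in the SELECT. This is only a sketch under stated assumptions: it runs on Python's built-in sqlite3 module rather than MySQL, and the six rows are invented so that the averages are easy to check by hand. They are not results from the dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Auto_insurance_risk (DrivAge INTEGER, BonusMalus INTEGER)")
# Hypothetical toy rows (DrivAge, BonusMalus).
rows = [(18, 100), (25, 80), (30, 70), (40, 60), (50, 50), (70, 50)]
conn.executemany("INSERT INTO Auto_insurance_risk VALUES (?, ?)", rows)

# Same bucket boundaries as the UPDATE above, evaluated inline.
age_summary = conn.execute(
    """
    SELECT CASE
             WHEN DrivAge = 18 THEN '1 - Beginner'
             WHEN DrivAge > 18 AND DrivAge <= 30 THEN '2 - Junior'
             WHEN DrivAge > 30 AND DrivAge <= 45 THEN '3 - Middle Age'
             WHEN DrivAge > 45 AND DrivAge <= 60 THEN '4 - Mid Senior'
             WHEN DrivAge > 60 THEN '5 - Senior'
           END AS Age_Category,
           AVG(BonusMalus) AS avg_bonus_malus
    FROM Auto_insurance_risk
    GROUP BY Age_Category
    ORDER BY Age_Category
    """
).fetchall()
for category, avg_bm in age_summary:
    print(category, avg_bm)
```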

14. Mention one major difference between unique constraint and primary key? (2 marks)
Primary Key - Only one primary key is allowed per table, and it uniquely identifies each record in the table. A primary key does not accept duplicate or NULL values.

Unique key - A column with a unique key constraint can only contain unique values, but unlike a primary key it can accept NULL. A table can have several unique constraints, or none at all.
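
The duplicate and NULL behaviour can be checked with a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL; the `customer` table and its `policy_id` and `email` columns are hypothetical. It shows only the parts that behave the same in both systems: duplicates are rejected by both constraints, and a UNIQUE column accepts more than one NULL. NULL handling in primary keys is left out because SQLite treats it differently from MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# policy_id and email are hypothetical columns used only for this illustration.
conn.execute("CREATE TABLE customer (policy_id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
conn.execute("INSERT INTO customer VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO customer VALUES (2, NULL)")
conn.execute("INSERT INTO customer VALUES (3, NULL)")  # second NULL is accepted


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


dup_unique_rejected = rejected("INSERT INTO customer VALUES (4, 'a@example.com')")
dup_pk_rejected = rejected("INSERT INTO customer VALUES (1, 'b@example.com')")
null_emails = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE email IS NULL"
).fetchone()[0]
print(dup_unique_rejected, dup_pk_rejected, null_emails)  # True True 2
```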

15. If there are 5 records in table A and 10 records in table B and we cross-join these two tables, how many records will be there in the result set? (2 marks)

5*10 = 50
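
A cross join pairs every row of one table with every row of the other, which is where the product comes from. A quick check, sketched with Python's built-in sqlite3 module and two hypothetical single-column tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (a_id INTEGER)")
conn.execute("CREATE TABLE table_b (b_id INTEGER)")
conn.executemany("INSERT INTO table_a VALUES (?)", [(i,) for i in range(5)])
conn.executemany("INSERT INTO table_b VALUES (?)", [(i,) for i in range(10)])

# Every row of table_a is combined with every row of table_b.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM table_a CROSS JOIN table_b"
).fetchone()[0]
print(cross_count)  # 50
```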

16. What is the difference between inner join and left outer join? (2 marks)

Inner join returns combined rows from two or more tables only where the join condition matches in both tables. If no rows match, it returns nothing.

Left outer join returns all records from the left table (Table 1) together with the matching records from the right table (Table 2). Where the join condition fails, the left row is still returned, with NULL in the right table's columns.
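
The difference is easiest to see on two small tables that only partly overlap. The sketch below uses Python's built-in sqlite3 module; the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, a_val TEXT)")
conn.execute("CREATE TABLE table_b (id INTEGER, b_val TEXT)")
conn.executemany("INSERT INTO table_a VALUES (?, ?)", [(1, "a1"), (2, "a2"), (3, "a3")])
conn.executemany("INSERT INTO table_b VALUES (?, ?)", [(2, "b2"), (3, "b3"), (4, "b4")])

# Inner join: only ids present in both tables survive.
inner_rows = conn.execute(
    "SELECT a.id, b.b_val FROM table_a a "
    "INNER JOIN table_b b ON a.id = b.id ORDER BY a.id"
).fetchall()

# Left outer join: every table_a row survives; unmatched rows get NULL (None).
left_rows = conn.execute(
    "SELECT a.id, b.b_val FROM table_a a "
    "LEFT OUTER JOIN table_b b ON a.id = b.id ORDER BY a.id"
).fetchall()

print(inner_rows)  # [(2, 'b2'), (3, 'b3')]
print(left_rows)   # [(1, None), (2, 'b2'), (3, 'b3')]
```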

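17. Consider a scenario where Table A has 5 records and Table B has 5 records. Now while inner joining Table A and Table B, there is one duplicate on the joining column in Table B (i.e. Table A has 5 unique records, but Table B has 4 unique values and one redundant value). What will be record count of the output? (2 marks)

5

Assuming every joining value in Table B also exists in Table A, each of the 5 rows in Table B matches exactly one row in Table A. The duplicated value therefore appears twice in the output, and the one Table A value that is missing from Table B drops out.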
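
The scenario can be reproduced on toy data. This is a sketch using Python's built-in sqlite3 module, with hypothetical table and column names, and it assumes that all of Table B's joining values also occur in Table A. Under that assumption the inner join returns 5 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (join_key INTEGER)")
conn.execute("CREATE TABLE table_b (join_key INTEGER)")
# Table A: 5 unique keys. Table B: 4 unique keys, with key 4 repeated once.
conn.executemany("INSERT INTO table_a VALUES (?)", [(k,) for k in (1, 2, 3, 4, 5)])
conn.executemany("INSERT INTO table_b VALUES (?)", [(k,) for k in (1, 2, 3, 4, 4)])

joined_keys = [
    row[0]
    for row in conn.execute(
        "SELECT a.join_key FROM table_a a "
        "INNER JOIN table_b b ON a.join_key = b.join_key "
        "ORDER BY a.join_key"
    )
]
print(joined_keys)       # [1, 2, 3, 4, 4]
print(len(joined_keys))  # 5
```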

18. What is the difference between WHERE clause and HAVING clause?

WHERE Clause is used to filter individual records from the table based on the specified condition, before any grouping, whereas HAVING Clause is used to filter groups after aggregation.
WHERE Clause can be used without GROUP BY Clause. HAVING Clause is normally used together with GROUP BY; without it, the whole result set is treated as a single group.
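
To illustrate the order in which the two clauses apply, the sketch below filters rows with WHERE first and then filters the resulting groups with HAVING. It uses Python's built-in sqlite3 module, and the six rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Auto_insurance_risk (Area TEXT, VehAge INTEGER, claim_flag INTEGER)"
)
# Hypothetical toy rows (Area, VehAge, claim_flag).
rows = [("A", 5, 1), ("A", 7, 1), ("A", 20, 0), ("B", 2, 1), ("B", 10, 0), ("C", 9, 0)]
conn.executemany("INSERT INTO Auto_insurance_risk VALUES (?, ?, ?)", rows)

# WHERE removes non-claimants before grouping, so the averages cover claimants only.
where_only = conn.execute(
    "SELECT Area, AVG(VehAge) FROM Auto_insurance_risk "
    "WHERE claim_flag = 1 GROUP BY Area ORDER BY Area"
).fetchall()

# HAVING then keeps only the groups whose aggregate passes the condition.
where_and_having = conn.execute(
    "SELECT Area, AVG(VehAge) FROM Auto_insurance_risk "
    "WHERE claim_flag = 1 GROUP BY Area HAVING AVG(VehAge) > 5 ORDER BY Area"
).fetchall()

print(where_only)        # [('A', 6.0), ('B', 2.0)]
print(where_and_having)  # [('A', 6.0)]
```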
