Bank Loan Analysis Using SQL
By: Sadiyer Mandal
Table of Contents
1. Objective
2. Introduction
3. Dataset Description
4. SQL Queries
5. Key Insights
6. Conclusions
Objective
To analyze loan data and uncover insights that help the bank minimize loan default risk and
improve its loan approval strategy. The analysis identifies trends and risk factors so that the
bank can make data-driven decisions that improve approval rates and streamline lending,
ultimately contributing to financial growth and customer satisfaction.
Introduction
This project is carried out entirely in SQL. We calculate key metrics such as default
percentages, debt-to-income (DTI) ratios, and loan approval rates, segmented by home
ownership status and loan intent. SQL's aggregation and filtering make it straightforward to
extract insights that can guide strategic decision-making for banks and lending institutions.
Ultimately, the project shows how SQL can support loan analysis by giving a data-driven view
of default risk and loan approval across applicant segments.
Dataset Description
The loan dataset used in this project was sourced from Kaggle, a well-known
platform for data science and machine learning resources. It contains detailed
information on individual loan applications, including applicant attributes, loan
characteristics, and loan outcomes. Key features include:
Loan Dataset
person_age: The age of the individual applying for the loan.
person_gender: The gender of the applicant, either "male" or "female."
person_education: The education level of the applicant, such as "High
School," "Bachelor," or "Master."
person_income: The annual income of the applicant in currency units.
person_emp_exp: The applicant's employment experience in years.
person_home_ownership: The type of home ownership the applicant has,
such as "RENT," "OWN," or "MORTGAGE."
loan_amnt: The amount of the loan that the applicant is requesting.
loan_intent: The purpose of the loan, such as "PERSONAL," "EDUCATION,"
or "MEDICAL."
loan_int_rate: The interest rate applied to the loan.
loan_percent_income: The percentage of the applicant’s income that the
loan payment will require.
cb_person_cred_hist_length: The length of the applicant's credit history in
years.
credit_score: The applicant's credit score, representing their
creditworthiness.
previous_loan_defaults_on_file: Indicates if the applicant has any previous
loan defaults ("Yes" or "No").
loan_status: The status of the loan, where "1" indicates an approved loan
and "0" a declined loan.
SQL Queries
Loan Dataset Summary
Insights:
Percentage of Approved Loans: 22.2%
Average Interest Rate on Loans: 11.1%
Average Credit History Length: 5.87 years
Average Income of Applicants: $80,319
Average Loan Amount Applied: $9,583
Average Debt-to-Income (DTI) Ratio: 13.97%
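The summary figures above come from simple column averages. A minimal sketch of such a query follows, run through Python's built-in sqlite3 module so it is self-contained. The table name loan_data and the four sample rows are made up for illustration, and the sketch assumes loan_percent_income is stored as a fraction (0.10 for 10%) and serves as the DTI ratio.

```python
import sqlite3

# Hypothetical table name and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE loan_data (
        person_income REAL,
        loan_amnt REAL,
        loan_int_rate REAL,
        loan_percent_income REAL,          -- assumed to be a fraction, e.g. 0.10
        cb_person_cred_hist_length REAL,
        loan_status INTEGER                -- 1 = approved, 0 = declined
    )
""")
conn.executemany(
    "INSERT INTO loan_data VALUES (?, ?, ?, ?, ?, ?)",
    [
        (60000, 6000, 10.0, 0.10, 4, 1),
        (80000, 8000, 12.0, 0.10, 6, 0),
        (100000, 20000, 11.0, 0.20, 8, 0),
        (40000, 6000, 11.0, 0.15, 2, 0),
    ],
)

# AVG over a 0/1 flag gives the share of rows where the flag is 1.
summary_sql = """
    SELECT
        ROUND(100.0 * AVG(loan_status), 2)         AS approved_pct,
        ROUND(AVG(loan_int_rate), 2)               AS avg_int_rate,
        ROUND(AVG(cb_person_cred_hist_length), 2)  AS avg_cred_hist_years,
        ROUND(AVG(person_income), 0)               AS avg_income,
        ROUND(AVG(loan_amnt), 0)                   AS avg_loan_amnt,
        ROUND(100.0 * AVG(loan_percent_income), 2) AS avg_dti_pct
    FROM loan_data
"""
summary = conn.execute(summary_sql).fetchone()
print(summary)
```

On the real dataset the same query shape would return one row with the six summary metrics.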
Credit Score vs Loan Default Percentage
Insights:
1. Good Score Grouping (Credit Score > 700):
Non-Defaulter: 65.36%, Defaulter: 34.64%
Insight: Lowest default rate of the three groups, making this group the most favorable for loans with
better terms and higher limits.
2. Average Score Grouping (Credit Score 500-700):
Non-Defaulter: 48.52%, Defaulter: 51.48%
Insight: Roughly even split between defaulters and non-defaulters, requiring moderate risk management
such as higher interest rates or stricter terms.
3. Risky Score Grouping (Credit Score 300-500):
Non-Defaulter: 21.17%, Defaulter: 78.83%
Insight: Highest default rate of the three groups, requiring caution and strategies such as secured
loans or higher interest rates.
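The grouping above can be sketched with a CASE expression and a conditional count. This is an illustration, not the original query: the table name loan_data and the sample rows are made up, a "defaulter" is assumed to be an applicant with previous_loan_defaults_on_file = 'Yes', and a score of exactly 700 or 500 is assumed to fall in the Average group, since the stated ranges overlap at those points.

```python
import sqlite3

# Hypothetical table name and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loan_data (credit_score INTEGER, previous_loan_defaults_on_file TEXT)"
)
conn.executemany(
    "INSERT INTO loan_data VALUES (?, ?)",
    [
        (750, "No"), (720, "No"), (710, "Yes"), (705, "No"),
        (650, "Yes"), (600, "No"),
        (450, "Yes"), (400, "Yes"), (480, "Yes"), (350, "No"),
    ],
)

# Assumption: boundary scores (exactly 700 or 500) go to the Average group.
score_sql = """
    SELECT
        CASE
            WHEN credit_score > 700  THEN 'Good'
            WHEN credit_score >= 500 THEN 'Average'
            ELSE 'Risky'
        END AS score_group,
        COUNT(*) AS people,
        ROUND(100.0 * SUM(CASE WHEN previous_loan_defaults_on_file = 'Yes'
                               THEN 1 ELSE 0 END) / COUNT(*), 2) AS defaulter_pct
    FROM loan_data
    GROUP BY score_group
    ORDER BY MIN(credit_score) DESC
"""
score_rows = conn.execute(score_sql).fetchall()
for row in score_rows:
    print(row)
```

The non-defaulter percentage is simply 100 minus defaulter_pct for each group.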
Analysis of Loan Intent vs. Average Interest Rate vs. Average Loan
Approved vs. Risk Factor vs. Default Percentage
Definition:
The risk factor is the ratio of the number of defaulted loans to the total number of approved loans within a
section (a loan intent combined with a home ownership type). It measures the potential risk associated with
that section.
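One possible reading of the risk factor as a query is sketched below. The source does not show which columns mark a default or how people are counted, so this sketch assumes a defaulted loan is one with previous_loan_defaults_on_file = 'Yes', an approved loan is one with loan_status = 1, and the people count is every applicant in the section. The table name loan_data and the sample rows are made up.

```python
import sqlite3

# Hypothetical table name and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE loan_data (
        loan_intent TEXT,
        person_home_ownership TEXT,
        previous_loan_defaults_on_file TEXT,
        loan_status INTEGER
    )
""")
conn.executemany(
    "INSERT INTO loan_data VALUES (?, ?, ?, ?)",
    [
        ("VENTURE", "OWN", "Yes", 0),
        ("VENTURE", "OWN", "Yes", 0),
        ("VENTURE", "OWN", "No", 1),
        ("EDUCATION", "RENT", "No", 1),
        ("EDUCATION", "RENT", "No", 1),
        ("EDUCATION", "RENT", "Yes", 0),
        ("EDUCATION", "RENT", "No", 1),
        ("EDUCATION", "RENT", "No", 1),
    ],
)

# Assumed reading: risk factor = defaulted count / approved count per section.
# NULLIF avoids a division by zero when a section has no approved loans.
risk_sql = """
    SELECT
        loan_intent,
        person_home_ownership,
        COUNT(*) AS people,
        ROUND(1.0 * SUM(CASE WHEN previous_loan_defaults_on_file = 'Yes'
                             THEN 1 ELSE 0 END)
              / NULLIF(SUM(loan_status), 0), 4) AS risk_factor
    FROM loan_data
    GROUP BY loan_intent, person_home_ownership
    ORDER BY risk_factor DESC
"""
segment_rows = conn.execute(risk_sql).fetchall()
for row in segment_rows:
    print(row)
```

Sorting by risk_factor and reading it alongside people is what allows sections to be grouped into categories such as high-risk with a low people count.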
Insights:
1. High-Risk, Low People Count Sections
Sections Identified: VENTURE (Own) with a Risk Factor of 1.823 and 35 people, and VENTURE (Other) with a Risk
Factor of 62.96 and 1 person.
Strategy: These sections carry high risk but few people, so headcount reductions here should be limited and
targeted, lowering risk without a significant impact on overall stability.
2. Low-Risk, High People Count Sections
Sections Identified: HOMEIMPROVEMENT (Rent) with a Risk Factor of 0.0397 and 923 people, and EDUCATION
(Rent) with a Risk Factor of 0.0394 and 1,246 people.
Strategy: Apply minimal headcount reductions in these sections. Their low risk and high people counts make
them a source of stability, so disruption should be kept small.
3. Moderate-Risk Sections with Higher People Count
Sections Identified: VENTURE (Mortgage) with a Risk Factor of 0.4825 and 131 people.
Strategy: Implement moderate headcount reductions here to balance risk management against stability, since
this section combines moderate risk with a higher people count.
4. Overall Approach
Focus on moderate reductions in areas like VENTURE (Mortgage) to manage both risk and the impact on people
count.
Apply minimal reductions to low-risk, high people count sections.
Make careful, limited reductions in high-risk, low people count sections to mitigate risk without
destabilizing broader sections.
Adding Headcount to the Loan Intents: Medical, Personal, and Debt
Consolidation
Insights:
1. High-Risk, Low People Count Sections
Sections Identified:
MEDICAL (OWN) with a Risk Factor of 3.4636% and 44 people.
MEDICAL (OTHER) with a Risk Factor of 3.0300% and 11 people.
Strategy: These sections show elevated risk and low people counts, so additions here should be restricted
and handled with tailored terms.
2. Moderate-Risk, Medium People Count Sections
Sections Identified:
DEBT CONSOLIDATION (OTHER) with a Risk Factor of 3.6763% and 8 people.
DEBT CONSOLIDATION (OWN) with a Risk Factor of 1.3446% and 28 people.
Strategy: Moderate additions can be considered for these segments, but they should be paired with enhanced
monitoring and adjusted loan terms, such as stricter repayment schedules or customized lending policies, to
manage potential risks effectively.
3. Low-Risk, High People Count Sections
Sections Identified:
MEDICAL (RENT) with a Risk Factor of 0.0250% and 1,712 people.
DEBT CONSOLIDATION (RENT) with a Risk Factor of 0.0262% and 1,533 people.
Strategy: These sections offer the best opportunity for safely increasing people counts because of their low
risk. Focus on adding individuals with average or moderate income profiles to maintain portfolio stability
while minimizing the likelihood of defaults.
4. Moderate-Risk, High People Count Sections
Sections Identified:
PERSONAL (MORTGAGE) with a Risk Factor of 0.2301% and 257 people.
DEBT CONSOLIDATION (MORTGAGE) with a Risk Factor of 0.0890% and 594 people.
Strategy: Carefully balance additions in these sections. The risk level is moderate, but the higher people
count warrants a cautious approach. Consider targeted promotions or reduced interest rates to attract
low-risk individuals while minimizing overall exposure.
5. Overall Approach
Focus on cautious, strategic additions to low-risk, high people count sections to leverage their stability.
Make moderate additions in moderate-risk, higher people count sections to maintain growth while controlling
risk.
Implement more restrictive or tailored approaches in high-risk, low people count sections to protect against
significant risk increases.
Male vs Female and Default Percentage
Insights:
The loan default percentages for males and females are relatively similar, indicating no significant
difference in their likelihood of default. Therefore, it is not necessary to differentiate between
genders when allocating loans. It is advisable to offer the same loan amounts to both males and
females, ensuring equitable lending practices.
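The gender comparison reduces to a single GROUP BY. A minimal sketch follows, again assuming that a default is recorded as previous_loan_defaults_on_file = 'Yes'. The table name loan_data and the sample rows are made up and do not reproduce the real percentages.

```python
import sqlite3

# Hypothetical table name and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loan_data (person_gender TEXT, previous_loan_defaults_on_file TEXT)"
)
conn.executemany(
    "INSERT INTO loan_data VALUES (?, ?)",
    [
        ("male", "Yes"), ("male", "No"), ("male", "Yes"), ("male", "No"),
        ("female", "Yes"), ("female", "No"),
    ],
)

# Default percentage per gender: defaulted applicants / all applicants in the group.
gender_sql = """
    SELECT
        person_gender,
        COUNT(*) AS people,
        ROUND(100.0 * SUM(CASE WHEN previous_loan_defaults_on_file = 'Yes'
                               THEN 1 ELSE 0 END) / COUNT(*), 2) AS default_pct
    FROM loan_data
    GROUP BY person_gender
    ORDER BY person_gender
"""
gender_rows = conn.execute(gender_sql).fetchall()
for row in gender_rows:
    print(row)
```

Reporting the people count next to each percentage makes it easier to judge whether a small gap between groups is meaningful.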
Credit History Length (cb_person_cred_hist_length) vs. Default Percentage
Insights:
The loan default percentage for individuals with a short credit history is 51.19%, the highest of any
category. The lowest default percentage, 40.96%, is observed for individuals with a long credit history.
It is therefore advisable to prioritize individuals with long and medium-length credit histories when
allocating loans, to minimize the risk of defaults.
Conclusions
Based on the analysis above, including Debt-to-Income (DTI) ratios and credit scores, the final strategic
conclusions for loan allocation and risk management are as follows:
1. High-Risk Loan Intents
Identified Categories: Venture, Home Improvement, and Education.
Conclusion: Venture loans present the highest risk factor, followed by Home Improvement and Education. These
categories require focused reduction efforts due to their elevated risk levels and significant average loan amounts
approved.
Recommendation: Strategic reductions should primarily target individuals with poor Debt-to-Income (DTI) ratios
(greater than 44%) and lower credit scores (less than 700) to mitigate default risks. Individuals with good DTIs (less
than 30%) or good credit scores (greater than 700) can be retained or closely monitored.
2. Low-Risk Loan Intents
Identified Categories: Personal, Debt Consolidation, and Medical.
Conclusion: These categories exhibit comparatively lower risk, making them favorable for expanding the pool of loan
recipients.
Recommendation: Focus on increasing individuals with good credit scores (above 700) and good or average DTIs
(less than 44%), especially in the Medical, Debt Consolidation, and Personal categories. This strategy will enhance
portfolio stability and reduce risk exposure.
3. High-Risk, Low People Count Sections
Identified Sections: VENTURE (Own) and VENTURE (Other).
Conclusion: Despite high risk factors, these sections have low people counts, meaning that reductions should be
carefully managed.
Recommendation: Limit reductions to individuals with high-risk profiles (poor DTIs and low credit scores), while
monitoring those with average DTIs (30-44%) and good credit scores (above 700) to mitigate risk without
destabilizing the overall portfolio.
4. Low-Risk, High People Count Sections
Identified Sections: HOME IMPROVEMENT (Rent) and EDUCATION (Rent).
Conclusion: These sections have lower risk factors and high people counts, suggesting minimal disruption is needed.
Recommendation: Apply minimal reductions and focus on maintaining stability by retaining individuals with good
DTIs and good credit scores.
5. Moderate-Risk Sections with Higher People Count
Identified Sections: VENTURE (Mortgage).
Conclusion: This section exhibits a moderate risk level combined with a higher people count.
Recommendation: Implement moderate reductions, targeting individuals with high DTIs (greater than 44%) and low
credit scores. Efforts should focus on balancing risk management and maintaining overall stability.
6. Low-Risk, High People Count Sections
Identified Sections: MEDICAL (Rent) and DEBT CONSOLIDATION (Rent).
Conclusion: These sections provide an opportunity for safely increasing the number of individuals.
Recommendation: Focus on adding individuals with good credit scores (above 700) and favorable DTIs (less than
30%) to ensure portfolio stability and minimize default risks.
7. Moderate-Risk, High People Count Sections
Identified Sections: PERSONAL (Mortgage) and DEBT CONSOLIDATION (Mortgage).
Conclusion: These sections exhibit moderate risk levels and high people counts, requiring a cautious approach to
management.
Recommendation: Carefully balance additions by targeting low-risk individuals with good DTIs and good credit
scores. Consider targeted promotions or reduced interest rates to attract favorable profiles and maintain overall
stability.
8. Gender-Based Loan Allocation
Conclusion: Loan default percentages for males and females are relatively similar, indicating no need to differentiate
loan allocation practices by gender.
Recommendation: Maintain equitable lending practices by offering similar loan amounts to both genders, ensuring
fairness and inclusivity.
9. Credit History-Based Allocation
Conclusion: Individuals with a short credit history exhibit a significantly higher default percentage, while those with long
or medium credit histories have lower default rates.
Recommendation: Prioritize loan allocations to individuals with long and medium credit histories to minimize
default risks and enhance portfolio performance.
Recommendation
Reduction Focus: Prioritize reducing individuals in high-risk categories and sections with
high DTIs (greater than 44%) and low credit scores (below 700). Retain and monitor
individuals with good DTIs (less than 30%) and good credit scores (above 700).
Reallocation Focus: Focus on increasing individuals with good credit scores and good or
average DTIs in low-risk sections to maintain stability and minimize risk exposure.
Monitoring and Adjustments: Implement enhanced monitoring of moderate-risk sections
with customized lending terms, such as stricter repayment schedules, to proactively
manage potential risks.
This strategy leverages the insights from DTI and credit score metrics to optimize loan
allocation, reduce defaults, and maximize overall portfolio stability.
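The DTI and credit score thresholds in this recommendation can be expressed as a CASE expression. The sketch below is illustrative only: the table name loan_data, the applicant_id column, the action labels, and the sample rows are all made up. It assumes loan_percent_income holds the DTI as a fraction (0.44 for 44%), and it places every applicant who matches neither rule in a 'review' bucket.

```python
import sqlite3

# Hypothetical table, id column, labels, and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loan_data (applicant_id INTEGER, loan_percent_income REAL, credit_score INTEGER)"
)
conn.executemany(
    "INSERT INTO loan_data VALUES (?, ?, ?)",
    [
        (1, 0.50, 620),  # high DTI and low score
        (2, 0.20, 760),  # good DTI and good score
        (3, 0.35, 710),  # average DTI, good score
        (4, 0.25, 650),  # good DTI, low score
    ],
)

# Thresholds from the recommendation: DTI > 44% with score < 700 is the
# reduction focus; DTI < 30% with score > 700 is retained.
profile_sql = """
    SELECT
        applicant_id,
        CASE
            WHEN loan_percent_income > 0.44 AND credit_score < 700 THEN 'reduce'
            WHEN loan_percent_income < 0.30 AND credit_score > 700 THEN 'retain'
            ELSE 'review'
        END AS action
    FROM loan_data
    ORDER BY applicant_id
"""
actions = conn.execute(profile_sql).fetchall()
for row in actions:
    print(row)
```

Joining this classification with the per-section risk factors would let the reduction and reallocation rules be applied section by section.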
Thank You