Loan Pre Research Paper

This document discusses using decision trees, naive bayes, and PCA (principal component analysis) models to predict loan approvals. It implements a decision tree model to classify loan approvals based on home, personal, and other attributes. It also uses naive bayes for loan prediction, though it has lower accuracy than decision trees. PCA is used to remove dimensions from the naive bayes model to improve its accuracy. The document reviews related works on evaluating data mining methods for loan prediction and credit score risk management. It discusses the research methodology for using these algorithms to help banks determine which loan applications to approve or reject.

Uploaded by

Vaseem Akram


Loan Prediction Using Decision Trees, Naive Bayes and PCA

Vaseem Akram, Computer Science, VIT, Vellore, skakram100@gmail.com
Purushottam, Computer Science, VIT, Vellore, purushottam1@gmail.com
Sanjay Kumar, Computer Science, VIT, Vellore, sanjay@gmail.com

Abstract: Loan prediction is important for financial companies that need to calculate the credit score of customers who apply for a loan. The results produced by the model are used to decide whether a customer's loan can be approved; without them it is difficult for the institution to know whether he or she should get the loan.

ACKNOWLEDGEMENT - We would like to thank Dr. Santhi K. for giving us the opportunity to do our project under her guidance. We are extremely thankful to her for giving us valuable inputs and ideas, resolving our doubts whenever we had them, giving us timely feedback after the completion of every milestone, and helping us throughout the project and the semester.

I. INTRODUCTION

As large volumes of data are now available, data mining models and methods have become more useful, and we can acquire knowledge from them. These methods are used in many areas, such as retail, communication, biological data analysis, intrusion detection and many others. They are also helpful in the banking sector, where they help it keep pace with other sectors. Here we implement a model for the banking sector for loan prediction. Loan prediction is important for financial companies that need to calculate the credit score of customers who apply for a loan. The results produced by the model are used to decide whether a customer's loan can be approved; without them it is difficult for the institution to know whether he or she should get the loan.

With the huge growth of data in the banking sector, analyzing the data and using it to gain the required knowledge has become a task beyond human ability. Data mining methods have to be applied to uncover business problems by finding the different patterns, groupings and associations stored in the database. Using these techniques, banks can obtain more accurate assessments of users, their credit scores and their likelihood of getting a loan. Growth and competition have made the banking sector keep its focus on customer management and on fraud (criminal deception).

To guard against this kind of fraud, data mining techniques have been implemented. They use the previous records of users to help banks estimate how many customers they can trust. Banks can thereby keep fraudulent customers away and create new offers for users so that they can trust the bank. As a result, the bank gets customers who are truly credible. In these areas data mining techniques are being used effectively.

II. IMPLEMENTATION OF PAPER 1 AND PAPER 2

In this section we implement a decision tree based on the technique described in the paper. With it we can find the number of customers who are eligible to take the loan.

A. Decision Tree

The Decision Tree algorithm is used in the proposed model. A tree consists of a parent node, child nodes and branches. Each parent node represents a test on an attribute of the given data set, each branch denotes an outcome of that test, and each child node represents either a further test or, at a leaf, a class label. The topmost node is the parent (root) node.

The tree we developed is built from the given data set. The figure shows three sections: home, personal and other. It shows whether a person can get a loan of that particular type or not.

B. Naïve Bayes

Naïve Bayes is based on Bayes' theorem and assumes that each attribute is independent of the others given the class. The model is simple to build, even for very large data sets. However, the accuracy we obtained is low compared with that of the decision tree algorithm. In particular, it does not produce accurate output for data with many dimensions. Although its accuracy is low, it is still used because it is simple to apply and still gives useful results.
According to Mileris, every evaluation should start with an initial (prior) probability estimate for the events of interest. A sample of data is then taken to obtain more information about the events. With this additional information the prior values are updated to give revised estimates, referred to as posterior probabilities.
Anderson et al. described Bayes' theorem as a means of making these probability calculations.
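The prior-to-posterior update described above can be illustrated with a minimal sketch. The function name `bayes_update` and all the numbers are hypothetical and chosen only for illustration; they are not taken from the paper's data.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem for a binary event: revise a prior after one observation."""
    # Total probability of seeing the evidence at all
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    return likelihood_if_true * prior / evidence

# Assumed numbers: 10% of applicants default (prior). A late-payment flag
# appears for 60% of defaulters and for 20% of non-defaulters.
p_default_given_flag = bayes_update(prior=0.10,
                                    likelihood_if_true=0.60,
                                    likelihood_if_false=0.20)
print(round(p_default_given_flag, 4))
```

Under these assumed numbers, observing the flag raises the estimated default probability from the 10% prior to a 25% posterior.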

C. PCA (Principal Component Analysis)

PCA is a statistical technique that transforms the attributes of a data set into uncorrelated components, ordered by how much of the variance of the data set each one explains.

Principal Component Analysis is based on the singular value decomposition (SVD) methodology. Its results are related to the variance of the given values, and the components it produces are often a more useful representation than the original attributes. It is used to reduce the number of dimensions. In this project we used PCA to reduce the dimensions of the input to Naïve Bayes and obtain more accurate results.
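As an illustration of the dimension reduction described above, the following is a minimal standard-library sketch that finds the first principal component and projects each record onto it. It uses power iteration on the covariance matrix rather than an SVD routine, which is a substitution made here to avoid external libraries. The function names and the toy records (income, loan amount, a binary flag) are assumptions, not the paper's data.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction, explained-variance ratio) of the top component."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d)
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    total_variance = sum(cov[a][a] for a in range(d))
    return mean, v, eigenvalue / total_variance

def project(row, mean, direction):
    """Coordinate of one record along the principal direction."""
    return sum((x - m) * u for x, m, u in zip(row, mean, direction))

# Hypothetical records: (income, loan amount, flag)
rows = [(30, 10, 1), (45, 16, 0), (60, 19, 1), (75, 26, 0), (90, 29, 1)]
mean, direction, ratio = first_principal_component(rows)
reduced = [project(r, mean, direction) for r in rows]  # three columns reduced to one
```

In this toy data income and loan amount move together, so a single component captures almost all of the variance. Real attributes on different scales would normally be standardized first.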
III. RELATED WORKS

Our team has reviewed many articles on the evaluation of different data mining methods and models, and those which we use are explained here. Loan prediction is important for financial companies that need to calculate the credit score of customers who apply for a loan. The results produced by the model are used to decide whether a customer's loan can be approved; without them it is difficult for the institution to know whether he or she should get the loan.

To remove the point values and obtain exact credit score values, we use PCA (Principal Component Analysis). We also use a Decision Tree to find the number of customers who can get the loan.

IV. METHODOLOGY

Using data mining to estimate defaults accurately is essential in the banking sector: it reduces risk in credit score management and helps separate trusted users from fraudulent ones. Among the comparable models that give results of moderate accuracy, a Bayesian classifier is used to predict the probability. Antonakis and Sfakianakis demonstrated the prediction of a case by grouping data. Rosner explained posterior probabilities on the basis of the information he found.

A. Research Methodology

Banks now operate in a very competitive world, so credit score risk management must be checked carefully, as it plays a major role in the safety of the institution. When a user comes to the bank to apply for a loan of any type, the bank should first calculate the credit score of that person. The same process is followed for every user who requests a loan. Because this is so important for the banking sector, many evaluation techniques are used to help banks decide which applications to accept and which to reject. Here we briefly discuss the algorithms used in the project: decision tree, Naïve Bayes and principal component analysis.

Naïve Bayes is a model based on Bayes' rule. It assumes that the attributes X1, ..., Xn are all independent of one another given Y. This assumption simplifies the representation of P(X|Y) and its estimation from the training data.

The Bayesian network (BN) represents the joint probability distribution (JPD) over the set of inputs Xi. In the present model, the Naïve Bayes classifier is built on equation (1):

$$P(Y = y_k \mid X_1, \ldots, X_n) = \frac{P(Y = y_k) \prod_i P(X_i \mid Y = y_k)}{\sum_j P(Y = y_j) \prod_i P(X_i \mid Y = y_j)} \tag{1}$$

Given a new instance Xnew = (X1, ..., Xn), equation (1) gives the probability that Y takes any given value, using the observed values of Xnew together with P(Y) and P(Xi|Y) estimated from the input data. If we are interested only in the most probable value of Y, the Bayes classification rule is given by equation (2):

$$Y \leftarrow \arg\max_{y_k} P(Y = y_k) \prod_i P(X_i \mid Y = y_k) \tag{2}$$
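The classification rule described above can be sketched in a few lines for categorical attributes. This is a minimal illustration under stated assumptions: the function names, the two attributes (employment type, credit history) and the eight toy records are hypothetical, and the probabilities are plain counts with no smoothing.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Estimate P(Y) and P(Xi | Y) by counting (maximum likelihood, no smoothing)."""
    class_counts = Counter(labels)
    priors = {y: c / len(labels) for y, c in class_counts.items()}
    counts = defaultdict(Counter)  # counts[(i, y)][value] = rows of class y with Xi == value
    for row, y in zip(rows, labels):
        for i, value in enumerate(row):
            counts[(i, y)][value] += 1
    likelihoods = {key: {v: c / class_counts[key[1]] for v, c in counter.items()}
                   for key, counter in counts.items()}
    return priors, likelihoods

def posterior(x_new, priors, likelihoods):
    """P(Y = y | x_new) for every class y, normalized over the classes."""
    scores = {}
    for y, prior in priors.items():
        score = prior
        for i, value in enumerate(x_new):
            score *= likelihoods.get((i, y), {}).get(value, 0.0)
        scores[y] = score
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()} if total else scores

def classify(x_new, priors, likelihoods):
    """Return the most probable class for x_new."""
    post = posterior(x_new, priors, likelihoods)
    return max(post, key=post.get)

# Hypothetical training records: (employment, credit history) -> decision
rows = [("salaried", "good"), ("salaried", "good"), ("self", "good"), ("salaried", "bad"),
        ("self", "bad"), ("self", "bad"), ("salaried", "bad"), ("self", "good")]
labels = ["approve"] * 4 + ["reject"] * 4
priors, likelihoods = train(rows, labels)
print(posterior(("salaried", "good"), priors, likelihoods))
print(classify(("self", "bad"), priors, likelihoods))
```

With this toy data a salaried applicant with a good credit history gets a posterior of about 0.9 for "approve". An attribute value never seen with a class drives that class's score to zero, which is why practical implementations usually add smoothing.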
However, Mitchell (2010) noted that when the Xi are continuous, another way must be chosen to represent the distribution P(Xi|Y). A common approach is to assume that, for every value yk of Y, the distribution of Xi is Gaussian and is described by a mean and a standard deviation (SD) specific to Xi and yk. To obtain such a naïve Bayes model, the mean and SD of each of these Gaussians must be estimated:

$$\mu_{ik} = E[X_i \mid Y = y_k], \qquad \sigma_{ik}^2 = E[(X_i - \mu_{ik})^2 \mid Y = y_k]$$

These are estimated for every input Xi and each possible value yk of Y. Note that there are 2nK of these parameters (a mean and an SD for each of the n inputs and each of the K values of Y), all of which must be estimated independently. Accordingly, we also have to estimate the priors on Y:

$$\pi_k = P(Y = y_k)$$

The model explained above is the Gaussian naïve Bayes model, which treats X as generated by a combination of class-conditional (CC) Gaussians, that is, Gaussians that depend on the class variable Y. Additionally, the naïve Bayes assumption imposes the further constraint that, within each of these components, the input values Xi do not depend on one another.

V. CONCLUSION

We conclude that decision trees are considerably more accurate than naïve Bayes, as the difference in accuracy shows. Naïve Bayes can, however, be improved by applying PCA to reduce the number of dimensions.

VI. REFERENCES

[1] Abdelmoula, Aida Krichene. "Bank credit risk analysis with k-nearest-neighbor classifier: Case of Tunisian banks." Accounting and Management Information Systems 14.1 (2015): 79.
[2] Arutjothi, G., Dr. C. Senthamarai. "Comparison of Feature Selection Methods for Credit Risk Assessment", International Journal of Computer Science, Volume 5, Issue I, No 5, 2017.
[3] Arutjothi, G., Dr. C. Senthamarai. "Credit Risk Evaluation using Hybrid Feature Selection Method." Software Engineering and Technology 9.2 (2017): 23-26.
[4] Attig, Anja, and Petra Perner. "The Problem of Normalization and a Normalized Similarity Measure by Online Data." Tran. CBR 4.1 (2011): 3-17.
[5] Babu, Ram, and A. Rama Satish. "Improved of K-Nearest Neighbor Techniques in Credit Scoring." International Journal For Development of Computer Science & Technology I (2013).
[6] Bach, Mirjana Pejic, et al. "Selection of Variables for Credit Risk Data Mining Models: Preliminary research." MIPRO 2017, 40th Jubilee International Convention. 2017.
[7] Byanjankar, Ajay, Markku Heikkila, and Jozsef Mezei. "Predicting credit risk in peer-to-peer lending: A neural network approach." Computational Intelligence, 2015 IEEE Symposium Series on. IEEE, 2015.
[8] Devi, CR Durga, and R. Manicka Chezian. "A relative evaluation of the performance of ensemble learning in credit scoring." Advances in Computer Applications (ICACA), IEEE International Conference on. IEEE, 2016.
[9] G. Arutjothi, Dr. C. Senthamarai, "Effective Analysis of Financial Data using Knowledge Discovery Database", International Journal of Computational Intelligence and Informatics, Vol. 6: No. 2, September 2016.
[10] Goel, Dr Himani, and Gurbhej Singh. "Evaluation of Expectation Maximization based Clustering Approach for Reusability Prediction of Function based Software Systems." International Journal of Computer Applications (0975-8887) Volume (2010).
[11] Abid, F. and Zouari, A. (2000), "Financial distress prediction using neural networks".
[12] Abramowicz, W., Nowak, M. and Sztykiel, J. (2003), Bayesian Networks as a Decision Support Tool in Credit Scoring Domain, Idea Group Publishing.
[13] Altman, E.I. (1968), "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy", Journal of Finance, Vol. 23 No. 4, pp. 589-609.
[14] Anderson, D.R., Sweeney, D.J., Freeman, J., Williams, T.A. and Shoesmith, E. (2007), Statistics for Business and Economics, Thomson Learning EMEA, London.
[15] Antonakis, A.C. and Sfakianakis, M.E. (2009), "Assessing naïve Bayes as a method for screening credit applicants", Journal of Applied Statistics, Vol. 36 No. 5, pp. 537-545.
https://www.coursera.org/account/accomplishments/records/EGMMNJQVV9MX
