RST-MLP Method
PERSPEKTIF
Available online http://ojs.uma.ac.id/index.php/perspektif
1) Computer Science Department, the Federal University of Technology Akure, Ondo State, Nigeria
2) Software Engineering Department, the Federal University of Technology Akure, Ondo State, Nigeria
3) Information Technology Department, the Federal University of Technology Akure, Ondo State, Nigeria
Received: April 20, 2022; Reviewed: April 20, 2022; Accepted: May 26, 2022
Abstract
Financial organizations such as banks have experienced an increase in demand for loans from borrowers over the
years. These organizations are highly interested in knowing whether a borrower can pay back a requested loan.
Granting loans to defaulters can cripple the business; hence, these financial organizations are compelled to
evaluate the creditworthiness of clients using the credit history of borrowers. Credit scoring is a technique used
to predict the probability that a borrower will default. Several statistical and machine learning techniques have
been adopted over the years; however, machine learning techniques have been found to perform better than
statistical techniques because they solve the challenges faced by credit analysts by automating the processing and
extraction of knowledge from data. The objective of this work is to improve upon the Artificial Neural Network
machine learning technique by adopting a better feature extraction technique. The methodology adopted in this
research is to use Rough Set Theory (RST) for relevant and efficient feature selection and a Multi-Layer
Perceptron (MLP) Neural Network for classification. To test the models, the Australian and German credit
datasets were used on the Anaconda machine learning platform. The results obtained were compared with those of
other machine learning models, namely Support Vector Machine, Random Forest, Decision Tree, Logistic
Regression, Naive Bayes, K-Nearest Neighbour and Artificial Neural Network, using standard evaluation metrics
to ascertain performance on the two datasets. The results show that the proposed model outperforms all other
models on every metric considered. This research therefore shows that the model is well suited to credit scoring
and offers improved performance.
Keywords: Classification Techniques; Credit Scoring; Machine Learning Techniques; Rough Set Theory; Multi-Layer
Perceptron.
How to Cite: Ekong, R.E., Akintola, K.G. & Kuboye, B.M. (2022). Development of Credit Scoring Model for Borrowers
Using Machine Learning Techniques. PERSPEKTIF, 11 (3): 829-838
Ekong R.E., Akintola K.G. & Kuboye, B.M, Development of Credit Scoring Model for Borrowers Using
Machine Learning Techniques
PERSPEKTIF, 11(3) (2022): 829-838
A paper published in 2017 demonstrated the suitability of the Extreme Learning Machine (ELM) as a predictive model for credit scoring (Bequé and Lessmann, 2017). The results showed that ELM produced better results than the other models used in that research. However, only three metrics were used to evaluate model performance. In our research, six performance metrics were employed to investigate the suitability of our credit scoring model for prediction.

In research published in 2018, the authors investigated how commercial banks could predict loans using machine learning techniques. They developed a credit model using KNN classifiers, having found from prior research that no single model was perfect. The credit dataset was analysed and their classifier was implemented in R. A combination of KNN classifiers produced an accuracy of 75% (Arutjothi & Senthamarai, 2018). The limitation of their research is this low accuracy. Improved accuracy is necessary to minimize the credit risk of financial institutions and maximize profit.

In another 2018 publication, the authors carried out a prediction analysis of borrower default using machine learning and deep learning models. The paper highlighted the key components in approving credit to borrowers as feature selection algorithms, classification models, evaluation metrics and credit analysts. They observed that the tree models were more stable than the deep learning models (Addo, Guegan, & Hassani, 2018).

Zhang, Yang & Zhou (2018) also researched credit rating using three machine learning models for feature selection: C5.0, CHAID and CART. The authors found that one cause of low accuracy was the presence of redundant attributes in their credit dataset. They proposed a credit model to optimize the Decision Tree model. The result showed that C5.0 was the better option. The setback in their approach was that they used only accuracy for performance evaluation.

Furthermore, previous studies used the Australian and German credit datasets to analyse credit risk (Thanawala, 2019; Pandey, Jagadev, Mohapatra & Dehuri, 2017). These authors focused only on accuracy as the measure to ascertain which model did better. Although classification accuracy is widely used by many researchers because it is easy to understand and compute, it is not a reliable measure when dealing with an imbalanced dataset. In this research, the performance of the proposed model was evaluated using various metrics.

RESEARCH METHODS
The architectural diagram of the proposed credit scoring model is presented in Figure 1. The model is divided into three parts: data preparation, feature selection and classification. In the data preparation part, data was collected from an online repository, missing values were replaced with the corresponding mean or median of all instances, and the resulting samples were discretized using a Decision Tree model. In the feature selection part, attribute reduction was carried out using Rough Set Theory (RST), and the result was then divided into a training set and a test set. The classification part ascertains the creditworthiness of a borrower using a Multi-layer Perceptron (MLP) Neural Network.
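The data preparation part can be illustrated with a small standard-library sketch. It is not the authors' implementation: the helper names `impute` and `min_max`, the toy columns, and the choice of min-max scaling as the form of normalization to the range 0 to 1 are all assumptions made for illustration. Following the paper's description, the median is used for coded nominal attributes and the mean for numerical ones.

```python
from statistics import mean, median


def impute(column, nominal):
    """Replace None entries with the median (coded nominal) or the mean (numerical)."""
    observed = [v for v in column if v is not None]
    fill = median(observed) if nominal else mean(observed)
    return [fill if v is None else v for v in column]


def min_max(column):
    """Rescale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical toy columns, not taken from the credit datasets.
age = impute([20, None, 40, 60], nominal=False)          # mean of 20, 40, 60
job_code = impute([1, 3, None, 3, 2, 3], nominal=True)   # median of 1, 2, 3, 3, 3
age_scaled = min_max(age)
```

The Decision Tree based discretization mentioned above is left out of the sketch because the paper does not specify how the cut points were obtained.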
The datasets used in this paper are the German and Australian credit datasets, retrieved from the UCI Machine Learning Repository (https://archive.ics.uci.edu/), which have been used in most available scientific papers to predict credit scores. The major features of the datasets are listed in Table 1.
Once the data was collected, the datasets were queried to check for missing values. There were 37 instances in the Australian sample with one or more missing values. Each missing value was replaced with the corresponding median (for nominal attributes) or mean (for numerical attributes) of all instances. The numerical variables in the credit datasets were also transformed into nominal variables by discretization, while the nominal variables were represented by numerical codes to aid interpretability of the features. Normalization was then performed to bring the values into the range 0 to 1.

A large number of attributes in a dataset may be unnecessary. The computing time for classification will increase if these redundant attributes are not removed. Feature selection is therefore used to choose the crucial features that are required when creating models. In this research, RST was used to perform feature selection. Although there are numerous concepts in RST (Skowron & Dutta, 2018) and methods of performing RST (Becker, Radomska-Zalas & Ziemba, 2020), Indiscernibility, Reduct, and Core are the concepts used in our RST methodology.

Let $T = (U, A, C, D)$ be a decision table, where $U$ denotes the universe of discourse, $A$ denotes a collection of basic features, and $C, D \subset A$ denote two subsets of features known as conditional and decision attributes, respectively. Let $a \in A$ and $P \subseteq A$. The indiscernibility relation $IND(P)$ is expressed as (Qu et al., 2020):

$$IND(P) = \{(x, y) \in U \times U : \forall a \in P,\ a(x) = a(y)\} \tag{1}$$

where $(x, y)$ denotes a pair of cases and $a(x)$ signifies the value of attribute $a$ for case $x$. Thus, if $(x, y) \in IND(P)$, then $x$ and $y$ are indiscernible with regard to $P$.

In order to select the most relevant attributes, we used a selection strategy that sets a threshold value (Chowdhury & Turin, 2020), which is calculated from the priority values representing the significance of each feature.
$$\beta(c) \geq \frac{c_{max} - c_{min}}{3} + \ldots \tag{2}$$

where $c_{max}$ and $c_{min}$ are, respectively, the maximum and minimum values of the ranking vector. The goal of this technique is to provide a measure that is independent of the actual distribution of the priority values and to approximate the selection of features whose priority values are larger than the median value. The result is used to form a reduct, which is a minimal subset of attributes providing the same equivalence class structure as the original dataset. Multiple reducts may exist for a dataset. A set of features $R \subseteq C$ with this property is called a reduct of $C$. The set of all the attributes indispensable in $C$ is denoted by $CORE(C)$:

$$CORE(C) = \bigcap RED(C) \tag{3}$$

where $RED(C)$ is the set of all reducts of $C$.
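To make the notions of indiscernibility, reduct and core concrete, the following sketch enumerates them by brute force on a tiny, hypothetical table. It assumes that a reduct is a minimal subset of the conditional attributes that induces the same partition of the cases as the full conditional set; the function names and the toy data are illustrative only, and the exhaustive search is practical only for a handful of attributes.

```python
from itertools import combinations


def partition(rows, attributes):
    """Equivalence classes of row indices under IND(attributes), as in Equation (1)."""
    groups = {}
    for index, row in enumerate(rows):
        key = tuple(row[a] for a in attributes)
        groups.setdefault(key, set()).add(index)
    return {frozenset(g) for g in groups.values()}


def reducts(rows, conditional):
    """All minimal attribute subsets that induce the same partition as `conditional`."""
    target = partition(rows, conditional)
    found = []
    for size in range(1, len(conditional) + 1):
        for subset in combinations(conditional, size):
            if any(set(r) <= set(subset) for r in found):
                continue  # contains a smaller reduct, so it is not minimal
            if partition(rows, subset) == target:
                found.append(subset)
    return found


def core(rows, conditional):
    """Intersection of all reducts, as in Equation (3)."""
    return set.intersection(*(set(r) for r in reducts(rows, conditional)))


# Hypothetical discretized cases; attributes "b" and "c" carry the same information.
rows = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 1},
]
all_reducts = reducts(rows, ["a", "b", "c"])
core_attributes = core(rows, ["a", "b", "c"])
```

In this toy table either "b" or "c" can be dropped but "a" cannot, so "a" is the only attribute shared by every reduct.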
Data Split. Before classification, the pre-processed dataset was divided into two sections, a training dataset and a test dataset, in a 70:30 ratio. The training dataset was used to train the classifier, whereas the test dataset was used to assess the classification accuracy.

Model Selection. The multi-layer perceptron (MLP) model trained with backpropagation was employed in this study. Let $x = (x_1, x_2, \ldots, x_n)$ be the set of inputs and $h = (h_1, \ldots, h_m)$ the set of nodes in the hidden layer. There can be more than one hidden layer in an MLP (Bekesiene, Smaliukiene, & Vaicaitiene, 2021). In this research, two hidden layers were used, and the derivation is expressed in Equations (4) to (6).

$$n_{k,t} = \omega_{k,0} + \sum_{i=0}^{i} \omega_{k,i}\, x_{i,t} \tag{4}$$

$$N_{k,t} = \frac{1}{1 + e^{-n_{k,t}}} \tag{5}$$

Equations (4) and (5) show how the nodes are calculated from the set of inputs $x$ with their corresponding arbitrary weights; the result $n_{k,t}$ is passed into the activation function. The activation function used is the sigmoid function.
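The node computation in Equations (4) and (5) can be sketched as a plain forward pass. This is an illustrative sketch rather than the trained model: the weights, biases and layer sizes below are hypothetical, the bias is added once per node and the weighted sum runs over the inputs only, and no backpropagation step is shown.

```python
import math


def sigmoid(n):
    """Logistic activation, as in Equation (5)."""
    return 1.0 / (1.0 + math.exp(-n))


def layer_forward(inputs, weights, biases):
    """Compute one layer: weights[k][i] multiplies inputs[i], biases[k] is the node bias."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        # Weighted sum of the inputs plus the bias, in the spirit of Equation (4).
        n_k = bias + sum(w * x for w, x in zip(node_weights, inputs))
        outputs.append(sigmoid(n_k))
    return outputs


# Hypothetical normalized inputs and arbitrary weights for two stacked hidden layers.
x = [0.2, 0.8]
hidden1 = layer_forward(x, weights=[[0.5, -0.5], [1.0, 1.0]], biases=[0.0, -1.0])
hidden2 = layer_forward(hidden1, weights=[[0.3, 0.7]], biases=[0.1])
```

Stacking two calls mirrors the use of two hidden layers; every activation lies strictly between 0 and 1 because of the sigmoid.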
Equation (6) is the result of the calculation over all the nodes in the input and hidden layers, where $\gamma_0$ is the bias.

$$y_t = \gamma_0 + \sum_{l=1}^{l} \gamma_l\, P_{k,i} + \sum_{i=0}^{i} \beta\, x_{i,t} \tag{6}$$

Model Performance Evaluation. Accuracy, Precision, F1 Score, Recall, Type I error and Type II error were among the performance evaluation measures employed in this study. Each criterion has its strengths and weaknesses. In this study, it was found that using a combination of these indicators, rather than a single measure, is preferable for evaluating the performance of credit scoring models. All of these performance evaluation criteria are derived from the confusion matrix, a square table that summarizes the results of a classification model on a test dataset. The confusion matrix shows what a classification model predicts correctly and what it predicts incorrectly.
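As an illustration of how these measures follow from the confusion matrix, the sketch below computes them from the four cell counts. The labels are hypothetical, and the sketch assumes the common convention that the Type I error rate is the false positive rate and the Type II error rate is the false negative rate; the paper does not state which class it treats as positive.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(tp, fp, tn, fn):
    """Derive the six measures from the confusion matrix cells."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "type_i_error": fp / (fp + tn),    # assumed: false positive rate
        "type_ii_error": fn / (fn + tp),   # assumed: false negative rate
    }


# Hypothetical test labels, not results from the paper.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
counts = confusion_counts(actual, predicted)
scores = evaluate(*counts)
```

Reporting the two error rates alongside accuracy matters on imbalanced data, where a model can reach high accuracy while misclassifying most of the minority class.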
RESULT & DISCUSSION
A total of 12 out of 20 conditional attributes met the criteria when feature selection was performed on the German credit dataset. These features are: existing checking account status; duration in months; credit history; savings accounts and bonds; installment rate as a proportion of disposable income; job; other installment plans; other guarantors/debtors; property; current residential address; age in years; and number of existing credits at this bank. For the Australian dataset, 7 out of 14 attributes were selected: A2, A3, A5, A6, A7, A10 and A14.

Figure 2 shows the confusion matrix of the model on the Australian dataset. The model obtained 90% accuracy, 93% precision, 89% recall and a 91% F1 score. Figure 3 shows the confusion matrix of the model on the German dataset. The model obtained 87% accuracy, 84% precision, 91% recall and an 88% F1 score.
To ascertain how well our model performed, it was benchmarked against seven other models on the Australian and German credit datasets: K-Nearest Neighbour (KNN), Logistic Regression (LR), Naive Bayes (NB), Random Forest (RF), Multi-Layer Perceptron (MLP) Neural Network, Decision Tree (DST) and Support Vector Machine (SVM). The proposed model performed best on all six performance metrics.

The proposed model has an accuracy of 90% and 87% on the Australian and German credit datasets, respectively. This is an improvement of at least 4% in accuracy, precision, F1 score and recall on each dataset when compared with the other machine learning techniques. Furthermore, MLP+RST was compared with the results obtained in previous research by Aithal & Jathanna (2019), and it outperformed the other models, as shown in Table 4. Our feature extraction methodology produced a higher accuracy when applied to a larger dataset than the novel credit scoring model of Zhang et al. (2018). It is important to note that the credit history of the borrowers was not considered an important feature. This might be misleading, because it is important to study a borrower's history before such an applicant is granted a loan.

The error results likewise reveal that the Type I and Type II errors of the proposed model are the lowest among all the models used. Table 5 and Table 6 show the performance of the proposed model on the Australian and German datasets, respectively, compared with some existing models.
Conscripts. Mathematics, 9(6), 626. https://doi.org/10.3390/math9060626
Bequé, A., & Lessmann, S. (2017). Extreme learning machines for credit scoring: An empirical evaluation. Expert Systems with Applications, 86, 42–53. https://doi.org/10.1016/j.eswa.2017.05.050
Bougard, D. A. (2017). Agricultural credit models: identifying high risk applications. Scholar.ufs.ac.za. https://scholar.ufs.ac.za/handle/11660/6472?show=full
Chowdhury, M. Z. I., & Turin, T. C. (2020). Variable selection strategies and its importance in clinical prediction modelling. Family Medicine and Community Health, 8(1), e000262. https://doi.org/10.1136/fmch-2019-000262
Hilscher, J., & Wilson, M. (2017). Credit Ratings and Credit Risk: Is One Measure Enough? Management Science, 63(10), 3414–3437. https://doi.org/10.1287/mnsc.2016.2514
İlter, D., Kocadağlı, O., & Ravishanker, N. (2019). Feature Selection Approaches for Machine Learning Classifiers on Yearly Credit Scoring Data. Recent Advances in Data Science and Business Analytics, 200–204.
Moradi, S., & Mokhatab Rafiei, F. (2019). A dynamic credit risk assessment model with data mining techniques: evidence from Iranian banks. Financial Innovation, 5(1). https://doi.org/10.1186/s40854-019-0121-9
Nyoni, E. E. T., & Matshisela, N. (2018). Credit scoring using machine learning algorithms. Zimbabwe Journal of Science & Technology, e-ISSN 2409-0360, 13, 26–34.
Pabuccu, H., & Ayan, T. Y. (2017). The Development of an Alternative Method for the Sovereign Credit Rating System Based on Adaptive Neuro-Fuzzy Inference System. American Journal of Operations Research, 7, 41–55.
Pandey, T. N., Jagadev, A. K., Mohapatra, S. K., & Dehuri, S. (2017). Credit risk analysis using machine learning classifiers. 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS). https://doi.org/10.1109/icecds.2017.8389769
Qu, J., Bai, X., Gu, J., Taghizadeh-Hesary, F., & Lin, J. (2020). Assessment of Rough Set Theory in Relation to Risks Regarding Hydraulic Engineering Investment Decisions. Mathematics, 8(8), 1308. https://doi.org/10.3390/math8081308
Rudra Kumar, M., & Kumar Gunjan, V. (2020). Review of Machine Learning models for Credit Scoring Analysis. Ingeniería Solidaria, 16(1). https://doi.org/10.16925/2357-6014.2020.01.11
Sharifi, P., Jain, V., Arab Poshtkohi, M., Seyyedi, E., & Aghapour, V. (2021). Banks Credit Risk Prediction with Optimized ANN Based on Improved Owl Search Algorithm. Mathematical Problems in Engineering, 2021, 1–10. https://doi.org/10.1155/2021/8458501
Simplilearn.com (2021). An Overview on Multilayer Perceptron (MLP). [online] Available at: https://www.simplilearn.com/tutorials/deep-learning-tutorial/multilayer-perceptron.
Siregar, S. P., & Wanto, A. (2017). Analysis of Artificial Neural Network Accuracy Using Backpropagation Algorithm In Predicting Process (Forecasting). IJISTECH (International Journal of Information System & Technology), 1(1), 34. https://doi.org/10.30645/ijistech.v1i1.4
Skowron, A., & Dutta, S. (2018). Rough sets: past, present, and future. Natural Computing, 17(4), 855–876. https://doi.org/10.1007/s11047-018-9700-3
Thanawala, D. D. (2019). Credit Risk Analysis Using Machine Learning and Neural Networks. Open Access Master's Report, Michigan Technological University. https://doi.org/10.37099/mtu.dc.etdr/856
Wu, Y., & Pan, Y. (2021). Application Analysis of Credit Scoring of Financial Institutions Based on Machine Learning Model. Complexity, 2021, 1–12. https://doi.org/10.1155/2021/9222617
Zhang, X., Yang, Y., & Zhou, Z. (2018). A novel credit scoring model based on optimized random forest. 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC). https://doi.org/10.1109/ccwc.2018.8301707