
Credit Scoring - Neural Networks

This document presents a study comparing the performance of credit scoring models using neural networks versus traditional statistical techniques for microfinance institutions in Peru. The study builds several credit scoring models using multilayer perceptron neural networks and benchmarks them against models using linear discriminant analysis, quadratic discriminant analysis, and logistic regression techniques. Based on a sample of almost 5500 borrowers, the results show that the neural network models outperform the other three classic techniques both in terms of the area under the receiver-operating characteristic curve and misclassification costs, demonstrating their potential to improve the efficiency of microfinance institutions.

Expert Systems with Applications 40 (2013) 356–364


Credit scoring models for the microfinance industry using neural networks:
Evidence from Peru
Antonio Blanco a,*, Rafael Pino-Mejías b, Juan Lara c, Salvador Rayo c
a Department of Financial Economics and Operations Management, Faculty of Economics and Business Studies, University of Seville, Avda. Ramon y Cajal, 1, 41018 Seville, Spain
b Department of Statistics and Operational Research, Faculty of Mathematics, University of Seville, Avda. Reina Mercedes, s/n 41012 Seville, Spain
c Department of Financial Economics and Accounting, Faculty of Economics and Business Studies, University of Granada, Campus Cartuja, s/n 18071 Granada, Spain

Article info

Keywords: Microfinance institutions; Classification rules; Multilayer perceptron; Linear discriminant analysis; Quadratic discriminant analysis; Logistic regression

Abstract

Credit scoring systems are currently in common use by numerous financial institutions worldwide. However, credit scoring in the microfinance industry is a relatively recent application, and no model which employs a non-parametric statistical technique has yet, to the best of our knowledge, been published. This lack is surprising since the implementation of credit scoring should contribute towards the efficiency of microfinance institutions, thereby improving their competitiveness in an increasingly constrained environment. This paper builds several non-parametric credit scoring models based on the multilayer perceptron approach (MLP) and benchmarks their performance against other models which employ the traditional linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and logistic regression (LR) techniques. Based on a sample of almost 5500 borrowers from a Peruvian microfinance institution, the results reveal that neural network models outperform the other three classic techniques both in terms of area under the receiver-operating characteristic curve (AUC) and in terms of misclassification costs.

© 2012 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +34 954 559 875; fax: +34 954 557 570. E-mail address: aj_blanco@us.es (A. Blanco).
http://dx.doi.org/10.1016/j.eswa.2012.07.051

1. Introduction

Over the last decade, the microfinance sector [1] has grown dramatically, and is currently considered a booming industry. In the period 1998–2008, the number of microfinance institutions (hereinafter, MFIs) grew by 474%, and the number of customers increased by 1048%. Attracted by this rapid growth, a large number of international commercial banks have started operating in the microfinance sector, viewing it as a potential source of profitable investment. This injection of interest has increased the competition between the players in this industry, and has negatively affected the MFIs. The MFIs therefore need to increase their efficiency in all their processes, minimize their costs, and control their credit risk if they want to survive in the long term. One way for the MFIs to become more efficient in order to compete with the commercial banks is the implementation of automatic credit scoring [2] systems to evaluate their credit applicants, since credit scoring reduces the cost of credit analysis, improves cash flow, enables faster credit decisions, reduces losses, and also results in the closer monitoring of existing accounts and the prioritization of repayment collection. To this end, Rhyne and Christen (1999) suggest that credit scoring is one of the most important uses of technology that may affect microfinance, and Schreiner (2004) affirms that experiments carried out in Bolivia and Colombia show that the implementation of credit scoring improves the judgment of credit risk and thus cuts the costs of MFIs by more than $75,000 per year. Nevertheless, and in contrast to the concentration of research on financial institutions, the development of credit scoring models in the microfinance sector has only undergone minor advances. Furthermore, those models in existence are based on traditional parametric statistical techniques, mainly linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and logistic regression (LR), despite the overwhelming evidence found in numerous studies which indicates that non-parametric methodologies usually outperform these classic statistical models (for example, see Lee & Chen, 2005; West, 2000). That is, to the best of the authors' knowledge, in the existing literature no credit scoring model designed for the microfinance industry applies a non-parametric methodology, and therefore the microfinance industry has not yet benefited from the advantages of non-parametric techniques to improve the performance of credit scoring models, and hence MFIs are failing to compete on equal terms with their new competitors, the international commercial banks.

[1] Microfinance institutions (hereinafter, MFIs) operate in the microfinance sector, offering savings services and small loans (namely microcredits) to those sectors of the population with the greatest problems of access to financial resources. Therefore, the MFIs exercise relevant social work since they financially support the poorest people, who, by creating a microenterprise, can escape the socioeconomic situation of exclusion in which they find themselves. For this reason, the goals and management criteria of many MFIs lay less emphasis on business components and greater emphasis on social components than those used by their new competitors (international commercial banks).

[2] The objective of credit scoring models is to assign credit applicants to one of two groups: either to a 'good credit' group that is likely to repay the financial obligation or a 'bad credit' group that should be denied credit because of a high likelihood of defaulting on the financial obligation (Hand & Henley, 1997).

Of the few credit scoring models developed for MFIs, all have used parametric methodologies, particularly LDA and LR (Kleimeier & Dinh, 2007; Rayo, Lara, & Camino, 2010; Reinke, 1998; Sharma & Zeller, 1997; Viganò, 1993; Vogelgesang, 2003; Zeller, 1998). However, the strict assumptions (linearity, normality and independence among predictor variables) of these traditional statistical models, together with the pre-existing functional form relating response variables to predictor variables, limit their application in the real world. Several authors (for example, Karels & Prakash, 1987; Reichert, Cho, & Wagner, 1983) point out that two basic assumptions of LDA are often violated when applied to credit scoring problems: (a) the independent variables included in the model are multivariate normally distributed, and (b) the group dispersion matrices (or variance–covariance matrices) are equal across the failing and the non-failing groups (for a detailed analysis of the problems in applying discriminant analysis in credit scoring models, see Eisenbeis, 1978). In the cases where the covariance matrices of the two populations are unequal, theoretically, QDA should be adopted, although LDA is reported to be a more robust and precise technique (Dillon & Goldstein, 1984). In the same way as LDA, LR is also optimal under the assumption of multivariate normal distributions with equal covariance matrices, and LR also remains optimal in a wider variety of situations. However, logistic regression requires larger data sets to obtain stable results, interactions between predictor variables must be formulated, and complex non-linear relations between the dependent and independent variables could be incorporated through appropriate but not evident transformations. For these reasons, in recent years, non-parametric statistical models, such as the k-nearest neighbor algorithm (Henley & Hand, 1996), support vector machines (Vapnik, 1998), decision tree models (Davis, Edelman, & Gammerman, 1992), and neural network models (Patuwo, Michael, & Ming, 1993), have been successfully applied to credit scoring problems. Of these, artificial neural networks (ANNs) constitute one of the most powerful tools for pattern classification due to their non-linear and non-parametric adaptive-learning properties. Many studies have compared ANNs with other traditional classification techniques in the field of credit scoring, reporting that the default prediction accuracies of ANNs are better than those of classic LDA and LR (Arminger, Enache, & Bonne, 1997; Desai, Conway, Crook, & Overstreet, 1997; Desai, Crook, & Overstreet, 1996; Hand & Henley, 1997; Lee & Chen, 2005; Lee, Chiu, Lu, & Chen, 2002; Malhotra & Malhotra, 2002; Markham & Ragsdale, 1995; Patuwo et al., 1993; Piramuthu, 1999; Srinivasan & Ruparel, 1990; West, 2000). However, despite yielding satisfactory results, ANNs also feature certain disadvantages, such as their black-box nature and the long training process involved in the design of the optimal network topology (Chung & Gray, 1999).

The main goal of this paper is therefore to develop a credit scoring model specially designed for the microfinance industry by using multilayer perceptron neural networks (hereinafter, MLP). Moreover, we also compare the performance of MLP models against the three parametric techniques most widely used: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and logistic regression (LR). Based on a large sample which contains financial and non-financial variables of almost 5500 borrowers from a Peruvian MFI, seventeen credit scoring models are created, of which fourteen are MLP-based models.

The remainder of our paper proceeds as follows. In Section 2, details of our data set are provided, and a detailed examination of the variables available is undertaken in order to predict the default. In Section 3, several credit scoring models specifically designed for MFIs are developed. To this end, various methodologies are employed: Fisher discriminant analysis, logistic regression, and multilayer perceptron. In Section 4, the results of the different models are shown and compared, and an extensive discussion of the results is carried out. Finally, Section 5 provides the main conclusion of this study and analyzes future research lines.

2. Data and variables

2.1. The data set

We use a data set of microcredits from a Peruvian Microfinance Institution (Edpyme Proempresa). Our dataset contains customer information during the period 2003–2008 related to: (a) personal characteristics (marital status, gender, etc.); (b) economic and financial ratios of their microenterprise; (c) characteristics of the current financial operation (type of interest, amount, etc.); (d) variables related to the macroeconomic context; and (e) any delays in the payment of a microcredit fee. After eliminating missing and abnormal cases, 5451 cases remain. From among these, 2673 (49.03%) are default cases, and 2778 (50.97%) are not. In line with other studies (for example, Schreiner, 2004), a microcredit presenting a delay in repayment of at least fifteen days is defined as a default microcredit. To perform an appropriate comparison of the classification models (LDA, QDA, LR, and MLP), our final data set is randomly split into two disjoint sub-sets: a training set of 75% and a test set of 25%. The test sample contains a total of 1363 cases (51.80% failed and 48.20% non-failed). The configuration of parameters of each model is selected through a 10-fold cross-validation procedure, as described in Sections 3.1–3.3. One advantage of cross-validation is that the credit scoring model is developed with a large proportion of the available data (75% in this case).

2.2. Description of input variables

Table 1 shows the input variables used in this study [3]. They provide the various characteristics of borrowers, lenders, and loans. Numerous qualitative variables are considered in our study, since: (a) Schreiner (2004) suggests that the input variables of credit scoring for the microfinance sector must be more qualitative and informal than those considered by traditional banks; and (b) recent literature concludes that the inclusion of qualitative variables improves the prediction power of models. Moreover, since the default of borrowers has a close relationship with the general economic situation, variables linked to the macroeconomic context are also considered as input variables. With respect to the dependent variable, default of the microcredit, this takes a value of 1 if the microcredit fails, and 0 otherwise.

[3] This table also shows the expected sign of the relationship between each input variable and the probability of default. The statistical descriptions of all the input variables are shown in Table 1 and Table 2 of Appendix 1. These statistics are presented for each group (failed and non-failed).

The first ratio indicates the number of times the income exceeds total assets. Therefore, we estimate that the ratio (R1) is inversely related to the probability of default. The ratio R2 measures the relationship between the gross utility and the operating costs of the microenterprise. As with the previous ratio, we expect that the sign of its coefficient is negative since the higher the value of this ratio, the more solvent the income/loss of the firm, and the lower the financial difficulties. The third financial ratio (R3) measures the liquidity of the microenterprise. Due to the design of this ratio, the higher its value, the lower the probability of default. Therefore, the sign of the estimator is expected to be negative. The fourth
Table 1
Description of financial, non-financial and economic variables.

Variable | Description | Expected estimator sign (β)

Financial ratios
R1 | Asset rotation: income sales/total assets | −
R2 | Productivity: gross utility/operating costs | −
R3 | Liquidity: cash/total asset liquidity | −
R4 | Liquidity rotations: (cash/income sales) × 360 | +
R5 | Leverage1: total liabilities/(total liabilities + shareholders' total equity) | +
R6 | Leverage2: total liabilities/shareholders' equity | +
R7 | ROA: net income/total assets | −
R8 | ROE: net income/shareholders' equity | −

Non-financial information
Zone | Geographical location of the agency or branch. Dummy variable: (0) central zone, (1) outskirts | +
Old | Duration as a borrower of the MFI. Numeric variable | −
Previous_Loan_Granted | Previously granted credits. Numeric variable | −
Loan_Granted | Loans granted in the last year. Numeric variable | −
Loan_Denied | Previously denied loans. Numeric variable | +
Sector | Activity sector of the micro-business. Categorical variable: (0) commerce, (1) agriculture, (2) production, (3) service | ±
Purpose | Destination of microcredit. Dummy variable: (0) work capital, (1) fixed asset | +
Mfi_Class | MFI customer classification. Dummy variable: (0) normal customer, (1) customer with repayment problems of any sort | +
Total_Fees | Total number of fees paid in credit history. Numeric variable | −
Arrears | Number of arrears. Numeric variable | +
Ave_Arrear | Average (days) of customer default. Numeric variable | +
Max_Arrear | Number of days of major default. Numeric variable | +
Gender | Borrower gender. Dummy variable: (0) male, (1) female | −
Age | Age at time of application. Numeric variable | ±
Marital_St | Marital status. Dummy variable: (0) single, (1) family unit | −
Employm_St | Employment status of borrower. Dummy variable: (0) owner, (1) dependent | ±
Guarantee | Guarantee presented. Dummy variable: (0) sworn declaration, (1) real guarantee | +
Currency | Type of currency for loan granted. Dummy variable: (0) Peruvian Nuevos Soles (PEN), (1) US Dollar ($) | +
Amount | Amount of microcredit. Numeric variable | −
Duration | Number of monthly fees for applied loan. Numeric variable | +
Interest_R | Monthly interest rate for microcredit. Numeric variable | +
Forecast | Loan officer forecast: credit situation at expiration. Dummy variable: (0) without problems, (1) with problems | +

Macroeconomic indicators
GDP | Rate of annual change of Gross Domestic Product (GDP) during loan term | −
CPI | Rate of annual change of Consumer Price Index (CPI) during loan term | +
Empl_R | Rate of annual change of variation of employment rate (ER) during loan term | −
ER | Rate of annual change of variation of exchange rate (ER) PEN [a]-$ during loan term | +
IR | Rate of annual change of interest rate (IR) during loan term | +
SEI | Rate of annual change of stock exchange index (SEI) during loan term | −
Water | Rate of annual change in cost of municipal water during loan term | +
Electricity | Rate of annual change in cost of electricity during loan term | +
Phone | Rate of annual change in cost of telephone consumption during loan term | +

[a] Peru's currency is the Nuevo Sol, denoted by the ISO code PEN.
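As an illustration of how the ratio definitions in Table 1 translate into computed inputs, the following minimal Python sketch derives a subset of the financial ratios (R1, R4, R5, R6, R7, R8) from raw accounting items. The class name, field names, and figures are hypothetical and chosen only for this example; the paper does not describe how the ratios were computed in practice.

```python
from dataclasses import dataclass


@dataclass
class Financials:
    """Hypothetical accounting items for one microenterprise (names are assumptions)."""
    income_sales: float
    total_assets: float
    cash: float
    total_liabilities: float
    shareholders_equity: float
    net_income: float


def financial_ratios(f: Financials) -> dict:
    """Compute a subset of the Table 1 ratios from raw accounting items."""
    return {
        "R1": f.income_sales / f.total_assets,       # asset rotation
        "R4": f.cash / f.income_sales * 360,         # liquidity rotation, in days
        "R5": f.total_liabilities
        / (f.total_liabilities + f.shareholders_equity),  # leverage 1
        "R6": f.total_liabilities / f.shareholders_equity,  # leverage 2
        "R7": f.net_income / f.total_assets,         # ROA
        "R8": f.net_income / f.shareholders_equity,  # ROE
    }


# Made-up figures, for illustration only.
example = Financials(
    income_sales=36000.0,
    total_assets=20000.0,
    cash=1000.0,
    total_liabilities=5000.0,
    shareholders_equity=15000.0,
    net_income=3000.0,
)
ratios = financial_ratios(example)
for name, value in ratios.items():
    print(f"{name} = {value:.4f}")
```

The sketch performs no guarding against zero denominators (for instance, a firm with no equity), which a real preprocessing step would need to handle, possibly as part of the removal of abnormal cases.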
financial ratio (R4) indicates the number of days the microenterprise takes to recover its treasury. In this case, the larger the value of this variable, the greater the likelihood of default. Therefore, the expected sign of the estimator is positive. The fifth financial ratio (R5) represents the percentage of liabilities that microenterprises have in their financial structure. We understand that a high level of liabilities inversely affects the ability of micro-entrepreneurs to pay. Consequently, a positive sign of the estimator of this variable is expected. The sixth financial ratio (R6) measures the ratio between the amount of debt and equity, and thus complements the information provided by the previous variable. We estimate that a high debt ratio results in an increase in the likelihood of default, so a positive sign of the estimator is expected, as shown in Table 1. The seventh financial ratio (R7) measures the return on assets (ROA). A higher return on assets should help reduce the likelihood of default. A negative sign of this variable estimator is therefore expected. The final financial variable (R8) measures the return on equity (ROE), that is, the return accrued by property of the company. The greater the financial return of a firm, the smaller its probability of default. We therefore consider that the sign of the estimator of this variable should be negative. Customers who both live in a central area and locate their microenterprise in a central area usually run less risk of financial distress than those in rural areas. Therefore, the variable Zone is expected to have a positive sign in the estimator. The age of the MFI-customer relationship implies that the bank knows the payment history of a customer in detail, and this is why the variable Old is inversely related to the probability of default. The variables Previous_Loan_Granted and Loan_Granted are expected to have a negative sign in the estimator for the same reasons as for the variable Old, bearing in mind that a lasting relationship with the financial institution involves the lender knowing all the risk inherent to the customer and also believing that this customer is reliable (Crook, Hamilton, & Thomas, 1992). For customers with loans denied in the past, the risk of financial problems is more present. Thus, it is considered that the sign of the variable Loan_Denied is positive. Since there is no previous reference that suggests a criterion for the consideration of a sector with more financial problems than others, the sign of the variable Sector remains undetermined. For the variable Purpose, we propose a positive sign as we understand that a microcredit destined to the acquisition of an asset implies a greater risk than a credit destined for working capital because the process of asset recovery through depreciation takes longer. Borrowers with any problem of payment in the past (greater risk) take the value of 1 in the variable Mfi_Class; thus, we consider that the sign of the estimator is positive. The higher the amount in fees the customer has paid, the greater the experience as a customer, and the lower the probability of default. Therefore, the sign of the estimator of the variable Total_Fees is negative. However, the variables Arrears, Ave_Arrear and Max_Arrear are closely related to the probability of non-payment, and hence their estimators have positive signs. According to Schreiner (2004), women are better payers than men. Consequently a negative sign is considered for the variable Gender. There is no empirical evidence about the relationship between the variable Age and the probability of default; therefore the sign of the estimator for this variable cannot be determined. Customers responsible for a family unit usually have better payment behavior of their debts than those who are single, that is, those without family obligations (Kleimeier & Dinh, 2007). For this reason, the variable Marital_St must have a negative estimator. A positive estimator is expected in the variable Employm_St, since customers who have some experience in the running of a microenterprise have a lower probability of default than those who have only worked as an employee (that is, without any experience as micro-entrepreneurs). In microfinance, the reputation of the borrower is the main guarantee. Hence, E. Proempresa asks for only a sworn statement of their property from those customers who rarely have problems in the fulfillment of their payment obligations. On the other hand, real guarantees are demanded from both new customers and those who in the past have had problems with payments. Therefore, the sign of the estimator of the variable Guarantee must be positive. A microcredit granted in foreign currency (not in local currency) is affected by exchange rate risk and, for that reason, a positive sign is expected in the estimator of the variable Currency. On the other hand, E. Proempresa only accepts a microcredit request for high amounts if customers have paid their previous microcredits without any problem. Therefore, microcredits of high amounts correspond to old customers and good payers, since these customers have a lower probability of default than those with microcredits whose amount is lower. Therefore, a negative sign is expected in the estimator of the variable Amount. It is widely supported by the literature on credit risk that a bank which lends money over the long term runs a greater risk of default than banks that give short-term loans. Therefore, the coefficient of the variable Duration must be positive. The higher the interest rate of a financing source is, the more difficulties the borrower has repaying it. Consequently, the variable Interest_R must have a positive estimator. Finally, we believe that an important variable, albeit totally subjective, is the risk analyst's opinion on the probability that a customer may have financial problems. Just as defined for the variable Forecast, we expect a positive sign in its estimator.

On the other hand, we also introduce variables with information about the economic cycle since the absence of this kind of variable has historically implied a major limitation of financial distress models. Furthermore, as stated by Kim and Sohn (2010), the macroeconomic environment is a key factor that directly affects the payment behavior of any borrower. The macroeconomic variables under consideration are calculated through the following expression:

$$\Delta VM_{i,j} = \frac{VM_{i+j} - VM_i}{VM_i} \tag{1}$$

where $\Delta VM_{i,j}$ is the variation rate of the considered macroeconomic variable, $VM$ is the considered macroeconomic variable, $i$ is the moment of the granting of the loan, and $j$ is the microcredit duration.

3. Research methodology and experimental design

3.1. Discriminant analysis credit scoring model

Given two multivariate independent samples where $p$ quantitative predictor variables have been observed for $n_i$ cases, $i = 1, 2$, $n = n_1 + n_2$, the LDA model supposes that both populations are multivariate normal with means $\mu_1$ and $\mu_2$ and common covariance matrix $\Sigma$. The LDA rule classifies a $p$-dimensional vector $x$ to class 2 if

$$x^t \hat{\Sigma}^{-1} (\hat{\mu}_2 - \hat{\mu}_1) > \frac{1}{2} \hat{\mu}_2^t \hat{\Sigma}^{-1} \hat{\mu}_2 - \frac{1}{2} \hat{\mu}_1^t \hat{\Sigma}^{-1} \hat{\mu}_1 + \log \hat{\pi}_1 - \log \hat{\pi}_2 \tag{2}$$

where the prior probabilities of class memberships $\pi_1$ and $\pi_2$ are usually estimated by the class proportions in the training set.

Linear Discriminant Analysis provides the minimum misclassification rate, and therefore it is optimal under the hypothesis previously described. This rule can be expressed as class 2 if $D > 0$, where $D$ is the linear discriminant function, computed through a linear combination of the inputs. The classification rule can also be formulated by predicting class 1 if the estimated probability for class 1 is greater than a threshold probability $p_c$. This last value can be selected by the empirical optimization of the classification error. As suggested in Hastie, Tibshirani, and Friedman (2001), a K-fold cross-validation may be followed. The training
data set is randomly split into $K$ roughly equal-sized parts. For the $k$th part, the LDA model is fitted to the other $K - 1$ parts, and the classification error for each possible $p_c$ is computed on the $k$th part. The mean classification error of the $K$ parts is obtained for each $p_c$. Ninety-nine possible values for $p_c$ (0.01, 0.02, ..., 0.99) are considered in our study, and the value minimizing the 10-fold cross-validation classification error is selected, namely 0.35.

Linear Discriminant Analysis is fitted with the R function lda (Venables & Ripley, 2002), available in the MASS library. A variable selection process with the function 'greedy.wilks' of the package 'klaR' of R (Weihs, Ligges, Luebke, & Raabe, 2005) is first performed. In this case the initial model is defined by starting with the variable which separates the groups most. The model is then extended by including further variables depending on the Wilks' lambda criterion: the variable which minimizes the Wilks' lambda of the model is selected, and it is included if its p-value still shows statistical significance.

The LDA model can be explained through the coefficients of the linear discriminant function. However, the simplicity of the model can be insufficient to capture complex structures in the dataset. Moreover, the optimality of the LDA classification rule requires the data to be independent and normally distributed while the covariance matrices are also required to comply with the homoscedastic assumption (Johnson & Wichern, 2002). When the covariance matrices are not assumed to be equal, quadratic discrimination functions are computed, and hence the QDA rule yields

$$\arg\max_i \delta_i(x), \qquad \delta_i(x) = -\frac{1}{2} \log |\hat{\Sigma}_i| - \frac{1}{2} (x - \hat{\mu}_i)^t \hat{\Sigma}_i^{-1} (x - \hat{\mu}_i) + \log \hat{\pi}_i \tag{3}$$

The R function qda (Venables & Ripley, 2002) in the MASS library is used in our case study. A similar search for the cut point is also carried out for the QDA model through the same set of 99 threshold probabilities as in LDA, thereby obtaining 0.99.

3.2. Logistic regression credit scoring model

For a binary response and $p$ quantitative predictors $x_1, \ldots, x_p$ (some of which may be dummy variables for coding qualitative variables, as in LDA and QDA), the LR model assumes that the probability of the target response is

$$p(x_1, \ldots, x_p) = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p}} \tag{4}$$

There are several inferential procedures to test the statistical significance of the whole model and the individual significance of each variable. The model can also be interpreted, and a large family of diagnostics and criteria is available to identify influential and outlying observations. Logistic regression can be fully embedded in a formal decision framework, but in order to perform a comparison with the other models, a threshold probability needs to be specified, which corresponds to varying the prior class probabilities. Thus 99 possible values for this threshold probability (0.01, 0.02, ..., 0.99) are also considered, and the value which minimizes the 10-fold cross-validation error is selected, thereby obtaining 0.58.

We have fitted the LR model with the glm function in R (Venables & Ripley, 2002), in an attempt to compute the maximum likelihood estimators of the $p + 1$ parameters by an iterative weighted [...] covariance matrices; although LR remains optimal in a wider variety of situations. However, LR requires larger data sets in order to obtain stable results, and complex nonlinear relations between the dependent and independent variables could be incorporated through appropriate but not evident transformations.

3.3. Artificial neural networks credit scoring models

Artificial Neural Networks (ANNs) constitute a computational paradigm which provides a great variety of mathematical nonlinear models, useful for tackling a wide range of statistical problems. Several theoretical results support a particular architecture, namely the multilayer perceptron (MLP), an example being the universal approximation property, as in Bishop (1995). Moreover, MLP is the most commonly used type of neural network in business studies (Vellido, Lisboa, & Vaughan, 1999; Zhang, Patuwo, & Hu, 1998). Following these results, we have considered a three-layered perceptron where the output layer is formed of one node which provides the estimation of the probability of default. This value is computed with the logistic activation function $g(u) = e^u/(e^u + 1)$, also used in the hidden layer. By denoting $H$ as the size of the hidden layer, $\{v_{ih},\ i = 0, 1, 2, \ldots, p,\ h = 1, 2, \ldots, H\}$ as the synaptic weights for the connections between the $p$-sized input and the hidden layer, and $\{w_h,\ h = 0, 1, 2, \ldots, H\}$ as the synaptic weights for the connections between the hidden nodes and the output node, then the output of the neural network from a vector of inputs $(x_1, \ldots, x_p)$ is

$$\hat{y} = g\left( w_0 + \sum_{h=1}^{H} w_h \, g\left( v_{0h} + \sum_{i=1}^{p} v_{ih} x_i \right) \right) \tag{5}$$

The output of this model provides an estimation of the probability of default for the corresponding input vector. A final decision can be obtained by comparing this output with a threshold, usually set at 0.5, thereby reaching a decision of default if $\hat{y} > 0.5$.

One major disadvantage of MLP is the fact that there is no known procedure which guarantees that a global solution can be attained for the problem of finding a configuration of synaptic weights that minimizes the usual error criteria, and hence one of the many possible local minima is often obtained through one of the many learning rules proposed in the literature. A further drawback is its black-box nature, which makes it very difficult to interpret the resulting model, although certain relevant proposals exist, among which Bayesian neural networks stand out (Neal, 1996).

As input nodes, our MLP models use the set of variables selected for the sequential parametric model that has the highest area under the receiver operating characteristic curve [4] (LR model). Nevertheless, since performance of the MLP can be improved with normalization of the quantitative input variables, the range of each predictor variable is mapped into the $[-1, 1]$ interval. No general rule exists for the determination of the optimal number of hidden nodes: a crucial parameter for the optimal network performance (Kim, 2003). The most common way to determine the size of the hidden layer is via experiments or trial and error (Tang & Fishwick, 1993; Wong, 1991). The number of hidden nodes determines the complexity of the final model, and networks of a more complex nature fail to ensure better generalization capability. One well-known strategy is based on some type of validation study (Hastie et al., 2001) and
least squares (IWLS) algorithm. In the same way as in LDA, a pre- therefore we selected the size of the hidden layer (H) through a
vious stepwise procedure is run in order to select the most signif- 10-fold cross-validation search in {1, 2, . . . , 20}.
icant variables. The function ‘step.glm’ of R is employed, which Two different programs are used in the construction of the MLP
applies a forward sequential procedure based on the Akaike Infor- credit scoring models. The first choice is the freely available R sys-
mation Criterion. tem. The nnet R function (Venables & Ripley, 2002) fits single-hid-
Again, in the same way as LDA, LR is also optimal under the
assumption of multivariate normal distributions with equal covari- 4
Hereafter, AUC.
A. Blanco et al. / Expert Systems with Applications 40 (2013) 356–364 361

den-layer neural networks by means of the BFGS procedure, a qua- rules are made through MATLAB. And finally, another two MLPs are
si-Newton method also known as a variable metric algorithm, in an then fitted using the R system, one of which applies the regulariza-
effort to minimize an error criterion which allows a decay term k in tion procedure.
order to prevent overfitting problems.5 For classification problems,
one appropriate error function is the conditional maximum likeli- 3.4. Model evaluation measures
hood (or entropy) criterion (Hastie et al., 2001). Defining
W = (W1, . . . , WM) as the vector of all M coefficients of the net, and gi- The area under the ROC curve (AUC) is often employed in clas-
ven n targets y1, . . . , yn, where yi = 1 for microcredit default, and yi = 0 sification problems. In this paper, the AUC is computed with the
otherwise, the BFGS method is applied to the following problem: aid of the ROCR library available in R (Sing, Sander, Beerenwinkel,
!
X
n X
M & Lengauer, 2005). However, it is well known that, in order to eval-
^i þ ð1  yi Þ lnð1  y
Min ðyi ln y ^i ÞÞ þ k W 2i ð6Þ uate the overall default prediction capability of the designed mod-
W
i¼1 i¼1 els, the prior probabilities and the misclassification costs should
The R implementation of an MLP model requires the specification of also be considered (West, 2000). It is apparent that the cost associ-
two parameters: the size of the hidden layer (H) and the decay ated with a Type I error (a customer with good credit is misclassi-
parameter (k), and therefore a 10-fold cross-validated search of the fied as a customer with bad credit) and a Type II error (a customer
size of the hidden layer (H) and the decay parameter (k) is carried with bad credit is misclassified as a customer with good credit) are
out over a grid defined as {1, 2, . . . , 20}  {0, 0.01, 0.05, 0.1, 0.2, frequently very different. Generally, the misclassification costs
. . . , 1.5}. In this case, we have also considered training without regu- associated with Type II errors are much higher than those associ-
larization, where k = 0. ated with Type I errors. According to West (2000), the relative ratio
The Neural Network Toolbox (Demuth & Beale, 1997) with of misclassification costs associated with Type I and Type II errors
MATLAB R2010b constitutes the other tool employed to fit MLP. must be 1:5,6 and hence special attention should be paid to Type II
This commercial system offers a great variety of learning rules, errors of all models constructed. In accordance with West (2000), we
and we have considered the following six main learning algorithms express the function on computing the expected misclassification
to train the MLP: gradient descent, gradient descent with momen- cost when only two populations are considered as:
tum, BFGS quasi-Newton (similar to R), Levenberg–Marquardt, Cost ¼ C 21 P 21 p1 þ C 12 P12 p2 ð8Þ
scaled conjugate gradient, and resilient back-propagation. The first
algorithm is the traditional back-propagation method originally where p1 and p2 are prior probabilities of good and bad credit pop-
proposed with MLP, and hence it is included in our study, accom- ulations, P21 and P12 measures the probability of making Type I er-
panied by the variant based on a momentum term. These two rors (a customer with good credit is misclassified as a customer
learning rules require a key parameter, the learning rate. Rumel- with bad credit) and Type II errors (a customer with bad credit is
hart, Hinton, and Williams (1986) concluded that lower learning misclassified as a customer with good credit), respectively, and
rates tend to give the best network results and the networks are C21 as well as C12 are the corresponding misclassification costs of
unable to converge when the learning rate is greater than 0.012. Type I and Type II errors. In order to compute the expected mis-
For this reason, learning rate 0.010 is tested during the training classification costs of the various default prediction models, the
process of MLPs that use the gradient descent and its variant based estimates of misclassification probability and misclassification costs
on a momentum term as training algorithms. In our case, as recom- have first to be calculated. The most commonly adopted estimates
mended by MATLAB, the momentum takes the value 0.90. The for P21 and P12 are the fraction of good-credit customers misclassi-
other four methods are recommended in the MATLAB documenta- fied as bad-credit customers and the fraction of bad-credit custom-
tion for classification problems, and are widely known as second- ers misclassified as good-credit customers, where the two
order training algorithms. These six learning rules try to minimize coefficients differ and are independent from each model.
a sum of squared errors (SSE):
X
n 4. Results and discussion
Min ^i Þ2
ðyi  y ð7Þ
W
i¼1
In this section, the performance of the three parametric models
As in R, there remains the problem of selecting H, and therefore the (LDA, QDA and LR) are first discussed and compared, and, secondly,
size of the hidden layer (H) is chosen through a 10-fold cross-vali- the various MLPs developed are benchmarked with respect to the
dation search in {1, 2, . . . , 20} for each learning method. classic techniques. Finally, the statistical characteristics of the best
MATLAB allows the use of early stopping in MLP training. This credit scoring models are described.
well-known strategy splits the training data set into effective train- The input variables selected in the sequential selection process
ing and validation sets, and the error on the validation set is mon- and the values of their coefficients for LDA and LR models are
itored during training. When the validation error begins an shown in Tables 3 and A1 of Appendix 1. Table 3 contains the
increasing trend, the training process is stopped because an over- AUC, Types I–II errors and misclassification costs of all the models
fitting phenomenon may have been initiated. We have trained built. Focusing on the parametric models, we observe that the AUC
the MATLAB neural nets both with early stopping (25% of size) of LDA and QDA models are 93.03% and 91.98%, both of which are
and without early stopping. lower than the AUC of the LR model (93.22%). Therefore, in line
The basic parameters of all the fitted MLP models can be seen in with other authors (Lee et al., 2002; Ohlson, 1980), we find that
Table 2. Firstly, several MLP models are fitted using the traditional the LR model outperforms LDA and QDA.7 However, when the mis-
gradient descendent back-propagation training algorithm. Sec- classification cost criteria are employed, QDA has the lowest mis-
ondly, second-order training algorithms (quasi-Newton back-prop- classification costs (50.77%) of the all parametric models. Thus, in
agation, Levenberg–Marquardt back-propagation, resilient back- contrast with the results obtained with the AUC criteria, QDA shows
propagation, and scaled conjugate gradient back-propagation) are better performance, according to misclassification costs, than the
implemented in order to develop further MLPs. These six learning 6
Many other authors use this ratio (1:5); however, the real costs associated to each
type of error depend on each individual lender.
7
Since the LR approach has the highest AUC; all the MLPs use only the significant
5
The BFGS algorithm can be found in Bishop (1995). variables of the LR model as input nodes (for more details see Section 3.2. above).
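The cut-point search described in Sections 3.1 and 3.2 (99 candidate thresholds, 0.01 to 0.99, keeping the one with the lowest validation error) can be sketched as follows. This is only an illustrative Python sketch, not the authors' R code: the function name `best_threshold` is hypothetical, and it assumes that the out-of-fold default probabilities from the 10-fold cross-validation have already been collected into one list.

```python
def best_threshold(y_true, p_default):
    """Scan the 99 cut points 0.01, ..., 0.99 and return (threshold, error rate).

    y_true:    0/1 labels (1 = default), assumed to be out-of-fold targets.
    p_default: predicted default probabilities for the same cases.
    A case is classified as default when its probability exceeds the threshold.
    """
    thresholds = [k / 100 for k in range(1, 100)]
    best_t, best_err = None, float("inf")
    for t in thresholds:
        n_wrong = sum((p > t) != bool(y) for y, p in zip(y_true, p_default))
        err = n_wrong / len(y_true)
        if err < best_err:  # strict "<" keeps the smallest threshold on ties
            best_t, best_err = t, err
    return best_t, best_err


# Toy usage with made-up probabilities (not data from the paper).
threshold, error = best_threshold([0, 0, 1, 1], [0.1, 0.4, 0.6, 0.9])
```

Ties are resolved in favour of the smallest threshold here; the paper does not state how ties were handled, so that choice is an assumption.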
362 A. Blanco et al. / Expert Systems with Applications 40 (2013) 356–364

Table 2
Basic parameters of multilayer perceptron models.

Model    Training algorithm               Software   Hidden nodes   Early stopping   Regularization   % Training   % Validation
MLP 1    Gradient descent                 MATLAB     14             No               No               100          0
MLP 2    Gradient descent                 MATLAB     14             Yes              No               75           25
MLP 3    Gradient descent with momentum   MATLAB     10             No               No               100          0
MLP 4    Gradient descent with momentum   MATLAB     10             Yes              No               75           25
MLP 5    BFGS quasi-Newton                MATLAB     9              No               No               100          0
MLP 6    BFGS quasi-Newton                MATLAB     9              Yes              No               75           25
MLP 7    Levenberg–Marquardt              MATLAB     2              No               No               100          0
MLP 8    Levenberg–Marquardt              MATLAB     2              Yes              No               75           25
MLP 9    Scaled conjugate gradient        MATLAB     14             No               No               100          0
MLP 10   Scaled conjugate gradient        MATLAB     14             Yes              No               75           25
MLP 11   Resilient                        MATLAB     9              No               No               100          0
MLP 12   Resilient                        MATLAB     9              Yes              No               75           25
MLP 13   BFGS quasi-Newton                R          10, λ = 0      No               No               100          0
MLP 14   BFGS quasi-Newton                R          3, λ = 0.2     No               Yes              100          0
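To make the "Hidden nodes" and "Regularization" columns concrete, the following Python sketch evaluates the single-hidden-layer output of Eq. (5) and the penalized entropy criterion of Eq. (6). It is a sketch under stated assumptions rather than the fitting code used in the paper (which relied on R's nnet and MATLAB): the activation g is assumed to be the logistic function, and the names `mlp_output` and `penalized_entropy` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation; assumed here to play the role of g in Eq. (5)."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_output(x, v, w):
    """Eq. (5) for one input vector x = [x_1, ..., x_p].

    v[h] = [v_0h, v_1h, ..., v_ph]  (bias first) for each of the H hidden nodes.
    w    = [w_0, w_1, ..., w_H]     (bias first) for the output node.
    """
    hidden = [sigmoid(vh[0] + sum(vi * xi for vi, xi in zip(vh[1:], x))) for vh in v]
    return sigmoid(w[0] + sum(wh * zh for wh, zh in zip(w[1:], hidden)))


def penalized_entropy(y, y_hat, weights, decay):
    """Eq. (6): negative log-likelihood plus decay times the sum of squared weights."""
    nll = -sum(
        yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi, pi in zip(y, y_hat)
    )
    return nll + decay * sum(wt * wt for wt in weights)


# With all weights at zero the network is uninformative and outputs 0.5.
y_hat_zero = mlp_output([1.0, -2.0], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [0.0, 0.0, 0.0])
```

Setting `decay` to 0 gives the unregularized criterion used for MLP 13, while a positive value such as 0.2 (MLP 14) penalizes large weights.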

Table 3
AUC, Type I–II errors, and misclassification costs in the test sample.

MODELS AUC Type I errors (%) Type II errors (%) Misclassification costs
LDA (greedy.wilks) 0.9303 8.52 18.27 0.5143
QDA (qda) 0.9198 11.72 17.42 0.5077
LR (glm) 0.9322 5.94 20.96 0.5715
MLP 1 0.9023 9.40 24.40 0.6772
MLP 2 0.9124 8.20 22.90 0.6326
MLP 3 0.9015 15.30 21.50 0.6305
MLP 4 0.9458 7.60 16.70 0.4691
MLP 5 0.9079 11 15.70 0.4597
MLP 6 0.9427 7.60 17.10 0.4795
MLP 7 0.9389 4.40 22.40 0.6014
MLP 8 0.9413 3.70 22.40 0.5980
MLP 9 0.9148 12.60 18.30 0.5347
MLP 10 0.9459 7.60 16.70 0.4692
MLP 11 0.9395 10.70 15.30 0.4478
MLP 12 0.9357 8.50 17.60 0.4968
MLP 13 0.9236 6.68 22.81 0.6230
MLP 14 0.9543 7.76 15.30 0.4337
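The misclassification costs in Table 3 follow from Eq. (8) and the values given in footnote 8 (C21 = 1, C12 = 5, priors 0.482 and 0.518). The short Python sketch below recomputes four of them from the Type I and Type II error rates in the table; the function name `expected_cost` and the dictionary layout are illustrative choices, not part of the original study.

```python
def expected_cost(type1_rate, type2_rate, c21=1.0, c12=5.0,
                  prior_good=0.482, prior_bad=0.518):
    """Eq. (8): Cost = C21 * P21 * pi_1 + C12 * P12 * pi_2."""
    return c21 * type1_rate * prior_good + c12 * type2_rate * prior_bad


# (Type I, Type II) error rates taken from Table 3, expressed as fractions.
table3_errors = {
    "LDA": (0.0852, 0.1827),
    "QDA": (0.1172, 0.1742),
    "LR": (0.0594, 0.2096),
    "MLP 14": (0.0776, 0.1530),
}

costs = {model: round(expected_cost(*rates), 4) for model, rates in table3_errors.items()}
# costs -> LDA 0.5143, QDA 0.5077, LR 0.5715, MLP 14 0.4337

# Cost reduction of MLP 14 relative to each parametric model.
savings = {m: round(costs[m] - costs["MLP 14"], 4) for m in ("LDA", "QDA", "LR")}
# savings -> LDA 0.0806, QDA 0.0740, LR 0.1378
```

Because the Type II rate is multiplied by a cost five times larger, it dominates the result, which is why LR has the highest cost of these four models despite having the lowest Type I error.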

LDA and LR models.

With respect to the non-parametric methodology, the results show that, in at least several cases, the accuracy of the MLP models is better than that of the LDA, QDA and LR models. However, in terms of AUC, the results obtained for all methodologies are similar. Relevant differences are obtained in terms of the misclassification costs.⁸ For the MLP models, the highest AUC and lowest misclassification cost are obtained when the second-order algorithms are implemented. That is, our results suggest that the gradient descent algorithm is less efficient than the second-order algorithms considered in this study. However, when the gradient descent algorithm is implemented with momentum, the performance, both in terms of AUC and misclassification costs, improves considerably (see model MLP 4 in Table 3). Therefore, the traditional gradient descent is clearly superseded in our data set.

⁸ In this study, the values selected for the calculation of the misclassification costs are: $C_{21} = 1$ and $C_{12} = 5$ (as recommended by West (2000)); $P_{21}$ and $P_{12}$ depend on each model; and $\hat{\pi}_1 = 0.482$ and $\hat{\pi}_2 = 0.518$. For further details on these coefficients, see Section 3.4 above.

According to Table 3, the model with the highest performance is MLP 14. It is a three-layer perceptron, with 20 input nodes, 3 hidden nodes and one output node. The training has been performed with R, using a BFGS quasi-Newton learning rule, and both the size of the hidden layer and the regularization parameter are selected by 10-fold cross-validation, the value of this latter parameter being 0.2. Table 3 shows that early stopping in the MATLAB models improves the AUC for every learning rule except resilient back-propagation, but misclassification costs are not lower for all learning rules. In the R model, regularization improves both AUC and misclassification costs, so this added parameter selection process is worthwhile.

In brief, we conclude, in line with other authors (for example, see Lee & Chen, 2005; West, 2000), that, in general, MLP models have not only a greater AUC but also lower misclassification costs than the traditional LDA, QDA and LR approaches. These empirical results confirm the theoretical superiority (principally, non-linear and non-parametric adaptive-learning properties) of the MLP models over the parametric and widely used LDA, QDA and LR models when applied to pattern classification problems. Moreover, MLP models need neither the strict assumptions of traditional statistical models nor a pre-specified functional form relating the response variable to the predictor variables, requirements which limit the application of those models in the real world. However, the major disadvantages of an MLP model include: (a) its black-box nature, which renders the resulting model very difficult to interpret; and (b) the long training process involved in designing the topology of the optimal network. Despite these disadvantages, we consider that MFIs should use these models instead of the traditional parametric models, since even a minor improvement in the predictive accuracy of the MLP default-prediction model is of critical value. A mere 1% improvement in accuracy would reduce losses in a large loan portfolio and save millions of dollars (West, 2000). The differences in misclassification costs between the best MLP (model MLP 14) and the LDA, QDA and LR models are 8.06%, 7.40%, and 13.78%, respectively. That is, the implementation of neural network approaches helps to reduce the MFI losses significantly, and therefore provides a way

to obtain a competitive advantage over other MFIs which fail to implement this methodology.

5. Conclusions and future research lines

Credit scoring systems are currently in common use by the majority of financial institutions worldwide. However, the application of credit scoring within the microfinance industry is a relatively recent issue. In recent years, the use of non-parametric methodologies and the introduction of non-financial variables into credit scoring models have boomed in the specialized literature. However, very little research deals with both issues, and, to the best of the authors' knowledge, this is the first study which applies a non-parametric methodology (MLP) to create credit scoring systems for the microfinance industry. For this reason, in this paper, 14 multilayer perceptron (MLP) credit-scoring models are fitted and compared by using a Peruvian microfinance institution sample which contains financial and non-financial variables. In addition, these non-parametric models are benchmarked against the results of the traditional LDA, QDA and LR methodologies.

Our findings show that multilayer-perceptron credit scoring can work for microfinance institutions, and obtains higher accuracy and lower misclassification costs than the classic LDA, QDA and LR models. These results imply major consequences for the efficiency of MFIs due to the cost savings. Thus, the best MLP reduces the misclassification cost by 8.06%, 7.40%, and 13.78% in comparison with the LDA, QDA, and LR models, respectively. That is, the implementation of a neural network approach implies that MFIs can reduce their losses by millions of dollars, and therefore provides a way for the MFIs to achieve a competitive advantage over their competitors (mainly commercial banks), since it is a key asset in an increasingly constrained environment. Moreover, empirical evidence has also been obtained that MLP models trained with second-order algorithms perform significantly better (both in terms of AUC and misclassification costs) than those that use the traditional gradient descent. Therefore, we suggest that microfinance institutions apply neural network approaches, especially those using second-order training rules, when setting up their credit scoring models, instead of employing the parametric LDA, QDA and LR models.

This paper offers an appropriate solution so that the MFIs can benefit from all the positive aspects that the implementation of credit scoring systems involves, such as the increase in efficiency, profitability and market share, reduction of costs and losses, and professional-image management. Hence MFIs will be able to create competitive advantages and compete with commercial banks by using advanced risk-management tools.

This study can be further improved in future research in several ways. Firstly, more relevant variables may be collected in an effort to increase the prediction accuracy of the models. Secondly, other newly developed classification methodologies, such as other kinds of artificial neural networks (e.g. radial basis function, learning vector quantization, fuzzy adaptive resonance, and Bayesian learning neural networks), classification and regression trees (CART), and support vector machines (SVM), can be employed and their results can then be compared with those of the MLP, LDA, QDA and LR models established in this paper.

Appendix A

Tables A1–A4.

Table A1
Statistical description of quantitative independent variables.

Variable                Failed                            Non-Failed
                        Mean       Standard deviation     Mean       Standard deviation
R1                      0.7637     0.8055                 0.8436     0.8528
R2                      3.9421     4.8284                 3.8881     6.8548
R3                      0.0683     0.0689                 0.1448     3.2438
R4                      0.1301     0.1368                 0.1654     1.9812
R5                      0.1421     0.1617                 0.1196     0.1474
R6                      0.2242     0.3227                 0.1810     0.2789
R7                      0.1531     0.1764                 0.1771     0.2756
R8                      0.1799     0.2015                 0.2012     0.2911
Old                     2.3468     1.5110                 2.2397     1.5099
Previous_Loan_Granted   5.3900     5.0040                 5.0600     4.6940
Loan_Granted            3.4600     2.3040                 4.3400     2.3400
Loan_Denied             0.3200     0.5380                 0.3300     0.5360
Mfi_Class               0.3500     0.4770                 0.1100     0.3110
Total_Fees              36.1800    25.8510                31.7100    22.8390
Arrears                 13.0400    10.7870                13.3400    11.1700
Ave_Arrear              8.0000     8.1510                 6.8600     6.4340
Max_Arrears             20.2000    27.7650                16.5500    21.5030
Age                     43.0175    10.6148                42.5628    10.4770
Amount                  0.7338     0.6548                 0.6458     0.5998
Duration                8.1100     4.7950                 7.0300     3.5520
Interest_R              4.9242     0.9183                 5.1255     0.8801
GDP                     8.8985     29.7134                4.8139     26.3989
CPI                     2.6377     2.2101                 3.1247     2.1318
Empl_R                  3.5702     10.6827                2.8671     9.6861
ER                      2.4123     4.4517                 5.5607     3.8899
IR                      5.9631     13.9525                12.1717    11.7493
SEI                     44.5991    32.4527                49.5322    33.3754
Water                   2.4576     3.7483                 3.1681     4.2243
Electricity             3.6054     12.2598                8.5162     10.4552
Phone                   -7.1809    8.0019                 -1.7179    3.8308

Table A2
Statistical description of qualitative independent variables.

Variable      Categories          Failed (%)   Non-Failed (%)
Zone          Center              46.94        53.06
              Outskirts           55.84        44.16
Sector        Commerce            48.53        51.47
              Agriculture         60.68        39.32
              Production          53.22        46.78
              Service             54.31        45.69
Purpose       Work capital        47.07        52.93
              Fixed asset         77.51        22.49
Gender        Male                51.32        48.68
              Female              50.71        49.29
Marital_St    Single              50.73        49.27
              Family unit         51.06        48.94
Employm_St    Owner               50.81        49.19
              Dependent           70.73        29.27
Guarantee     Sworn declaration   58.50        41.50
              Real guarantee      43.47        56.53
Currency      PEN                 89.30        92.10
              $                   10.70        7.90
Forecast      Without problems    42.94        57.06
              With problems       97.27        2.73

Table A3
Significant variables using linear discriminant analysis.

Linear discriminant analysis model
Variableᵃ      Coefficient
Forecast       2.2062*
ER             0.1684*
CPI            0.0956*
Total_Fees     0.0125*
Arrears        0.0232*
Mfi_Class      0.7577*
Guarantee      0.2508*
Duration       0.0684*
IR             0.0461*
Empl_R         0.0290*
Electricity    0.0125*
Purpose        0.3559*
SEI            0.0040*
GDP            0.0052*
Zone           0.1412*
R8             0.3811*
Max_Arrears    0.0022*
R2             0.0077*
ᵃ *** p-value < 0.001; ** p-value < 0.01; * p-value < 0.05.

Table A4
Significant variables using logistic regression.

Logistic regression model
Variableᵃ      Coefficient
Forecast       4.2624***
ER             0.3477***
Total_Fees     0.0221***
Arrears        0.0449***
Mfi_Class      1.2592***
Guarantee      0.6117***
IR             0.1011***
Empl_R         0.0247**
Purpose        0.6048**
GDP            0.0235***
Zone           0.4209***
Water          0.0346*
Duration       0.1275***
Intercept      0.2685
ᵃ *** p-value < 0.001; ** p-value < 0.01; * p-value < 0.05.

References

Arminger, G., Enache, D., & Bonne, T. (1997). Analyzing credit risk data: A comparison of logistic discriminant classification tree analysis and feedforward networks. Computational Statistics, 12, 293–310.
Bishop, C. M. (1995). Neural networks for pattern recognition. New York: Oxford University Press.
Chung, H. M., & Gray, P. (1999). Special section: Data mining. Journal of Management Information Systems, 16, 11–16.
Crook, J. N., Hamilton, R., & Thomas, L. C. (1992). A comparison of discriminations under alternative definitions of credit default. In L. C. Thomas, J. N. Crook, & D. B. Edelman (Eds.), Credit scoring and credit control (pp. 217–245). Oxford: Oxford University Press.
Davis, R. H., Edelman, D. B., & Gammerman, A. J. (1992). Machine learning algorithms for credit-card applications. IMA Journal of Mathematics Applied in Business & Industry, 4(1), 43–51.
Desai, V. S., Conway, J. N., Crook, J. N., & Overstreet, G. A. (1997). Credit scoring models in the credit-union environment using neural networks and genetic algorithms. IMA Journal of Mathematics Applied in Business & Industry, 8(4), 323–346.
Desai, V. S., Crook, J. N., & Overstreet, G. A. (1996). A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research, 95(1), 24–37.
Demuth, H., & Beale, M. (1997). Neural network toolbox for use with MATLAB: User's guide. The MathWorks Inc.
Dillon, W. R., & Goldstein, M. (1984). Multivariate analysis methods and applications. New York: Wiley.
Eisenbeis, R. (1978). Problems in applying discriminant analysis in credit scoring models. Journal of Banking and Finance, 2(3), 205–219.
Hand, D. J., & Henley, W. E. (1997). Statistical classification methods in consumer credit scoring: A review. Journal of the Royal Statistical Society, Series A, 160(3), 523–541.
Hastie, T., Tibshirani, R., & Friedman, J. H. (2001). The elements of statistical learning: Data mining, inference, and prediction. New York: Springer.
Henley, W. E., & Hand, D. J. (1996). A k-nearest neighbor classifier for assessing consumer credit risk. Journal of the Royal Statistical Society, Series D (The Statistician), 44(1), 77–95.
Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis (5th ed.). Upper Saddle River: Prentice-Hall.
Karels, G., & Prakash, A. (1987). Multivariate normality and forecasting of business bankruptcy. Journal of Business Finance Accounting, 14(4), 573–593.
Kim, K. J. (2003). Financial time series forecasting using support vector machines. Neurocomputing, 55(1–2), 307–319.
Kim, H. S., & Sohn, S. Y. (2010). Support vector machines for default prediction of SMEs based on technology credit. European Journal of Operational Research, 201(3), 838–846.
Kleimeier, S., & Dinh, T. A. (2007). Credit scoring model for Vietnam's retail banking market. International Review of Financial Analysis, 16(5), 471–495.
Lee, T. S., & Chen, I. F. (2005). A two-stage hybrid credit scoring model using artificial neural networks and multivariate adaptive regression splines. Expert Systems with Applications, 28(4), 743–752.
Lee, T. S., Chiu, C. C., Lu, C. J., & Chen, I. F. (2002). Credit scoring using the hybrid neural discriminant technique. Expert Systems with Applications, 23(3), 245–254.
Malhotra, R., & Malhotra, D. K. (2002). Differentiating between good credits and bad credits using neuro-fuzzy systems. European Journal of Operational Research, 136(1), 190–211.
Markham, I. S., & Ragsdale, C. T. (1995). Combining neural networks and statistical predictions to solve the classification problem in discriminant analysis. Decision Sciences, 26(2), 229–242.
Neal, R. M. (1996). Bayesian learning for neural networks. New York: Springer.
Ohlson, J. A. (1980). Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research, 18(1), 109–131.
Patuwo, E., Michael, Y. H., & Ming, S. H. (1993). Two-group classification using neural networks. Decision Sciences, 24(4), 825–845.
Piramuthu, S. (1999). Financial credit-risk evaluation with neural and neurofuzzy systems. European Journal of Operational Research, 112(2), 310–321.
Rayo, S., Lara, J., & Camino, D. (2010). A credit scoring model for institutions of microfinance under the Basel II normative. Journal of Economics, Finance & Administrative Science, 15(28), 89–124.
Reichert, A. K., Cho, C. C., & Wagner, G. M. (1983). An examination of the conceptual issues involved in developing credit-scoring models. Journal of Business and Economic Statistics, 1(2), 101–114.
Reinke, J. (1998). How to lend like mad and make a profit: A micro-credit paradigm versus the start-up fund in South Africa. Journal of Development Studies, 34(3), 44–61.
Rhyne, E., & Christen, R. P. (1999). Microfinance enters the marketplace. USAID Microenterprise Publications.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In Parallel distributed processing. Cambridge, MA: MIT Press.
Schreiner, M. (2004). Scoring arrears at a microlender in Bolivia. Journal of Microfinance, 6(2), 65–88.
Sharma, M., & Zeller, M. (1997). Repayment performance in group-based credit programs in Bangladesh: An empirical analysis. World Development, 25(10), 1731–1742.
Sing, T., Sander, O., Beerenwinkel, N., & Lengauer, T. (2005). ROCR: Visualizing the performance of scoring classifiers, R package version 1.0-1. <http://rocr.bioinf.mpi-sb.mpg.de/>.
Srinivasan, V., & Ruparel, B. (1990). CGX: An expert support system for credit granting. European Journal of Operational Research, 45(2–3), 293–308.
Tang, Z., & Fishwick, P. A. (1993). Feedforward neural nets as models for time series forecasting. ORSA Journal on Computing, 5(4), 374–385.
Vapnik, V. N. (1998). Statistical learning theory. New York: Springer.
Vellido, A., Lisboa, P. J. G., & Vaughan, J. (1999). Neural network in business: A survey of applications (1992–1998). Expert Systems with Applications, 17(1), 51–70.
Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S-PLUS. New York: Springer.
Viganò, L. A. (1993). Credit scoring model for development banks: An African case study. Savings and Development, 17(4), 441–482.
Vogelgesang, U. (2003). Microfinance in times of crisis: The effects of competition, rising indebtedness, and economic crisis on repayment behaviour. World Development, 31(12), 2085–2114.
Weihs, C., Ligges, U., Luebke, K., & Raabe, N. (2005). klaR. Analyzing German business cycles. In D. Baier, R. Decker, & L. Schmidt-Thieme (Eds.), Data analysis and decision support (pp. 335–343). Berlin: Springer-Verlag.
West, D. (2000). Neural network credit scoring models. Computers and Operations Research, 27(11–12), 1131–1152.
Wong, F. S. (1991). Time series forecasting using backpropagation neural networks. Neurocomputing, 2, 147–159.
Zeller, M. (1998). Determinants of repayment performance in credit groups: The role of program design, intra-group risk pooling, and social cohesion. Economic Development and Cultural Change, 46(3), 599–620.
Zhang, G. P., Patuwo, B. E., & Hu, M. Y. (1998). Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting, 14, 35–62.
