


Indonesian Journal of Electrical Engineering and Computer Science

Vol. 26, No. 1, April 2022, pp. 539~549

ISSN: 2502-4752, DOI: 10.11591/ijeecs.v26.i1.pp539-549

Predicting customers churning in banking industry: A machine learning approach

Amgad Muneer1, Rao Faizan Ali1, Amal Alghamdi2, Shakirah Mohd Taib1, Ahmed Almaghthawi2,
Ebrahim Abdulwasea Abdullah Ghaleb1
1 Department of Computer and Information Sciences, Faculty of Science and Information Technology,
Universiti Teknologi PETRONAS, Seri Iskandar, Malaysia
2 Department of Computer Science and Artificial Intelligence, College of Computer Science and
Engineering, University of Jeddah, Jeddah, Saudi Arabia

Article Info

Article history:
Received Jul 15, 2021
Revised Jan 30, 2022
Accepted Feb 7, 2022

Keywords:
AdaBoost
Banking industry
Churning
Random forest
SMOTE
Support vector machine

ABSTRACT

Machines can now understand human activities and their meanings, and this ability can be used in
various fields and applications. One field of particular interest is the prediction of churning
customers. Churn prediction is a state-of-the-art approach that identifies which customers are
close to leaving the services of a specific bank, and it can be used by any large organization that
is attentive to its customers. This study aims to develop a model that offers a meaningful churn
prediction for the banking industry. For this purpose, we develop a customer churn prediction
approach with three intelligent models: random forest (RF), AdaBoost, and support vector machine
(SVM). The approach achieves its best result when the synthetic minority oversampling technique
(SMOTE) is applied to overcome the unbalanced dataset, together with a combination of
undersampling and oversampling. On the SMOTE-resampled data, the method produced excellent
results, with a 91.90 F1 score and an overall accuracy of 88.7% using RF. Furthermore, the
experimental results show that RF yielded good results for the full feature-selected datasets.
This is an open access article under the CC BY-SA license.

Corresponding Author:
Amgad Muneer
Department of Computer and Information Sciences, Universiti Teknologi PETRONAS
32610 Seri Iskandar, Malaysia
Email: muneeramgad@gmail.com

1. INTRODUCTION
Competition in the banking industry grows every day [1]. Thus, any bank that wants to increase its
market share by acquiring new customers must also follow customer retention strategies. It has been
shown that improving the retention rate by up to 5% can increase a bank's profit by up to 85% [2].
Banks offer attractive plans such as internet banking, mobile banking, debit cards, credit cards,
savings accounts with nil balance, credit points based on customer usage [3], and competitive plans
for various loans, such as education, housing, agricultural, vehicle, mortgage, and startup loans.
Among all these facilities, granting a loan to a customer is a critical task because each bank has
to analyze the customer's capacity before offering the loan [4]. To support the lending process, a
number of banks have incorporated a credit card scheme in which a customer's eligibility is
evaluated whenever he or she applies for a credit card. Many banks initiate the request for
providing credit cards to new customers based on their credit points [5]. However, opportunities to
churn out of a particular bank multiply for every customer

Journal homepage: http://ijeecs.iaescore.com


who holds more than one credit card with more than one bank [4], [6], [7]. Whenever a customer
realizes that Bank A offers many facilities at a lower interest rate than Bank B, the predicted
likelihood of churn from Bank B is high. Therefore, it is the responsibility of the bank's credit
card account management system to ensure that existing customers are retained through low interest
rates. Churn analysis algorithms currently exist, but they are limited by the nature of the churn
prediction problem. Three features are typically associated with this problem: i) the data is
imbalanced, as churn customers represent a tiny fraction of the total samples (usually 2% of the
total samples); ii) data from large learning applications inherently contains noise; and iii)
predicting churn requires ranking subscribers according to their likelihood to churn [8], [9]. With
recent advances in machine learning, it is beneficial to build a prediction approach that is able
to predict whether a credit cardholder or a customer will churn out from a particular bank [4].
This prediction is based on previously available data collected from the history records of past
customers. Machine learning (ML) methods such as Naive Bayes, decision trees, logistic regression,
random forest, artificial neural networks, and support vector machines can be used to determine
churn [10]. These ML techniques are applied not only in banking but also in sectors such as
insurance [11], medical systems [12], cyberbullying [13], retail marketing [14], the automobile
industry, and the gaming industry [12]. The contribution of this study is threefold: i) we collect
credit card churn data for around 10,000 customers from the Kaggle repository; ii) we conduct an
exploratory data analysis (EDA) at the first stage based on the available data and employ a
hybridization of SMOTE data sampling and a random forest classifier to overcome the inherent class
imbalance problem; and iii) at the final stage of model selection and evaluation, we implement
three models (random forest (RF), AdaBoost, and support vector machine (SVM)) and perform a
detailed comparison of their results.
The remainder of this paper is organized as follows. Section 2 discusses the background of the
study and its related research. The research methodology is outlined in section 3, while
experimental findings are presented in section 4. Finally, section 5 concludes the paper and
describes future directions.

2. LITERATURE REVIEW
Many data mining techniques have been used to study credit card churn prediction. Related work is
briefly reviewed here. Dias et al. [15] predicted in advance whether a given customer will end his
or her relationship with an organization. They applied six machine learning methods (random forest,
support vector machine, logistic regression, multivariate adaptive regression splines,
classification and regression techniques, and stochastic boosting) to the retail banking customer
churn prediction problem, considering predictions up to 6 months in advance. The best results were
obtained with the stochastic boosting technique. Dalmia et al. [16] used supervised machine
learning to create a proprietary algorithm that predicts and informs the bank about the customers
at the highest risk of leaving the bank. Different classifiers achieve different accuracies on
different datasets. Their approach is based on K-nearest neighbour (KNN) with weighted scales and
the XGBoost algorithm for improved accuracy, and the dataset is split into training and testing
sets. Gholamiangonabadi et al. [17] studied customer churn prediction for an Iranian bank and
introduced a new procedural approach. First, they normalize the data during pre-processing. Then,
clusters are formed using the k-medoids method, and the Davies-Bouldin index is used to assess
clustering performance. Several neural network (NN) approaches were used to discover patterns
within the data, including radial basis function (RBFNN), generalized regression (GRNN), and
multilayer perceptron (MLPNN) networks, as well as SVM. According to the results, the MLPNN and SVM
models had higher precision and lower costs. Ahmad et al. [18] applied three machine learning
techniques to predict churn, namely decision trees (DT), Naive Bayes, and SVM, using two benchmark
datasets: the IBM Watson dataset, which contains 7033 observations and 21 attributes, and the
cell2cell dataset, which contains 71,047 observations and 57 attributes. The performance of the
models was measured using the area under the curve (AUC), with scores of 0.82, 0.87, and 0.77,
respectively, for the IBM dataset and 0.98, 0.99, and 0.98 for the cell2cell dataset. Unbalanced
data is one of the key drawbacks of the aforementioned works.
In [18], [19], the authors focus on applying data mining techniques in telecommunications to
predict the churning behaviour of customers, using the CART algorithm. In [20], the authors built a
computer system based on artificial neural networks (ANN) and SVM approaches. According to the
model, there are three different states of customers: active (i.e., those that are fully engaged in
business with a positive balance in their account), non-active (i.e.,


those with low balances in their accounts and those who do not have any investments), and churning
(i.e., those with a closed bank account). They demonstrated excellent results with their computer
software [21].

3. RESEARCH METHOD
3.1. Data collection and description
This section describes the methods used to predict customer churning within the banking industry
and explains the dataset and the proposed approach. The dataset used for the prediction task is
publicly available on the Kaggle website [22]. The variables included in the dataset are listed in
Table 1. Of the 23 variables, the last two columns are removed since they do not contribute to the
classification process. After removing them, the dataset contains 21 variables: 20 predictor
variables and one class variable. It contains 10,127 records, of which 8,496 (83.9%) are
non-churners and 1,630 (16.1%) are churners. The dataset is therefore highly unbalanced in terms of
the proportion of churners and non-churners. Furthermore, we conducted an exploratory data analysis
to determine the proportions of genders, age groups, and so on. Before the data is passed to the
classifiers, it must be balanced so that the classifiers do not lean towards the majority class of
non-churners when making predictions. A mixture of the synthetic minority oversampling technique
(SMOTE), undersampling, and oversampling is used to achieve this balance.
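
The degree of imbalance described above can be checked directly from the reported class counts. The following is a minimal sketch using only the two counts given in the text; the function and variable names are illustrative and are not taken from the study.

```python
# Class counts as reported in the text (non-churners and churners).
counts = {"non_churner": 8496, "churner": 1630}


def class_shares(class_counts):
    """Return each class's share of the labelled records, in percent."""
    total = sum(class_counts.values())
    return {label: round(100 * n / total, 1) for label, n in class_counts.items()}


shares = class_shares(counts)

# How many non-churners there are for every churner.
imbalance_ratio = counts["non_churner"] / counts["churner"]

print(shares)
print(round(imbalance_ratio, 2))
```

A ratio of roughly five non-churners per churner is the reason a classifier trained on the raw data tends to favour the majority class.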

Table 1. Description of the data

Variable                  Description                          Value
CLIENTNUM                 Client number. Unique identifier     Positive real number
                          for the customer holding the
                          account
Attrition_Flag            Internal event (customer activity)   If the account is closed, then 1, else 0
                          variable
Customer_Age              Demographic variable                 Customer's age in years
Gender                    Demographic variable                 M=Male, F=Female
Dependent_count           Demographic variable                 Number of dependents
Education_Level           Demographic variable                 Educational qualification of the account holder
Marital_Status            Demographic variable                 Married, Single, Divorced, Unknown
Income_Category           Demographic variable                 Annual income category of the account holder (< $40K,
                                                               $40K - $60K, $60K - $80K, $80K - $120K, > $120K, Unknown)
Card_Category             Product variable                     Type of card (Blue, Silver, Gold, Platinum)
Months_on_book            Timespan                             Period of relationship with the bank
Total_Relationship_Count  Product variable                     Total no. of products held by the customer
Months_Inactive_12_mon    Timespan                             No. of months inactive in the last 12 months
Contacts_Count_12_mon     Contact variable                     No. of contacts in the last 12 months
Credit_Limit              Credit variable                      Credit limit on the credit card
Total_Revolving_Bal       Credit variable                      Total revolving balance on the credit card
Avg_Open_To_Buy           Open to Buy Credit Line              Average of last 12 months
Total_Amt_Chng_Q4_Q1      Change in Transaction Amount         Q4 over Q1
Total_Trans_Amt           Total Transaction Amount             Total transaction amount (last 12 months)
Total_Trans_Ct            Total Transaction Count              Total transaction count (last 12 months)

3.2. Exploratory data analysis

In machine learning, exploratory data analysis (EDA) is the process of analysing datasets in order
to summarize their main characteristics. It is used to determine what can be learned from the data
before modelling is performed [23], since it is very difficult to identify important data
characteristics by reviewing a column of numbers or a whole spreadsheet. Figure 1(a) illustrates
the distribution of customer ages, and Figure 1(b) the distribution of the number of months
customers have been with the bank. Figure 2(a) shows the distribution of credit limits, and
Figure 2(b) the distribution of total transaction amounts in the last year. Lastly, Figure 3(a)
presents the percentage of churned and non-churned customers, and Figure 3(b) the number of
inactive months. In the following steps, SMOTE is used to upsample the churn samples so that they
are comparable in size with the regular customer samples, giving the selected models a better
chance of detecting small details that would otherwise be lost.


Figure 1. Illustration of (a) distribution of customer age and (b) distribution of months the
customer is part of the bank

3.3. Data pre-processing

This section describes how the data were pre-processed before being passed to the proposed models.
First, we modified the values of the class variable (Attrition_Flag). This column contains two
values: the "Attrition Customer" value is changed from "1" to "0", while the "Existing Customer"
value remains unchanged. The gender column is then modified: female is replaced with 1, and male is
replaced with 0. Finally, there are some Unknown values in Education_Level, Income_Category, and
Marital_Status. These values have been eliminated from our dataset.
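
The pre-processing steps in this subsection can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the code used in the study: the sample rows and category strings are hypothetical, the class label follows the encoding given in Table 1 (1 for a closed account, 0 otherwise), and "eliminating" Unknown values is interpreted as dropping the affected records.

```python
# Hypothetical sample rows; the real records would be read from the Kaggle file.
rows = [
    {"Attrition_Flag": "Existing Customer", "Gender": "M",
     "Education_Level": "Graduate", "Income_Category": "$60K - $80K",
     "Marital_Status": "Married"},
    {"Attrition_Flag": "Attrition Customer", "Gender": "F",
     "Education_Level": "High School", "Income_Category": "< $40K",
     "Marital_Status": "Single"},
    {"Attrition_Flag": "Existing Customer", "Gender": "F",
     "Education_Level": "Unknown", "Income_Category": "$40K - $60K",
     "Marital_Status": "Married"},
]

# Columns in which the text reports "Unknown" entries.
COLUMNS_WITH_UNKNOWN = ("Education_Level", "Income_Category", "Marital_Status")


def preprocess(records):
    """Drop records with Unknown categories and encode the label and gender."""
    cleaned = []
    for record in records:
        # Assumption: a record with any Unknown value is removed entirely.
        if any(record[col] == "Unknown" for col in COLUMNS_WITH_UNKNOWN):
            continue
        encoded = dict(record)
        # Assumption: label encoding as in Table 1 (closed account = 1).
        encoded["Attrition_Flag"] = 0 if record["Attrition_Flag"] == "Existing Customer" else 1
        # Female is replaced with 1 and male with 0, as described in the text.
        encoded["Gender"] = 1 if record["Gender"] == "F" else 0
        cleaned.append(encoded)
    return cleaned


cleaned = preprocess(rows)
print(len(cleaned), [r["Attrition_Flag"] for r in cleaned], [r["Gender"] for r in cleaned])
```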


3.4. Data upsampling using SMOTE

The synthetic minority oversampling technique (SMOTE) is a statistical technique that increases the
number of cases in a dataset in a balanced manner. New instances are generated from the existing
minority cases and fed to the model. These new instances are not simply copies of existing minority
cases; instead, the algorithm takes a sample of the feature space for each target class and its
nearest neighbours and creates new examples that combine features of the target case and those of
its neighbours. This approach increases the number of examples available to the minority class and
makes the samples more general. In order to increase the percentage of minority cases that are not
attrited customers to twice the rate of majority cases, we use SMOTE.
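
The interpolation idea behind SMOTE can be illustrated with a short, self-contained sketch. It is a simplified version written for this explanation, not the implementation used in the study: the function name `smote`, the neighbour count `k`, and the two-dimensional sample points are all assumptions.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the line segment between
    a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class samples with two numeric features.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6)
print(new_points)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by the minority class instead of duplicating existing records.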

Figure 2. Illustration of (a) distribution of the credit limit and (b) distribution of total
transaction amount

3.5. Proposed models employed in the prediction

The random forest method, developed by Breiman and Cutler, builds several classification trees. To
classify a new object, its input vector is passed down each tree in the forest. Each tree gives a
classification, and we say that the tree 'votes' for that class. The forest selects the
classification that has received the most votes (over all the trees in the forest).
The SVM classifies data by constructing a hyperplane in an N-dimensional feature space that divides
the data into two groups. The fundamental goal of SVM modelling is to find an ideal hyperplane such
that samples belonging to one category of the target variable are on one side of the plane and
samples belonging to the other category are on the other side [13]. AdaBoost is one of the first
boosting algorithms to be adopted in practice. AdaBoost combines multiple "weak classifiers" into a
single "strong classifier" [13].
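
The voting step of a random forest can be shown with a toy example. The three "trees" below are hypothetical one-rule classifiers with made-up thresholds on two features from Table 1; a real forest would learn its trees from bootstrap samples of the training data.

```python
from collections import Counter


# Hypothetical stand-ins for trained trees. Each takes a feature vector
# (months_inactive, total_trans_ct) and returns 1 (churn) or 0 (no churn).
def tree_a(x):
    return 1 if x[0] >= 3 else 0


def tree_b(x):
    return 1 if x[1] < 40 else 0


def tree_c(x):
    return 1 if x[0] >= 4 else 0


def forest_predict(trees, x):
    """Pass x down every tree and return the class with the most votes."""
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


trees = [tree_a, tree_b, tree_c]
inactive_customer = (3, 30)   # votes: 1, 1, 0
active_customer = (1, 80)     # votes: 0, 0, 0
print(forest_predict(trees, inactive_customer), forest_predict(trees, active_customer))
```

Using an odd number of trees for a two-class problem avoids tied votes.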

Figure 3. The results of (a) proportion of churn vs does not churn customers and
(b) number of inactive months

4. RESULTS AND DISCUSSION

In the following section, we discuss the results obtained from the experiments conducted in this
study. Firstly, we introduce a well-known evaluation measure to evaluate the performance and effectiveness


of the proposed classifiers. Secondly, we present the 5-fold cross-validation and then describe the
experimental results obtained in this study. Finally, a comparative analysis gives readers a clear
comparison between the classifiers proposed in this study and the state of the art.

4.1. Evaluation measures

To evaluate the effectiveness of our classifiers, we used four well-known evaluation metrics since
our data is balanced. These metrics, with their mathematical representations and definitions, are
discussed in this section. They are as follows.

4.1.1. Accuracy
Accuracy is a ratio of the true detected cases to the total cases, and it has been utilized to evaluate
models on a balanced dataset [24]. Accordingly, it can be calculated as (1):

\[ \text{Accuracy} = \frac{tp + tn}{tp + fp + tn + fn} \tag{1} \]

where tp means true positive, tn is true negative, fp denotes false positive, and fn is a false negative.

4.1.2. Recall and F1-score

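Recall is the ratio of correctly retrieved churns to the total number of relevant customer churns
[25]. The F1-score combines precision and recall into a single measure that captures both
properties, where precision, tp / (tp + fp), is the ratio of correctly retrieved churns to all
retrieved churns.

\[ \text{Recall} = \frac{tp}{tp + fn} \tag{2} \]

\[ \text{F-measure} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{3} \]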
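
Equations (1) to (3) can be computed directly from the four confusion-matrix counts. The sketch below uses hypothetical counts chosen only for illustration; they are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)                 # equation (1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0              # equation (2)
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)     # equation (3)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=80, fp=20, tn=90, fn=10)
print({name: round(value, 3) for name, value in metrics.items()})
```

The guards against empty denominators cover the edge case in which a classifier never predicts the positive class.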

4.2. 5-Fold cross-validation

We conducted a 5-fold cross-validation of our three models. The F1 validation score for the
random forest is higher than that of the AdaBoost and SVM models. Figure 3 shows the performance
evaluation using F1.

Figure 3. Performance evaluation for three proposed models using F1-score metrics
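
The splitting step of a 5-fold cross-validation can be sketched as follows. The text does not state whether the folds were shuffled or stratified, so the shuffling, the seed, and the helper name `k_fold_indices` are assumptions made for this illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training_indices, validation_indices) pairs for k-fold CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


splits = list(k_fold_indices(23, k=5))
for training, validation in splits:
    print(len(training), len(validation))
```

Each sample is used for validation exactly once and for training in the other four rounds; the per-fold F1 scores would then be averaged for each model.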

4.3. Proposed models experimental results

Table 2 presents the results of the three models proposed in this research, based on upsampling the
original data with SMOTE. Random forest outperforms both the AdaBoost and SVM classifiers, with an
F1-score of 0.91 and an accuracy of 88.7%. The SVM classifier achieved the highest recall (1.00),
whereas AdaBoost achieved the lowest recall (0.87). Additionally, the proposed models were tested
and evaluated on the original data before applying the SMOTE technique. These results are presented
in Table 3.

Table 2. The performance of the three proposed models with the SMOTE technique

Proposed Model   Recall   F1 Score   Accuracy
Random Forest    0.89     0.91       0.887
AdaBoost         0.87     0.88       0.872
SVM              1.00     0.89       0.776
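
Precision is not reported in Table 2, but it can be approximated by rearranging equation (3) to precision = F1 × recall / (2 × recall − F1). The sketch below applies this to the Table 2 values; since those values are rounded to two decimals, the derived figures are rough estimates and not results reported by the study.

```python
def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for the precision P."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("no valid precision for this F1/recall pair")
    return f1 * recall / denominator


# (recall, F1) pairs copied from Table 2.
table2 = {
    "Random Forest": (0.89, 0.91),
    "AdaBoost": (0.87, 0.88),
    "SVM": (1.00, 0.89),
}

estimates = {
    model: round(implied_precision(f1, recall), 2)
    for model, (recall, f1) in table2.items()
}
print(estimates)
```

Read this way, the perfect recall of the SVM comes with the lowest estimated precision of the three models, which is consistent with its lower accuracy.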

Table 3. The performance of the three proposed models on the original data before applying SMOTE

Model            Recall   F1 Score   Accuracy
Random Forest    0.64     0.63       0.637
AdaBoost         0.62     0.57       0.622
SVM              0.75     0.55       0.562

Table 2 and Table 3 show that the random forest model achieves the highest F1 score and accuracy
among the three models, although SVM attains the highest recall. As a result, we selected the
random forest model to forecast customer churning in the banking industry. The results of this
prediction are presented in Figure 4.

Figure 4. Confusion matrix for random forest prediction on the original data

4.4. Comparison with literature

This section compares the three proposed classifiers with state-of-the-art methods. Several
methods have been used to predict customer churn in the banking industry, including KNN, XGBoost,
SVM, Naive Bayes, Decision Trees, ANN, and RF. In Table 4, we compare the three proposed models
with related literature contributions. The comparison is limited to the available metrics, but it
indicates the promising results of the proposed RF predictor. Our results demonstrate that the
proposed method surpasses the previous six methods for predicting customer churning in the banking
industry.


Table 4. Comparison of the proposed models with related literature contributions

Prediction Model              Recall         F1 Score       Accuracy %
Proposed RF Predictor         0.89           0.91           88.7
Proposed AdaBoost Predictor   0.87           0.88           87.2
Proposed SVM Predictor        1.00           0.89           77.6
KNN [16]                      Not Reported   Not Reported   83.85
XGBoost [16]                  Not Reported   Not Reported   86.85
Naïve Bayes [26]              0.280          0.394          82.4
Decision Trees [26]           0.423          0.561          86.5
Random Forest [26]            0.474          0.588          86.4
ANN [26]                      0.464          0.587          86.7

5. CONCLUSION
This study conducted a comprehensive investigation of the credit card churn prediction problem in
banks using machine learning techniques. We proposed a customer churn prediction system with random
forest, AdaBoost, and SVM models. The best results are achieved when the unbalanced original data
is resampled with SMOTE and undersampling is combined with oversampling. When the SMOTE technique
was applied to overcome the class imbalance in the data, RF outperformed the other two predictors
with an accuracy of 88.7% and an F1 score of 0.91. The experimental results also demonstrated that
RF performed well for the full feature-selected datasets. Accordingly, the proposed RF predictor
can be used to calculate customer churn periodically from various perspectives. Churning can be
measured in terms of the number of customers lost, the ratio of customers lost, or the percentage
of customers lost compared to the total number of customers in the bank, and it can be measured
quarterly or annually. An accurate forecast provides insight into the future, which allows a
strategy to be developed. Lastly, in future work, we seek to implement a deep learning model in
order to improve the accuracy of the proposed study.
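
The periodic churn measures mentioned above reduce to simple arithmetic. The sketch below computes a quarterly churn percentage; the customer counts are hypothetical and the function name is illustrative.

```python
def churn_rate(customers_at_start, customers_lost):
    """Percentage of customers lost during a period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return 100 * customers_lost / customers_at_start


# Hypothetical number of customers lost in each of four quarters,
# for a bank with 10,000 customers at the start of each quarter.
lost_per_quarter = (120, 95, 140, 110)
quarterly_rates = [churn_rate(10_000, lost) for lost in lost_per_quarter]
print([round(rate, 2) for rate in quarterly_rates])
```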

REFERENCES
[1] I. Japparova and R. Rupeika-Apoga, “Banking business models of the digital future: The case of Latvia,” European Research
Studies Journal, vol. 20, no. 3, pp. 864–878, 2017, doi: 10.35808/ersj/749.
[2] G. Nie, W. Rowe, L. Zhang, Y. Tian, and Y. Shi, “Credit card churn forecasting by logistic regression and decision tree,” Expert
Systems with Applications, vol. 38, no. 12, pp. 15273–15285, Nov. 2011, doi: 10.1016/j.eswa.2011.06.028.
[3] R. Goel, S. Sahai, A. Vinaik, and V. Garg, “Moving from cash to cashless economy: A study of consumer perception towards
digital transactions,” International Journal of Recent Technology and Engineering, vol. 8, no. 1, pp. 1220–1226, Jun. 2019, doi:
10.17492/pragati.v7i1.195425.
[4] R. Rajamohamed and J. Manokaran, “Improved credit card churn prediction based on rough clustering and supervised learning
techniques,” Cluster Computing, vol. 21, no. 1, pp. 65–77, Mar. 2018, doi: 10.1007/s10586-017-0933-1.
[5] L. Bursztyn, B. Ferman, S. Fiorin, M. Kanz, and G. Rao, “Status Goods: Experimental evidence from platinum credit cards,”
Quarterly Journal of Economics, vol. 133, no. 3, pp. 1561–1595, Aug. 2018, doi: 10.1093/QJE/QJX048.
[6] H. Jain, G. Yadav, and R. Manoov, “Churn prediction and retention in banking, telecom and IT sectors using machine learning
techniques,” in Advances in Machine Learning and Computational Intelligence, Springer, Singapore, 2021, pp. 137–156.
[7] G. G. Sundarkumar and V. Ravi, “A novel hybrid undersampling method for mining unbalanced datasets in banking and
insurance,” Engineering Applications of Artificial Intelligence, vol. 37, pp. 368–377, Jan. 2015, doi:
10.1016/j.engappai.2014.09.019.
[8] Y. Xie, X. Li, E. W. T. Ngai, and W. Ying, “Customer churn prediction using improved balanced random forests,” Expert Systems
with Applications, vol. 36, no. 3, pp. 5445–5449, Apr. 2009, doi: 10.1016/j.eswa.2008.06.121.
[9] K. G. M. Karvana, S. Yazid, A. Syalim, and P. Mursanto, “Customer churn analysis and prediction using data mining models in
banking industry,” in 2019 International Workshop on Big Data and Information Security, IWBIS 2019, Oct. 2019, pp. 33–38,
doi: 10.1109/IWBIS.2019.8935884.
[10] M. A. H. Farquad, V. Ravi, and S. B. Raju, “Data mining using rules extracted from SVM: An application to churn prediction in
bank credit cards,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture
Notes in Bioinformatics), vol. 5908 LNAI, 2009, pp. 390–397.
[11] N. A. Akbar, A. Sunyoto, M. Rudyanto Arief, and W. Caesarendra, “Improvement of decision tree classifier accuracy for
healthcare insurance fraud prediction by using Extreme Gradient Boosting algorithm,” in Proceedings-2nd International
Conference on Informatics, Multimedia, Cyber, and Information System, ICIMCIS 2020, Nov. 2020, pp. 110–114, doi:
10.1109/ICIMCIS51567.2020.9354286.
[12] S. M. Fati, A. Muneer, N. A. Akbar, and S. M. Taib, “A continuous cuffless blood pressure estimation using tree-based pipeline
optimization tool,” Symmetry, vol. 13, no. 4, 2021, doi: 10.3390/sym13040686.
[13] A. Muneer and S. M. Fati, “A comparative analysis of machine learning techniques for cyberbullying detection on twitter,”
Future Internet, vol. 12, no. 11, pp. 1–21, Oct. 2020, doi: 10.3390/fi12110187.
[14] M. Al-Ghobari, A. Muneer, and S. M. Fati, “Location-aware personalized traveler recommender system (lapta) using
collaborative filtering knn,” Computers, Materials and Continua, vol. 69, no. 2, pp. 1553–1570, 2021, doi:
10.32604/cmc.2021.016348.
[15] J. Dias, P. Godinho, and P. Torres, “Machine learning for customer churn prediction in retail banking,” in Lecture Notes in
Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12251
LNCS, 2020, pp. 576–589.


[16] H. Dalmia, C. V. S. S. Nikil, and S. Kumar, “Churning of bank customers using supervised learning,” in Lecture Notes in
Networks and Systems, vol. 107, 2020, pp. 681–691.
[17] D. Gholamiangonabadi, S. Nakhodchi, A. Jalalimanesh, and A. Shahi, “Customer churn prediction using a meta-classifier
approach; A case study of Iranian banking industry,” in Proceedings of the International Conference on Industrial Engineering
and Operations Management, 2019, vol. 2019, no. MAR, pp. 364–375.
[18] A. K. Ahmad, A. Jafar, and K. Aljoumaa, “Customer churn prediction in telecom using machine learning in big data platform,”
Journal of Big Data, vol. 6, no. 1, p. 28, Dec. 2019, doi: 10.1186/s40537-019-0191-6.
[19] V. K. Nijhawan, M. Madan, and M. Dave, “An analytical implementation of CART Using RStudio for Churn Prediction,”
Information and Communication Technology for Competitive Strategies, vol. 40. Springer Singapore, 2019.
[20] S. Osowski and L. Sierenski, “Prediction of customer status in corporate banking using neural networks,” in Proceedings of the
International Joint Conference on Neural Networks, Jul. 2020, pp. 1–6, doi: 10.1109/IJCNN48605.2020.9206693.
[21] K. Ebrah and S. Elnasir, “Churn prediction using machine learning and recommendations plans for telecoms,” Journal of
Computer and Communications, vol. 07, no. 11, pp. 33–53, 2019, doi: 10.4236/jcc.2019.711003.
[22] Churn for Bank Customers. (2020). Accessed: 21 March 2021. [Online]. Available:
https://www.kaggle.com/mathchi/churn-for-bank-customers
[23] A. Omar and A. Almaghthawi, “Towards an integrated model of data governance and integration for the implementation of digital
transformation processes in the Saudi Universities,” International Journal of Advanced Computer Science and Applications, vol.
11, no. 8, pp. 588–593, 2020, doi: 10.14569/IJACSA.2020.0110873.
[24] S. Naseer, S. M. Fati, A. Muneer, and R. F. Ali, “iAceS-Deep: Sequence-based identification of acetyl serine sites in proteins
using PseAAC and deep neural representations,” IEEE Access, vol. 10, pp. 12953–12965, 2022, doi:
10.1109/access.2022.3144226.
[25] A. Muneer and S. M. Fati, “Efficient and automated herbs classification approach based on shape and texture features using deep
learning,” IEEE Access, vol. 8, pp. 196747–196764, 2020, doi: 10.1109/ACCESS.2020.3034033.
[26] S. E. Charandabi, “Prediction of Customer Churn in Banking Industry,” Age, vol. 18, no. 92, pp. 38–92, 2020.

BIOGRAPHIES OF AUTHORS

Amgad Muneer received the B.Eng. degree (Hons.) in mechatronic engineering from the Asia Pacific
University of Technology and Innovation (APU), Malaysia, in 2018. He is currently pursuing the
master's degree in information technology with Universiti Teknologi PETRONAS, Malaysia. He has
authored several ISI and Scopus journal articles/conference papers. He is currently working as a
Research Officer with the Department of Computer and Information Sciences, Universiti Teknologi
PETRONAS, Perak, Malaysia. His research interests include machine and deep learning, image
processing, the Internet of Things, computer vision, and condition monitoring. He is a reviewer for
some international impact-factor journals, and he has published more than 30 scientific
publications. He can be contacted at email: muneeramgad@gmail.com.

Rao Faizan Ali received the bachelor's degree in computer science from COMSATS University
Islamabad, Pakistan, and the M.Phil. degree in computer science from the University of Management
and Technology, Lahore, Pakistan. He is currently pursuing the Ph.D. degree with Universiti
Teknologi PETRONAS, Malaysia. He has eight years of experience in teaching and research. He has
held various computer science positions in the financial, consulting, academic, and government
sectors. He is currently working as a Research Officer with the Department of Computer and
Information Sciences, Universiti Teknologi PETRONAS, Perak, Malaysia. He can be contacted at email:
rao_16001107@utp.edu.my.

Amal Alghamdi is currently a master's student in computer science and artificial intelligence at
Jeddah University. She received her bachelor's degree in Computer Science from Al-Baha University
in 2014. Her interests are in artificial intelligence, machine learning, and deep learning. She can
be contacted at email: dr.amal.alghamdi@gmail.com.


Shakirah Mohd Taib is a lecturer and researcher at the Centre for Research in
Data Science (CeRDaS) at Universiti Teknologi PETRONAS (UTP), Malaysia. She
obtained a bachelor’s degree in information technology from Universiti Utara Malaysia
and a Master of Computing from the University of Tasmania, Australia. She has more than 15
years of working experience at UTP. Her areas of
specialization include data science, machine learning, knowledge discovery, and
information retrieval using artificial intelligence techniques. She is a member of
international organizations such as IEEE, the Malaysia Board of Technologists (MBOT), and
the Association for Information Systems (AIS). She can be contacted at email:
shakita@utp.edu.my.

Ahmed Almaghthawi received his bachelor’s degree in computer science
from Taibah University in 2015. He holds a master’s degree in computer science
and artificial intelligence from the University of Jeddah. He currently works as an adjunct
lecturer at the College of Computer Science and Artificial Intelligence, University of Jeddah.
His research interests include artificial intelligence, image and video processing, machine
learning, and the IoT. He can be contacted at email: ahmed.almaghthawi.1991@gmail.com.

Ebrahim Abdulwasea Abdullah Ghaleb received the Bachelor of Information
Technology (Hons.) in networking technology from Infrastructure
University Kuala Lumpur, Malaysia, and a master’s degree in information systems
from the National University of Malaysia (Universiti Kebangsaan Malaysia,
UKM). He is a Ph.D. student in information systems at Universiti
Teknologi PETRONAS (UTP). He has authored or coauthored more than nine refereed journal and
conference papers, in venues including Sustainability, Journal of Theoretical & Applied Information
Technology, Solid State Technology, and the International Congress of Advanced Technology
and Engineering, as well as with IEEE and Springer. His research interests include the applications of big
data, healthcare, and evolutionary and heuristic optimization techniques to power system
planning, operation, and control. He can be contacted at email:
ebrahim_1800342@utp.edu.my.
