
Credit Card Fraud Detection using Machine Learning

1st Rifah Tasnia
Department of Computer Science and Engineering
North South University
Dhaka, Bangladesh
rifah.tasnia1@northsouth.edu

2nd Sultan Mahmud Sagor
Department of Computer Science and Engineering
North South University
Dhaka, Bangladesh
sultan.sagor@northsouth.edu

Abstract—Credit card fraud is an escalating problem that credit card companies must manage. This paper aims to aid the fraud detection process through a machine learning approach. A model built on expert rules, i.e., rules based on knowledge from fraud experts, requires manual tuning and human supervision. Instead, a machine learning model is selected to analyze the authenticity of a transaction. Our goal is to identify all fraudulent transactions while minimizing false fraud classifications.
Index Terms—Credit card fraud, fraudulent transaction, Credit Card Fraud Detection data set, Logistic Regression

I. INTRODUCTION
A credit card is a card issued to a customer (the cardholder) that typically lets them make purchases up to their credit limit or withdraw cash in advance. Credit cards give the cardholder the advantage of time, allowing them to repay the debt at a later date by rolling it over to the subsequent payment cycle. Credit card fraud refers to the unauthorized and unwelcome use of a credit card account by someone other than the account owner. Such abuse can be stopped with preventative measures, and the behavior of fraudulent operations can be studied to reduce it and prevent future occurrences. A data set of past fraudulent transactions can be used to determine the validity of a new transaction. Using machine learning (ML) approaches, we can quickly identify fraudulent patterns and predict which transactions are likely to be fraudulent. ML techniques infer a prediction model from a set of examples, and this model assists fraud investigators. The model is a parametric function that, given a collection of features describing a transaction, forecasts the likelihood that the transaction is fraudulent. [1]

II. LITERATURE REVIEW
Fraud is an unlawful or criminal deception that results in financial or personal benefit. It is a deliberate act against a law, rule, or policy to attain unauthorized financial benefit. Numerous studies on anomaly and fraud detection in this domain have been published and are publicly available. Even though these methods and algorithms achieved unexpected success in some areas, they failed to provide a permanent and consistent solution to fraud detection.

[2] S. P. Maniraj and his associates conducted one such study. They used two algorithms, Isolation Forest and Local Outlier Factor, both of which reached 99.6 percent accuracy.

[3] Another study was conducted by Vaishnavi Nath Dornadula and S. Geetha, who used six algorithms: Local Outlier Factor, Isolation Forest, SVM, Logistic Regression, Decision Tree, and Random Forest. They found that Logistic Regression, Decision Tree, and Random Forest gave better results than the other algorithms. [1]

III. METHODOLOGY
In this project, the model is trained using a Decision Tree classifier and the Random Forest algorithm. This supervised learning approach is used to detect whether a transaction is fraudulent, expressed as a binary yes/no value.

IV. DATA SET
The dataset includes credit card transactions made by European cardholders in September 2013. It contains 492 frauds out of 284,807 transactions that took place over the course of two days. The dataset is highly imbalanced: the positive class (frauds) accounts for 0.172 percent of all transactions. [4]

V. RESULTS AND ANALYSIS
Problem Definition:
-We predict whether a transaction is fraudulent or not.
-The Credit Card Fraud Detection data set has 284,807 rows and 31 columns.
-The data set contains transactions made with credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days, with 492 frauds out of 284,807 transactions.
-The data set's CSV file is saved on the Desktop and accessed through a Jupyter notebook.
-The data set is highly unbalanced, which causes a class imbalance problem.
Number of Genuine transactions: 284315
Number of Fraud transactions: 492
Percentage of Fraud transactions: 0.1727

-As a result of a PCA transformation, the input variables are numerical. The original data and any supporting information were not made public due to confidentiality concerns.
Time and Amount are the only variables that were not transformed using PCA. The variable "Time" stores the seconds elapsed between each transaction and the first transaction in the data set. The variable "Amount" represents the transaction's value.
The response variable (target), "Class," has a value of "1" in cases of fraud and "0" in all other cases.
-We trained different models on our data set and observed which algorithm works better for our problem. We applied the Random Forest and Decision Tree algorithms to our data set.
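The feature/target split and the class distribution described in the list above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the notebook code used in this project. The inline CSV rows are made up, only a few of the 31 columns are shown, and the function names are hypothetical.

```python
import csv
import io
from collections import Counter

# Toy stand-in for the data set's CSV file. All values are made up for
# illustration; the real file has 31 columns including Time, Amount and Class.
SAMPLE_CSV = """Time,V1,V2,Amount,Class
0,0.10,-0.20,10.00,0
1,0.30,0.40,25.50,0
2,-1.50,2.00,1.00,1
5,0.05,0.15,99.99,0
"""


def load_features_and_target(text):
    """Parse CSV text and split each row into numeric features and a 0/1 target."""
    features, target = [], []
    for row in csv.DictReader(io.StringIO(text)):
        target.append(int(row.pop("Class")))  # 1 = fraud, 0 = genuine
        features.append({name: float(value) for name, value in row.items()})
    return features, target


def fraud_percentage(n_fraud, n_total):
    """Share of fraudulent transactions, in percent."""
    return 100.0 * n_fraud / n_total


X, y = load_features_and_target(SAMPLE_CSV)
counts = Counter(y)
print("genuine:", counts[0], "fraud:", counts[1])

# Re-derive the percentage reported for the full data set from its counts.
reported_pct = fraud_percentage(492, 284315 + 492)
print(f"fraud share in the full data set: {reported_pct:.4f}%")
```

Run against the counts reported above, the last line reproduces the stated fraud share of 0.1727 percent.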

Performance Measurement:

-We expected above 90 percent accuracy for both models.
-The Decision Tree classifier is used as a base model to set a benchmark to compare against.

            Decision Tree Model   Random Forest Model
Accuracy:   0.99923               0.99963
Precision:  0.72727               0.94068
Recall:     0.82353               0.81618
F1-score:   0.77241               0.87402

There were no null values. We split our data into features and targets. Targets are either 0 or 1.
The Random Forest model works better than the Decision Tree model on accuracy, precision, and F1-score, although its recall is slightly lower. However, the data set has a serious class imbalance issue. Genuine (non-fraud) transactions make up more than 99 percent of the data, whereas fraud transactions constitute only 0.17 percent. With such a distribution, if we train our model without accounting for the imbalance, it predicts labels with greater weight given to genuine transactions (as there is more data for non-fraudulent transactions). Hence, it can obtain a high accuracy while still misclassifying fraudulent transactions.
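To make the four reported metrics concrete, the sketch below computes them from confusion-matrix counts. The counts are an assumption: they are not taken from this paper. They are one set of values, for a hypothetical test split of 85,443 transactions containing 136 frauds, that reproduces the reported figures. The `Confusion` class is likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # frauds correctly flagged as fraud
    fp: int  # genuine transactions wrongly flagged as fraud
    fn: int  # frauds that were missed
    tn: int  # genuine transactions correctly passed

    def accuracy(self):
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def precision(self):
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def recall(self):
        frauds = self.tp + self.fn
        return self.tp / frauds if frauds else 0.0

    def f1(self):
        denom = 2 * self.tp + self.fp + self.fn
        return 2 * self.tp / denom if denom else 0.0


# ASSUMED counts, chosen only because they match the reported metrics.
decision_tree = Confusion(tp=112, fp=42, fn=24, tn=85265)
random_forest = Confusion(tp=111, fp=7, fn=25, tn=85300)
# A trivial model that labels every transaction as genuine.
always_genuine = Confusion(tp=0, fp=0, fn=136, tn=85307)

for name, cm in [("Decision Tree", decision_tree),
                 ("Random Forest", random_forest),
                 ("Always genuine", always_genuine)]:
    print(f"{name:15s} acc={cm.accuracy():.5f} prec={cm.precision():.5f} "
          f"rec={cm.recall():.5f} f1={cm.f1():.5f}")
```

Under these assumed counts, the trivial model still scores roughly 0.998 accuracy with zero recall, which is why precision, recall, and F1-score are more informative than accuracy on this data set.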

-Confusion Matrix: [figure]

The class imbalance problem can be addressed by various techniques; oversampling is one of them.
One approach to addressing imbalanced data sets is to oversample the minority class. The simplest approach involves duplicating examples in the minority class, although these examples do not add any new information to the model.
Instead, new examples can be synthesized from the existing examples. This is a type of data augmentation for the minority class and is referred to as the Synthetic Minority Oversampling Technique, or SMOTE for short.
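The interpolation idea behind SMOTE can be sketched in a few lines. This is a simplified illustration written for this text, not the implementation used in this project and not a replacement for a library such as imbalanced-learn. The function name `smote_sketch`, its parameters, and the sample points are all hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the line segment between
    a minority sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class,
        # excluding the base point itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Made-up two-feature fraud samples, for illustration only.
fraud = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.6), (1.2, 2.4)]
new_points = smote_sketch(fraud, n_new=6)
for point in new_points:
    print(point)
```

Because each synthetic point is a convex combination of two real fraud samples, it stays inside the region already occupied by the minority class. Oversampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution.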

VI. CONCLUSION
This method proves accurate in detecting fraudulent transactions and minimizing the number of false alerts. The genetic algorithm is novel in this literature in terms of application domain. If this algorithm is applied in a bank's credit card fraud detection system, the probability of fraud can be predicted soon after a credit card transaction takes place, and a series of anti-fraud strategies can be adopted to protect banks from large losses and reduce risk. The objective of the study differed from typical classification problems in that we had a variable misclassification cost. As standard data mining algorithms do not fit this situation well, we decided to use a multi-population genetic algorithm to obtain an optimized parameter. Decision trees, Logistic Regression and algorithms were used in developing four fraud detection models to classify a transaction as fraudulent or legitimate. Three metrics were used in evaluating their performance.
REFERENCES
[1] V. N. Dornadula and S. Geetha, “Credit card fraud detection using machine learning algorithms,” Procedia Computer Science, vol. 165, pp. 631–641, 2019.
[2] “Researchgate.” https://www.researchgate.net/publication/336800562_Credit_Card_Fraud_Detection_using_Machine_Learning_and_Data_Science/link/5db2f6dd92851c577ec2e973/download. (Accessed on 08/10/2022).
[3] “Credit card fraud detection using machine learning algorithms - sciencedirect.” https://www.sciencedirect.com/science/article/pii/S187705092030065X. (Accessed on 08/10/2022).
[4] “Credit card fraud detection — kaggle.” https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud?fbclid=IwAR0EVDM-CkrzkaIB1kurZApxfFYFYb5jHau00hMaFYMkkRLs4pyibv0zknY. (Accessed on 07/19/2022).
