JETIR2404299

Uploaded by

mohangowdagr3

UPI Fraud Detection using Machine Learning

Abstract:-
The UPI fraud detection system is intended to enhance the security and reliability of digital payment
transactions, ultimately safeguarding users from fraudulent activities. Firstly, the paper aims to
employ advanced machine learning algorithms and data analytics to analyze transaction patterns
and detect anomalies that may indicate potential fraud. Secondly, it seeks to develop a robust
system that can identify and mitigate various types of UPI fraud, including phishing, identity
theft, and unauthorized transactions. The paper also aims to create a real-time monitoring
mechanism to promptly identify suspicious activities and trigger alerts for immediate
intervention.

The scope of developing a UPI fraud detection system is vast and holds significant potential in
addressing the emerging challenges in the digital payment landscape. Firstly, the paper
encompasses the implementation of cutting-edge technologies such as machine learning,
artificial intelligence, and data analytics to create a sophisticated fraud detection model. This
model will have the capability to analyze massive datasets of UPI transactions in real-time,
identifying patterns, anomalies, and trends associated with fraudulent activities.

I. Introduction
This introduction will provide an overview of the key components and challenges involved in
UPI fraud detection using machine learning, highlighting the importance of staying ahead in the
ongoing battle against financial fraud in the digital age. With the increasing popularity of digital
payment systems like UPI (Unified Payments Interface), there is a growing concern about fraud
in these platforms. This paper aims to develop a robust fraud detection system for UPI
transactions using machine learning techniques. UPI fraud detection using machine learning is
a proactive approach to safeguarding financial transactions by leveraging the power of artificial
intelligence. Machine learning algorithms analyze vast volumes of transaction data, patterns,
and user behaviors to identify and prevent fraudulent activities in real-time. This technology
holds the potential to minimize financial losses, protect user privacy, and enhance the overall
security of digital payment ecosystems.

In this era of constant technological evolution, it is crucial for financial institutions, fintech
companies, and payment service providers to implement advanced machine learning models and
algorithms to stay ahead of fraudsters. This approach not only helps in detecting known fraud
patterns but also adapts to emerging threats through continuous learning and optimization.
The project focuses on the development of a machine learning model that can analyze UPI
transaction data in real-time to identify fraudulent activities. The primary objective is to create
a system that enhances the security of UPI transactions and reduces financial losses due to fraud.

Literature Survey
In fraud detection, we often deal with highly imbalanced datasets. For the chosen dataset
(PaySim), we show that our proposed approaches are able to detect fraudulent transactions with
very high accuracy and few false positives, especially for TRANSFER transactions. Fraud detection
often involves a trade-off between correctly detecting fraudulent samples and not
misclassifying many non-fraud samples. This is often a design choice or business decision that
every digital payments company needs to make. We have dealt with this problem by proposing
our class-weight-based approach. We can further improve our techniques by using algorithms
like decision trees to leverage categorical features associated with accounts/users in the PaySim
dataset. The PaySim dataset can also be interpreted as a time series. We can leverage this property
to build time-series-based models using algorithms like CNN. Our current approach treats the
entire set of transactions as a whole to train our models. We can create user-specific models,
which are based on a user's previous transactional behavior, and use them to further improve
our decision-making process. All of these, we believe, can be very effective in improving our
classification quality on this dataset.

Nowadays digital transactions are increasing rapidly, and online payment fraud is increasing
with them. In fact, according to the Reserve Bank of India, comparing March 2022 to
March 2019, digital payments have risen in volume and value by 216% and 10%, respectively.
People are starting to go all-in with digital transactions, but one cannot deny the security issues
that loom and the know-how required when it comes to online payments. A few years ago online
payment was barely seen, but today UPI payment QR codes are found at the doorstep. This has
invited hoaxers and attackers to craft fraudulent transactions and cheat people out of money.
Fortunately, online transactions are monitored and hence can be analyzed using the
latest tools. In this system, an attempt is made to develop a machine learning model to identify
fraudulent transactions in a transaction dataset.
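The class-weight idea mentioned in this survey can be illustrated with a minimal sketch. The survey does not state which weighting scheme was used, so the "balanced" heuristic below (weight = n_samples / (n_classes * class_count)) is an assumption, and the names `balanced_class_weights` and `weighted_error` as well as the toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumption: this "balanced" heuristic stands in for whatever
    weighting the surveyed work actually used.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Weighted misclassification rate: errors on rare classes cost more."""
    total = sum(weights[t] for t in y_true)
    wrong = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return wrong / total


# Hypothetical toy labels: 1 = fraud, 0 = legitimate (98 legitimate, 2 fraud).
labels = [0] * 98 + [1] * 2
weights = balanced_class_weights(labels)

# A model that always answers "legitimate" is 98% accurate in plain terms,
# but under class weights it is penalised heavily for missing every fraud case.
always_legit = [0] * len(labels)
print(weights)
print(weighted_error(labels, always_legit, weights))
```

With weights like these, a missed fraud case contributes far more to the training loss than a misclassified legitimate transaction, which is one way of making the trade-off described above explicit.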

Fraud detection for credit/debit cards, loan defaults, and similar cases is achievable with the
assistance of Machine Learning (ML) algorithms, as they are well capable of learning from
previous fraud trends or historical data and spotting them in current or future transactions.
Fraudulent cases are scarce in comparison with non-fraudulent observations in almost all
datasets. In such cases, detecting fraudulent transactions is quite difficult. The most effective
way to prevent loan default is to identify non-performing loans as soon as possible. Machine
learning algorithms are emerging as adept at handling such data, given enough computing
power. In this paper, the performance of different machine learning algorithms such as Decision
Tree, Random Forest, linear regression, and Gradient Boosting is compared for
detection and prediction of fraud cases using loan fraud indicators. Model
accuracy is further assessed with a confusion matrix and the calculation of accuracy,
precision, recall, and F1 score, along with Receiver Operating Characteristic (ROC) curves.

Financial fraud, understood as the use of deceptive tactics to gain financial benefits, has recently
become a widespread menace in companies and organizations. Conventional techniques such as
manual verifications and inspections are imprecise, costly, and time consuming for identifying
such fraudulent activities. With the advent of artificial intelligence, machine-learning-based
approaches can be used intelligently to detect fraudulent transactions by analyzing large
volumes of financial data. Therefore, this paper attempts to present a systematic literature review
(SLR) that systematically reviews and synthesizes the existing literature on machine learning
(ML)-based fraud detection. Particularly, the review employed the Kitchenham approach, which
uses well-defined protocols to extract and synthesize the relevant articles; it then reports the
obtained results. Based on the specified search strategies from popular electronic database
libraries, several studies have been gathered. After applying inclusion/exclusion criteria, 93 articles
were chosen, synthesized, and analyzed. The review summarizes popular ML techniques used for
fraud detection, the most popular fraud type, and evaluation metrics. The reviewed articles
showed that support vector machine (SVM) and artificial neural network (ANN) are popular ML
algorithms used for fraud detection, and credit card fraud is the most popular fraud type
addressed using ML techniques. The paper finally presents the main issues, gaps, and limitations in
financial fraud detection and suggests possible areas for future research.

System Diagram

Fig: Home page

Fig: Sign-Up Page

Fig: Fraud Detection

Fig: Transaction History

Fig: Payment Receipt Upload

Fig: Results for Transaction Receipt

Working Methodology
Data Cleaning: Some preprocessing of the data was necessary. Our chosen method could not
handle all comments from the datasets without failing. Since the data files were read line by
line, newlines (\n) within the comments had to be removed. Certain emojis could not be properly
encoded in our chosen file format (UTF-8), so those emoji characters had to be deleted. This did
not affect the results, since the word preprocessing and tokenization we implemented
through Scikit-learn (CountVectorizer) only considers alphanumeric characters for words with
the parameters we used [39]. Regex and character replacement were used to make all datasets
adhere to the same format.
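The cleaning step described above could look roughly like the following sketch. The exact regular expressions used by the authors are not given, so the emoji character ranges, the choice to replace newlines with a space rather than delete them outright, and the function name `clean_comment` are all assumptions made for illustration.

```python
import re

# Assumption: emoji are approximated by a few common Unicode blocks
# (pictographs, misc symbols, dingbats, variation selector, joiner).
EMOJI_RE = re.compile("[\U0001F000-\U0001FAFF\u2600-\u27BF\uFE0F\u200D]")
NEWLINE_RE = re.compile(r"[\r\n]+")
MULTISPACE_RE = re.compile(r" {2,}")


def clean_comment(text: str) -> str:
    """Return a single-line comment with emoji characters removed."""
    # Replace embedded newlines with a space so that adjacent words
    # do not merge when the file is later read line by line.
    text = NEWLINE_RE.sub(" ", text)
    text = EMOJI_RE.sub("", text)
    # Collapse runs of spaces left behind by the removals.
    return MULTISPACE_RE.sub(" ", text).strip()


print(clean_comment("Great video \U0001F600\nloved it!"))
```

Because the later tokenization keeps only alphanumeric tokens, dropping emoji at this stage should not change the extracted features, which matches the claim in the paragraph above.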

Training: All classifiers were trained on the training datasets with a train/test split of 80/20
percent. This enabled us to see the accuracy of the classifiers on the training datasets. The
same random state was used for all classifiers to make sure that the training is reproducible
across classifiers. Text feature extraction was done with the bag-of-words model using
the CountVectorizer in Scikit-learn. As mentioned in section 2.5 Sentiment Analysis, this is a
popular approach to feature extraction.
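As a rough illustration of the two ideas in this paragraph, the sketch below performs a seeded 80/20 split and a bag-of-words count using only the Python standard library. It is a stand-in for, not a reproduction of, the Scikit-learn utilities named above: the token rule (lowercased ASCII alphanumeric runs of two or more characters), the seed value, and the function names are assumptions.

```python
import random
import re
from collections import Counter

# Assumption: tokens are lowercased ASCII alphanumeric runs of length >= 2.
TOKEN_RE = re.compile(r"[a-z0-9]{2,}")


def bag_of_words(text):
    """Count token occurrences, ignoring order (bag-of-words)."""
    return Counter(TOKEN_RE.findall(text.lower()))


def split_80_20(samples, seed=42):
    """Shuffle with a fixed seed, then cut into 80% train / 20% test."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical labelled comments, for illustration only.
comments = [(f"comment number {i}", "positive" if i % 2 else "negative")
            for i in range(10)]
train, test = split_80_20(comments)
print(len(train), len(test))
print(bag_of_words("I loved it, loved it!"))
```

Reusing the same seed for every classifier means each one sees an identical train/test partition, so differences in accuracy reflect the classifier rather than the split.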

Classifiers: All classifiers were used with the standard parameters in Scikit-learn except
for logistic regression, where the max_iter parameter was increased from the default value of 100 to
1000. This was done since the logistic regression classifier reached the maximum allowed
iterations before the optimal solution to the classification problem was found. Classifiers were
selected based on what is suitable for text and social media sentiment analysis and what has been
used in previous work in the field. Naive Bayes classifiers such as multinomial and complement
naive Bayes are common in text classification because they are fast and simple to implement
[18]. The stochastic gradient descent classifier was recommended for use on tweets by Bifet and
Frank [32]. Since YouTube comments are also part of social media and tend to be short,
like tweets, we believe this to be appropriate for this study. Support vector machines are
used since they are effective at a variety of traditional text categorization tasks and generally
outperform naive Bayes classifiers [18], [40]. Logistic regression is another classifier commonly
used in sentiment analysis [41]. The International Workshop on Semantic Evaluation (SemEval)
ran a task on sentiment analysis on Twitter between 2013 and 2018. In several years this task
included variations of classifying tweets on a scale from positive to negative, and SVM- and
logistic regression-based classifiers were used by several teams attempting it.

Prediction: Four formulas for making the prediction were tested. These are explained below.

Prediction 1, the base prediction, assumes that only the number of comments classified as
positive and negative contributes to the like proportion. The formula for the base prediction is:

\[ \text{predicted like proportion} = \frac{N_{\text{positive}}}{N_{\text{positive}} + N_{\text{negative}}} \]

where \(N_{\text{positive}}\) and \(N_{\text{negative}}\) are the number of comments classified as
positive and negative respectively. A consequence of this formula is that the videos whose
comments are only labeled as neutral had to be excluded, since the denominator would be 0. This
causes the size of the testing dataset to vary by small amounts between the classifiers for the base
prediction.

The following three predictions consider neutral comments to some extent. Any
factor for the neutral comments could be used in the numerator of the predicted like proportion,
but we have only considered those cases we believe make reasonable assumptions. In each of
them, \(N_{\text{positive}}\), \(N_{\text{neutral}}\) and \(N_{\text{negative}}\) are the number of
comments classified as positive, neutral and negative respectively.

Prediction 2 assumes that all comments labeled as neutral contribute to dislikes:

\[ \text{predicted like proportion} = \frac{N_{\text{positive}}}{N_{\text{positive}} + N_{\text{neutral}} + N_{\text{negative}}} \]

Prediction 3 assumes that half of the neutral comments contribute to likes and that half of the
neutral comments contribute to dislikes:

\[ \text{predicted like proportion} = \frac{N_{\text{positive}} + 0.5 \cdot N_{\text{neutral}}}{N_{\text{positive}} + N_{\text{neutral}} + N_{\text{negative}}} \]

Prediction 4 assumes that all neutral comments contribute to likes:

\[ \text{predicted like proportion} = \frac{N_{\text{positive}} + N_{\text{neutral}}}{N_{\text{positive}} + N_{\text{neutral}} + N_{\text{negative}}} \]
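The four prediction formulas differ only in how much weight neutral comments receive, so they can be sketched as one small function. The function name, the `neutral_weight` parameter, and the convention of returning `None` for an undefined proportion are illustrative assumptions, not part of the original method description.

```python
def predicted_like_proportion(n_pos, n_neu, n_neg, neutral_weight=None):
    """Predicted like proportion from counts of classified comments.

    neutral_weight=None -> prediction 1 (neutral comments ignored)
    neutral_weight=0.0  -> prediction 2 (neutral counts as dislike)
    neutral_weight=0.5  -> prediction 3 (neutral split evenly)
    neutral_weight=1.0  -> prediction 4 (neutral counts as like)
    """
    if neutral_weight is None:
        numerator = n_pos
        denominator = n_pos + n_neg
    else:
        numerator = n_pos + neutral_weight * n_neu
        denominator = n_pos + n_neu + n_neg
    if denominator == 0:
        # E.g. a video with only neutral comments under prediction 1:
        # the proportion is undefined, so the video is excluded.
        return None
    return numerator / denominator


# Hypothetical counts: 6 positive, 2 neutral, 2 negative comments.
for w in (None, 0.0, 0.5, 1.0):
    print(w, predicted_like_proportion(6, 2, 2, w))
```

Treating the neutral weight as a parameter also makes it clear that predictions 2 to 4 are three points on a continuum; any other weight between 0 and 1 would fit the same formula.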

Evaluation: The accuracy of all classifiers on the training dataset was calculated. Knowing the
actual and predicted like proportions on the YouTube trending dataset, the Pearson correlation,
the p-value for the Pearson correlation, mean absolute error, and standard deviation of
differences were calculated. This way, the performance of the four predictions across
all configurations of classifiers and training datasets could be compared.
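The evaluation statistics named above can be computed as in the following sketch, which uses only the standard library. The p-value is omitted because it needs a t-distribution; a statistics package such as SciPy would normally supply it. Using the sample (n - 1) standard deviation is an assumption, since the text does not say which variant was used, and the example values are made up.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted proportions."""
    return statistics.fmean([abs(a - p) for a, p in zip(actual, predicted)])


def std_of_differences(actual, predicted):
    """Sample standard deviation of (actual - predicted); assumes n - 1."""
    return statistics.stdev([a - p for a, p in zip(actual, predicted)])


# Hypothetical like proportions for four videos.
actual = [0.90, 0.80, 0.60, 0.95]
predicted = [0.85, 0.70, 0.65, 0.90]
print(pearson_r(actual, predicted))
print(mean_absolute_error(actual, predicted))
print(std_of_differences(actual, predicted))
```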

Result Interpretation

Result analysis is a critical phase in building a UPI fraud detection system as it assesses the
effectiveness and performance of the implemented solution.

Accuracy Assessment:

Evaluate the overall accuracy of the UPI fraud detection system by comparing the total number
of correctly identified fraudulent and non-fraudulent transactions against the total number of
transactions processed. This provides a high-level understanding of the system's efficacy.

Precision and Recall:

Calculate precision and recall to understand the trade-off between false positives and false
negatives. Precision measures the accuracy of positive predictions, while recall measures the
system's ability to capture all actual positives. Striking a balance between these metrics is crucial
for a reliable fraud detection system.

False Positive Rate:

Analyze the false positive rate, which indicates the proportion of legitimate transactions
incorrectly flagged as fraudulent. A low false positive rate is essential to minimize disruptions
for genuine users while maintaining effective fraud detection.

Receiver Operating Characteristic (ROC) Curve:

Plot an ROC curve to visualize the trade-off between true positive rate and false positive rate at
various thresholds. The area under the ROC curve (AUC) provides a comprehensive measure
of the model's performance, with a higher AUC indicating better overall performance.
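To make the ROC description concrete, here is a minimal sketch that sweeps a threshold over the model's scores, collects (false positive rate, true positive rate) points, and integrates them with the trapezoid rule. The function names and the four-transaction example are hypothetical; the code assumes binary labels (1 = fraud) and at least one example of each class.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points for thresholds taken from the scores.

    A transaction is flagged as fraud when its score >= threshold.
    Assumes y_true contains at least one 1 and at least one 0.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve via the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical labels and fraud scores for four transactions.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(y_true, scores)
print(curve)
print(auc(curve))  # 0.75 for this toy example
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of fraudulent from legitimate transactions, which is why a higher value indicates a better model.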

Confusion Matrix Analysis:

Break down the results using a confusion matrix to understand the number of true positives, true
negatives, false positives, and false negatives. This detailed analysis helps in identifying
specific areas for improvement and fine-tuning the model.
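The metrics discussed in this section all derive from the four confusion-matrix counts, as the sketch below shows. The function names and the ten-transaction example are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here for simplicity.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # Convention assumed here: report 0.0 when the ratio is undefined.
    return numerator / denominator if denominator else 0.0


def summarize(tp, fp, tn, fn):
    """Accuracy, precision, recall and false positive rate from counts."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": _ratio(tp, tp + fp),        # flagged cases that are fraud
        "recall": _ratio(tp, tp + fn),           # fraud cases that were caught
        "false_positive_rate": _ratio(fp, fp + tn),
    }


# Hypothetical results for ten transactions (3 fraudulent, 7 legitimate).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)
print(summarize(*counts))
```

On imbalanced fraud data, accuracy alone can look high even when recall is poor, so reading all four values together gives a more honest picture.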

Conclusion

As we progress into an increasingly digitized world, the importance of securing digital payment
systems cannot be overstated. The implementation paper on UPI fraud detection serves as a
proactive measure to mitigate risks, protect users, and foster the widespread adoption of digital
transactions. Hence, we conclude that UPI fraud detection using machine learning addresses this
need: the current landscape demands innovative solutions, and the development of a UPI fraud
detection system aligns with the imperative to create a secure and trustworthy environment for
financial transactions.

References:

[1] Aditya Oza, “Fraud Detection using Machine Learning”, https://github.com/aadityaoza/CS-229-project

[2] Kishori Dhanaji Kadam, Mrunal Rajesh Omanna, Sakshi Sunil Neje, and Shraddha Suresh Nandai,
“Online Transactions Fraud Detection using Machine Learning”, Volume 5, Issue 6, June 2023,
pp. 545-548, www.ijaem.net

[3] M. Valavan and S. Rita, “Predictive-Analysis-based Machine Learning Model for Fraud
Detection with Boosting Classifiers”, Computer Systems Science & Engineering.

[4] “Fraud Detection in Banking Data by Machine Learning Techniques”, IEEE journal article,
corresponding author: Seyedeh Leili Mirtaheri (Mirtaheri@khu.ac.ir), received 2 December 2022,
accepted 19 December 2022, date of publication 26 December 2022, date of current version
11 January 2023.
