Machine Learning For Detecting The Phishing Threats

This research presents a machine learning-based system for detecting phishing threats by analyzing websites. Various models, including Random Forests, Decision Trees, and K-Nearest Neighbors, are evaluated for their effectiveness in distinguishing between legitimate and phishing sites, with a hybrid model achieving the highest accuracy of 96%. The study emphasizes the importance of continuous adaptation of detection models to counter evolving phishing tactics and highlights the significance of feature extraction in improving classification results.

Proceedings of the 6th International Conference on Mobile Computing and Sustainable Informatics (ICMCSI-2025)

IEEE Xplore Part Number: CFP25US4-ART; ISBN: 979-8-3315-2266-7

Machine Learning for Detecting the Phishing Threats

Mohammed Ali Shaik, SR University, Warangal, Telangana-506371, India, niharali@gmail.com
Gangula Rakshitha, SR University, Warangal, Telangana-506371, India, sairakshithagangula@gmail.com
Katakam Saipriya, SR University, Warangal, Telangana-506371, India, saipriya.katakam22@gmail.com
Thakalapally Thrisha, SR University, Warangal, Telangana-506371, India, thrisharaothakkallapeli@gmail.com
Morthala Varshini, SR University, Warangal, Telangana-506371, India, varshinimorthala@gmail.com
Jasti Geethika Sai, SR University, Warangal, Telangana-506371, India, minnujasti@gmail.com

2025 6th International Conference on Mobile Computing and Sustainable Informatics (ICMCSI) | 979-8-3315-2266-7/25/$31.00 ©2025 IEEE | DOI: 10.1109/ICMCSI64620.2025.10883227

Abstract— This research builds a system that uses machine learning to detect whether a website is a phishing site, with the aim of enhancing cybersecurity and safeguarding user data. First, a dataset of legitimate and phishing websites is collected for training and evaluating different detection models. Random Forest, Decision Tree and K-Nearest Neighbors (KNN) are used, with a discussion of the performance of Random Forest and of a suggested hybrid model. The models proved accurate at separating legitimate from fraudulent websites and underscore the importance of feature extraction and selection strategies for boosting classification results. This work highlights the need to adapt phishing detection models continuously, because the techniques employed by cybercriminals change continually. It aims to contribute to the field of cybersecurity by implementing such adaptive systems to give users a more secure internet environment.

Keywords— Phishing Detection, Machine Learning, Cybersecurity, Random Forest, Decision Trees, Naive Bayes

I. INTRODUCTION

Phishing is one of the most dangerous cybersecurity threats: it deceives victims into disclosing personal and financial information such as login credentials, bank account numbers or credit card numbers. These attacks harm individuals and organisations through identity theft and data breaches. As phishing methods become more elaborate, there is a real need to detect and prevent such activities more effectively.

This research creates a system that detects phishing websites using machine learning. Various algorithms, namely Decision Tree, Random Forest, K-Nearest Neighbors (KNN), Naive Bayes and XGBoost, are tested on a dataset containing both legitimate and phishing websites. Random Forest on its own reaches 95% accuracy, while XGBoost obtains 93%; when the models are combined by a weighted average, the hybrid model Random Forest + XGBoost + KNN reaches 96% and proves highly effective at identifying fake websites.

The features that give the models their best test performance are also examined. Better accuracy is derived from characteristics that help tell phishing websites apart from legitimate ones. This also shows how crucial it is that detection systems, and by extension defence systems, remain current to counter new and emerging forms of phishing. The overall aim is to build a reliable and adaptive system that supports website security and defends users against phishing threats.

II. LITERATURE REVIEW

ML is an important tool for identifying phishing by analyzing particular features of phishing websites and emails. Studies based on both supervised and unsupervised learning report steadily improving detection rates [2]. The most used ML algorithms for phishing detection [3] are Decision Tree, Support Vector Machine and hybrid models (combinations of several models).

Feature engineering is essential in these systems: they extract [4] characteristics that distinguish phishing attacks from legitimate websites, such as URL structure, domain age and content analysis.

Nevertheless, these advances still leave challenges, most notably building models that can keep up with ever-changing attacker tactics. In addition, detection systems would need more collaborative work between various disciplines [5-11].

Future research should leverage privacy-preserving techniques and incorporate natural language processing in order to develop detection capabilities that go beyond URL-based schemes [12]. Since attackers will always come up with new ways to attack, rigorous work is needed to keep researching and innovating in phishing detection techniques [13].

Phishing schemes remain one of the most common types of cyber threat, targeting the weaknesses of both people and technologies [14]. Historical approaches such as blacklisting and rule-based filtering cannot keep up with modern phishing [15]. Recent work has investigated the use of ML to counter the risks associated with phishing attacks and has found that automatic detection enhances effectiveness and efficiency [16]. Classification algorithms such as decision trees, support vector machines (SVM) and deep neural networks have also been applied to distinguish a phishing from a non-phishing


message [17]. Authors have also noted the role of feature extraction based on URL features, domain metadata and the content of emails in establishing the presence of phishing indicators [18]. Ensemble models and hybrid frameworks combine multiple algorithms to decrease false positives and increase precision [19]. Furthermore, developments in real-time detection systems using NLP and reinforcement learning address highly sophisticated phishing attacks [20]. All these advancements, however, come with downsides, including added complexity in dealing with class imbalance, robustness to adversarial attacks and scalability in a constantly evolving context, which points towards future work on this topic [21].

III. PROPOSED METHODOLOGY
The methodology implemented in this research is depicted in Fig. 1.

A. Decision Tree for Phishing Detection
• Data Collection and Pre-processing: A dataset of labeled instances of phishing and legitimate URLs was collected. Relevant features such as domain length, HTTPS usage, presence of special characters and other metadata that help to separate phishing URLs from legitimate URLs were extracted.
• Data Pre-processing: Missing values were handled, inconsistencies were corrected and outliers were removed from the data. Categorical data was encoded into numerical values. Decision trees are generally insensitive to feature scaling; numerical features were nevertheless standardized where necessary.
• Building the Decision Tree Model: The decision tree was trained by recursively splitting the data on the feature that minimizes impurity at each node. Impurity was calculated using the Gini index or entropy.
• Model Evaluation: The ability of the model was determined in terms of the evaluation metrics.
• Model Interpretation and Tuning: The feature importance was checked, and hyperparameters such as the number of trees, the maximum depth and the minimum number of samples per split were tuned.
• Deployment and Monitoring: The trained model was integrated into the phishing detection system and continuously monitored for effectiveness.

Fig. 1. Proposed Model

B. Mathematical Model
A Decision Tree classifier predicts the class $\hat{y}$ for a sample $x$ as the class at the leaf node reached by traversing the tree according to the sample's feature values. The final prediction at a leaf node can be expressed as:

$$\hat{y} = \arg\max_{c} \, p_c \tag{1}$$

where $c$ ranges over the possible classes and $p_c$ is the proportion (or probability) of samples in the leaf node that belong to class $c$. That is, $\hat{y}$ is the class with the largest proportion in the leaf node.

For Random Forest, a labeled dataset of phishing and legitimate URLs was used, and feature extraction followed the same procedure as for the decision tree model. Missing values, outliers and inconsistencies were dealt with. No normalization or standardization was required, as Random Forest does not need feature scaling. The same metrics as for the decision tree model were used to measure performance, and the trained Random Forest model was then incorporated into the phishing detection system, where its effectiveness was continuously monitored. Random Forest is an ensemble learning technique: during training it constructs multiple decision trees, and the class that is the mode of the individual trees' predictions forms the output of the model:

$$\hat{y} = \mathrm{mode}\big(h_1(x), h_2(x), \ldots, h_T(x)\big) \tag{2}$$

where $h_i(x)$ is the prediction of the $i$-th tree and $T$ is the number of trees.

KNN, however, is sensitive to the scale of the features, so the data was further cleaned and normalized before applying it. An appropriate distance metric (Euclidean or Manhattan) and the number of neighbours $k$ are chosen, and model performance is evaluated with the same metrics. Imbalanced classes were handled using oversampling or undersampling. KNN is a non-parametric algorithm that classifies a sample by the majority class among its $k$ nearest neighbours:

$$\hat{y} = \mathrm{mode}\big(y_{i_1}, y_{i_2}, \ldots, y_{i_k}\big) \tag{3}$$

where $y_{i_1}, y_{i_2}, \ldots, y_{i_k}$ are the labels of the $k$ nearest neighbours.

For continuous features, the Gaussian variant of Naive Bayes is trained and evaluated using the same evaluation metrics. Naive Bayes is a Bayesian probabilistic classifier that assumes independence of the features. It calculates the posterior probability:

$$P(y \mid x_1, x_2, \ldots, x_n) = \frac{P(x_1, x_2, \ldots, x_n \mid y) \, P(y)}{P(x_1, x_2, \ldots, x_n)} \tag{4}$$
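As a minimal sketch, the prediction rules in (1) to (3) and the Gini impurity used to choose splits can be written in pure Python. The function names, the leaf counts, the tree votes and the two-feature toy data (URL length, number of special characters) are illustrative assumptions and are not taken from the paper's dataset or implementation.

```python
from collections import Counter
import math


def leaf_prediction(class_counts):
    """Eq. (1): return the class with the largest proportion p_c in a leaf."""
    total = sum(class_counts.values())
    proportions = {c: n / total for c, n in class_counts.items()}
    return max(proportions, key=proportions.get)


def gini(class_counts):
    """Gini impurity of a node: 1 - sum_c p_c^2 (0 means a pure node)."""
    total = sum(class_counts.values())
    return 1.0 - sum((n / total) ** 2 for n in class_counts.values())


def forest_prediction(tree_votes):
    """Eq. (2): mode of the individual tree predictions h_1(x)..h_T(x)."""
    return Counter(tree_votes).most_common(1)[0][0]


def knn_prediction(train_x, train_y, x, k=3):
    """Eq. (3): mode of the labels of the k nearest neighbours (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


# Hypothetical toy data: (URL length, number of special characters).
train_x = [(20, 1), (25, 0), (22, 2), (90, 9), (85, 7), (100, 12)]
train_y = ["legitimate"] * 3 + ["phishing"] * 3

print(leaf_prediction({"phishing": 8, "legitimate": 2}))
print(round(gini({"phishing": 8, "legitimate": 2}), 2))
print(forest_prediction(["phishing", "legitimate", "phishing"]))
print(knn_prediction(train_x, train_y, (88, 8), k=3))
```

In this toy setting the features are on similar scales; with real features of very different ranges, the normalization step described for KNN would be applied before computing distances.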

In (4), the class-conditional likelihood $P(x_1, x_2, \ldots, x_n \mid y)$ is assumed to be Gaussian.

Fig. 2. Layers of Proposed Model

The proposed model integrates four different machine learning algorithms for phishing detection. This hybrid approach aims to take advantage of the specific strengths of each algorithm and to combine them for the best phishing detection.

C. Data Collection and Preprocessing:
The proposed method begins with standard data collection and preprocessing, which is common to all individual models. This includes creating a labeled dataset of phishing and legitimate websites, emails or other parts of the system. The dataset is then preprocessed (cleaning, handling missing values, encoding categorical variables and normalizing or standardizing features) before the models are trained on it.

D. Proposed Algorithm
Algorithm: Proposed Algorithm
Input: Phishing dataset
Output: Comparative accuracy of the models
Step 1: Clean the data and replace missing values
Step 2: Split the data into training and test sets
Step 3: Select features based on external links and the presence of suspicious keywords
Step 4: Train the models (Random Forest, KNN and XGBoost) to analyze the data and generate useful features
  Step 4.1: Load the training and test splits of the dataset
  Step 4.2: Train the model, predict on the test data and evaluate the metrics
  Step 4.3: Evaluate each model using the same metrics and compare it with the other algorithms
Step 5: Ensemble Model-1: KNN
  Step 5.1: Capture local data distributions, which may help detect phishing by identifying subtle differences between benign and malicious websites
  Step 5.2: Distinguish borderline cases where phishing and legitimate sites appear similar but exhibit minor differences
Step 6: Ensemble Model-2: Random Forest
  Step 6.1: Identify critical phishing-related features
  Step 6.2: Use feature importance to select the most influential features for classification
  Step 6.3: Identify key patterns among the selected features
Step 7: Ensemble Model-3
  Step 7.1: Exploit feature interactions and handle residual misclassifications
  Step 7.2: Handle feature complexity over suspicious keywords
  Step 7.3: Model non-linear relationships in the data to improve predictive performance
Step 8: Generate a comparison of all the models with the hybrid model
Step 9: Stop.

E. Model Training:
The Decision Tree, Random Forest, K-Nearest Neighbors (KNN) and Naive Bayes classifiers are trained separately in the training phase. Each model is trained on the preprocessed training data so that it learns the patterns and associations of the features in the dataset.

The Decision Tree model partitions the data in a tree-like structure by iteratively splitting on the features that yield the maximum information gain, and is highly interpretable. The Random Forest model is trained by constructing multiple decision trees and aggregating their predictions for better accuracy and a reduced risk of overfitting.

KNN is a non-parametric algorithm; K stands for the number of neighbours, and a sample is classified from the training points in its vicinity. Lastly, a probabilistic model, the Naive Bayes classifier, is trained; it applies Bayes' theorem to predict class probabilities and works particularly well if the features are independent. To capture the various aspects of the data, each algorithm is trained in isolation and the models are combined later using ensemble methods to get better accuracy.

F. Model Evaluation:
After training, the models were evaluated and their performance on key classification metrics was reported. The metrics used throughout the evaluation are accuracy, precision, recall and F1-score.

Accuracy measures the overall percentage of correct predictions (regardless of whether phishing or legitimate is predicted), while precision tells us how many of the sites predicted to be phishing actually are phishing. Recall, on the other hand, tells us how many of the actual phishing instances the model correctly identified.

The F1-score is the harmonic mean of precision and recall, which is particularly important for an imbalanced dataset, such as phishing detection with a rare class (phishing websites) and a common class (legitimate websites).

Each model is evaluated on a test dataset not used during training, so that the evaluation is fair and unbiased. By comparing these metrics across the different models, the strengths and weaknesses of each algorithm can be found. This evaluation identifies the best-performing models, which are suitable for incorporation into the ensemble method, so that the final phishing detection system leverages the merits of each individual algorithm.

G. Ensemble
An ensemble approach is used to combine the predictions of the individual models to obtain an improved overall performance of the phishing detection system. There are two common ways of combining models: a Voting Classifier and Stacking. With the voting classifier, every model in the ensemble makes a prediction and the final prediction is decided by majority voting. This technique reduces the bias and variance of the individual models, which results in more precise outcomes. Stacking, conversely, uses a meta-model that accepts the predictions of the base models and combines them to produce the final output. The meta-model is itself a machine learning model, trained on the outputs of the base models, and frequently yields superior performance because it learns how best to combine the individual predictions. These strengths are used to create an ensemble in which each base model (Random Forest, XGBoost and KNN in the hybrid model) compensates for the weaknesses of the others.
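The majority vote described above, and the weighted-average combination that the introduction reports for the hybrid model, can be sketched in a few lines of pure Python. The per-model phishing probabilities, the weights and the 0.5 decision threshold below are hypothetical placeholders, not values from the paper; in practice they would come from the trained base models and from validation performance.

```python
from collections import Counter


def majority_vote(labels):
    """Hard voting: return the label predicted by most base models."""
    return Counter(labels).most_common(1)[0][0]


def weighted_soft_vote(probabilities, weights, threshold=0.5):
    """Soft voting: weighted average of each model's phishing probability.

    probabilities and weights are dicts keyed by model name. Returns the
    averaged score and the resulting label.
    """
    total_weight = sum(weights[name] for name in probabilities)
    score = sum(probabilities[name] * weights[name]
                for name in probabilities) / total_weight
    label = "phishing" if score >= threshold else "legitimate"
    return score, label


# Hypothetical outputs of three base models for one URL (assumed values).
probs = {"random_forest": 0.80, "xgboost": 0.70, "knn": 0.40}
weights = {"random_forest": 0.4, "xgboost": 0.4, "knn": 0.2}

hard_labels = ["phishing" if p >= 0.5 else "legitimate" for p in probs.values()]
print(majority_vote(hard_labels))
score, label = weighted_soft_vote(probs, weights)
print(round(score, 2), label)
```

Stacking would replace the fixed weights with a meta-model fitted on the base models' outputs; that step is omitted here to keep the sketch dependency-free.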
H. Model Interpretation and Tuning:
Once the ensemble has been created and measured, the next step is to explain the performance of the models that compose it. For Decision Trees and Random Forests, for example, feature importance analysis shows which features drive the decision. This step makes clear whether the models are learning appropriate indicators and, if so, what they are. Finally, hyperparameter tuning is important to increase the performance of each of these models and of the ensemble method as well. To find an optimal combination of hyperparameters, such as the depth of the trees, grid search or random search techniques are typically used. Tuning these hyperparameters can significantly improve the accuracy and efficiency of the models in detecting phishing attacks.

I. Handling Class Imbalance:
Phishing detection systems often suffer from a class imbalance problem: the number of legitimate websites is much larger than the number of phishing websites. To tackle this imbalance, different class balancing methods are applied.

Fig. 3. Comparison of results

Fig. 4. Comparison of results

J. Deployment and Monitoring:
Once training, tuning and evaluation of the ensemble model are finished, deployment begins. Real-world applications of phishing detection include email filters, browser security tools and network monitoring systems. However, as with any attack technique, phishing techniques are always changing, so monitoring must not stop or the model will become useless. The model is periodically retrained on new data so that it can adjust to new phishing tactics and retain high detection accuracy. Furthermore, performance metrics need to be monitored constantly so that deteriorating model performance is noticed and corrected. By creating this ongoing feedback loop, the system can stay up to date and deliver effective phishing detection against changing threats.
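The feedback loop described above can be sketched as a periodic check of the evaluation metrics on recently labelled traffic. The weekly confusion counts, the baseline recall of 0.98 and the tolerance of 0.05 are illustrative assumptions; the paper does not specify a retraining trigger, so this is only one possible rule.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def needs_retraining(recent_recall, baseline_recall, tolerance=0.05):
    """Flag the model when recall drops more than `tolerance` below baseline."""
    return (baseline_recall - recent_recall) > tolerance


# Hypothetical weekly confusion counts: (true pos., false pos., false neg.).
weekly_counts = [(98, 4, 2), (95, 5, 5), (88, 6, 12)]
baseline_recall = 0.98  # assumed baseline, e.g. recall measured at deployment

for week, (tp, fp, fn) in enumerate(weekly_counts, start=1):
    precision, recall, f1 = precision_recall_f1(tp, fp, fn)
    flag = needs_retraining(recall, baseline_recall)
    print(f"week {week}: precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1:.2f} retrain={flag}")
```

Recall is monitored here because a missed phishing site is usually the costlier error; a deployment more concerned with false alarms could apply the same check to precision.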

TABLE I. COMPARISON OF VARIOUS MODELS
Model          Accuracy  Precision  Recall  F1 Score
Decision Tree  0.93      0.89       0.97    0.93
Random Forest  0.96      0.95       0.98    0.96
KNN            0.94      0.91       0.97    0.94
Naive Bayes    0.77      0.69       1.00    0.82
XGBoost        0.95      0.93       0.98    0.95
Hybrid Model   0.97      0.96       0.98    0.97

Fig. 5. Heatmap representing performance metrics

Fig. 5 provides a visual comparison of the machine learning models. The darker the color, the higher the value of that metric. The heatmap shows that the Hybrid Model is the best overall performer, with the highest Accuracy (0.97), Precision (0.96) and F1 Score (0.97), and a Recall (0.98) exceeded only by Naive Bayes. The values of the Random Forest and XGBoost models are also strong and close to those of the Hybrid Model. Naive Bayes, on the other hand, performs poorly except in Recall (1.00); its Accuracy and Precision are markedly lower. This shows that Naive Bayes is highly sensitive in detecting positive cases but fails to identify negative cases accurately.

Table I presents the performance metrics of the considered algorithms along with the Hybrid Model. Of all the models, the Hybrid Model has the highest Accuracy (97%), Precision (96%) and F1 Score (97%), with a Recall of 98%, making it the best model and the one that best balances Precision and Recall. Both Random Forest and XGBoost achieve high Accuracy (96% and 95% respectively) and excellent Precision and Recall, making them robust models for classification tasks. KNN follows closely behind, with a slightly lower Precision (91%) but very good Accuracy and Recall. While having perfect Recall (100%), Naive Bayes lacks Precision (69%) and Accuracy (77%): it overpredicts the positive class and produces a large number of false positives.

The proposed algorithm addresses noise by tuning the existing hyperparameters and thins out the datasets through techniques such as feature selection and ensembling, in order to improve the accuracy of the machine learning models. Accuracy is further enhanced by using balanced datasets, selecting highly informative features and choosing precision-oriented loss functions that minimize false positives when detecting phishing. The system can be secured through strong encryption algorithms, real-time threat intelligence and secure deployment models, which strengthen the protection of the phishing detection system against adversarial attacks. Phishing detection relies on labelled datasets to train machine learning models that separate phishing from legitimate instances, focusing on URL patterns, domain names and characteristics of e-mails. Features are derived from phishing-related characteristics, which include lexical features, HTTP headers, URL length, domain age and attributes of the content of emails and websites; the selected supervised learning models, which include decision trees, SVMs or neural networks, are trained on them and classify inputs as phishing or not phishing. Inputs are normalized by scaling: Min-Max scaling brings values into the range 0 to 1, while Z-score normalization rescales them to zero mean and unit variance. Before deployment, the model is assessed using accuracy, precision, recall, F1-score and ROC with AUC to determine its effectiveness, since not all emails predicted as spam are phishing attempts, and to ensure that it does not wrongly mark genuine emails as phishing attempts.

IV. CONCLUSION
In this research, a hybrid approach to detecting phishing that uses multiple machine learning algorithms, such as Decision Tree, Random Forest, KNN and Naive Bayes, has been shown to be effective. The hybrid model combines the strengths of the individual classifiers through advanced data pre-processing, model training and the application of ensemble techniques such as Voting Classifier and Stacking. This makes the system accurate, precise and robust, promising reliable predictions along with an ability to accommodate changing phishing tactics. The results demonstrate the practical applicability of the system in real-world phishing URL detection scenarios.

REFERENCES
[1] B. B. Gupta, A. Gaurav, P. Chaurasia and K. T. Chui, "Lightweight Deep Learning Model and Genetic Algorithm Based Optimal Phishing Website Detection," 2024 IEEE 13th Global Conference on Consumer Electronics (GCCE), Kitakyushu, Japan, 2024, pp. 1403-1404, doi: 10.1109/GCCE62371.2024.10760723.
[2] B. Desai, K. Patil, I. Mehta and A. Patil, "Explainable AI in Cybersecurity: A Comprehensive Framework for enhancing transparency, trust, and Human-AI Collaboration," 2024 International Seminar on Application for Technology of Information and Communication (iSemantic), Semarang, Indonesia, 2024, pp. 135-150, doi: 10.1109/iSemantic63362.2024.10762690.
[3] H. Rangani and K. Chandrashekar, "Detection and Prevention of Cyber Threats in Smart Cities Using Machine Learning and Intrusion Detection Systems," 2024 2nd International Conference on Self Sustainable Artificial Intelligence Systems (ICSSAS), Erode, India, 2024, pp. 1232-1237, doi: 10.1109/ICSSAS64001.2024.10760393.
[4] M. A. Shaik and N. L. Sri, "A Comparison of Stock Price Prediction Using Machine Learning Techniques," 2024 5th International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 2024, pp. 1-5, doi: 10.1109/ICESC60852.2024.10689767.
[5] Mohammed Ali Shaik, Geetha Manoharan, B Prashanth, Nune Akhil, Anumandla Akash and Thudi Raja Shekhar Reddy, (2022), "Prediction of Crop Yield using Machine Learning", International Conference on Research in Sciences, Engineering & Technology, AIP Conf. Proc. 2418, https://doi.org/10.1063/5.0081726, Published by AIP Publishing. 978-0-7354-4368-6, pp. 020072-1 to 020072-8
[6] Z. Fu, S. Acharya, S. H. H. Ding, Y. Zhu, J. Fu and C. Xu, "Leveraging Human Knowledge in Large Language Model for Obfuscation-Resisted Phishing URL Detection," 2024 Ninth International Conference On Mobile And Secure Services (MobiSecServ), Miami Beach, FL, USA, 2024, pp. 1-9, doi: 10.1109/MobiSecServ63327.2024.10760006.
[7] Mohammed Ali Shaik and Dhanraj Verma, (2022), "Predicting Present Day Mobile Phone Sales using Time Series based Hybrid Prediction Model", International Conference on Research in Sciences, Engineering & Technology, AIP Conf. Proc. 2418, https://doi.org/10.1063/5.0081722, Published by AIP Publishing. 978-0-7354-4368-6, pp. 020073-1 to 020073-9
[8] E. M. Damatie, A. Eleyan and T. Bejaoui, "Real-Time Email Phishing Detection Using a Custom DistilBERT Model," 2024 International Symposium on Networks, Computers and Communications (ISNCC), Washington DC, DC, USA, 2024, pp. 1-6, doi: 10.1109/ISNCC62547.2024.10759011.
[9] Mohammed Ali Shaik and Dhanraj Verma, (2022), "Prediction of Heart Disease using Swarm Intelligence based Machine Learning Algorithms", International Conference on Research in Sciences, Engineering & Technology, AIP Conf. Proc. 2418, https://doi.org/10.1063/5.0081719, Published by AIP Publishing. 978-0-7354-4368-6, pp. 020025-1 to 020025-9
[10] M. Hajarian, P. Diaz and I. Aedo, "On Privacy, Security and Trust for Misuse Prevention in Social Networks," 2024 International Symposium on Networks, Computers and Communications (ISNCC), Washington DC, DC, USA, 2024, pp. 1-4, doi: 10.1109/ISNCC62547.2024.10759034.
[11] Mohammed Ali Shaik, Praveen Pappula, T. Sampath Kumar, Battu Chiranjeevi, "Ensemble model based prediction of hypothyroid disease using through ML approaches", International Conference on Research in Sciences, Engineering, and Technology, AIP Conf. Proc., 2971, 020038 (2024), https://doi.org/10.1063/5.0196055
[12] H. Zouahi, C. Talhi and O. Boudar, "VizCheck: Enhancing Phishing Attack Detection through Visual Domain Name Homograph Analysis," 2024 7th Conference on Cloud and Internet of Things (CIoT), Montreal, QC, Canada, 2024, pp. 1-8, doi: 10.1109/CIoT63799.2024.10757049.
[13] Mohammed Ali Shaik, M. Varshith, S. SriVyshnavi, N. Sanjana and R. Sujith, "Laptop Price Prediction using Machine Learning Algorithms", 2022 International Conference on Emerging Trends in Engineering and Medical Sciences (ICETEMS), Nagpur, India, 2022, pp. 226-231, doi: 10.1109/ICETEMS56252.2022.10093357.
[14] M. I. Ragab, R. O. Bakr and H. K. Aslan, "Advanced Phishing Detection in Ethereum Blockchain Transactions Using Machine Learning Models," 2024 6th Novel Intelligent and Leading Emerging Sciences Conference (NILES), Giza, Egypt, 2024, pp. 331-336, doi: 10.1109/NILES63360.2024.10753229.
[15] M. A. Shaik, R. Sreeja, S. Zainab, P. S. Sowmya, T. Akshay and S. Sindhu, "Improving Accuracy of Heart Disease Prediction through Machine Learning Algorithms", 2023 International Conference on Innovative Data Communication Technologies and Application (ICIDCA), Uttarakhand, India, 2023, pp. 41-46, doi: 10.1109/ICIDCA56705.2023.10100244.
[16] A. Ehsan et al., "Enhanced Anomaly Detection in Ethereum: Unveiling and Classifying Threats with Machine Learning," in IEEE Access, doi: 10.1109/ACCESS.2024.3504300.
[17] Mohammed Ali Shaik, P. Praveen, T. Sampath Kumar, Masrath Parveen, Swetha Mucha, "Machine learning based approach for predicting house price in real estate", International Conference on Research in Sciences, Engineering, and Technology, AIP Conf. Proc. 2971, 020041-1–020041-5; https://doi.org/10.1063/5.0196051
[18] R. Chataut, Y. Usman, C. M. A. Rahman, S. Gyawali and P. K. Gyawali, "Enhancing Phishing Detection with AI: A Novel Dataset and Comprehensive Analysis Using Machine Learning and Large Language Models," 2024 IEEE 15th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), Yorktown Heights, NY, USA, 2024, pp. 0226-0232, doi: 10.1109/UEMCON62879.2024.10754710.
[19] Mohammed Ali Shaik and Dhanraj Verma, (2022), "Prediction of Heart Disease using Swarm Intelligence based Machine Learning Algorithms", International Conference on Research in Sciences, Engineering & Technology, AIP Conf. Proc. 2418, https://doi.org/10.1063/5.0081719, Published by AIP Publishing. 978-0-7354-4368-6, pp. 020025-1 to 020025-9
[20] Mohammed Ali Shaik, Praveen Pappula, T Sampath Kumar, "Predicting Hypothyroid Disease using Ensemble Models through Machine Learning Approach", European Journal of Molecular & Clinical Medicine, 2022, Volume 9, Issue 7, Pages 6738-6745. https://ejmcm.com/article_21010.html
[21] P. Bhatt, M. S. Obaidat, G. Dangwal, A. K. Das, M. Wazid and B. Sadoun, "Machine Learning-Based Security Mechanism for Detecting Phishing Attacks," 2024 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI), Beijing, China, 2024, pp. 1-6, doi: 10.1109/CCCI61916.2024.10736460.

Authorized licensed use limited to: PES Institute of Technology & Management. Downloaded on March 14,2025 at 11:07:55 UTC from IEEE Xplore. Restrictions apply.
