PHISHING DETECTION SYSTEM SYNOPSIS

Uploaded by

amangupta8168
Project Team No: 81

SRM UNIVERSITY DELHI-NCR, SONEPAT
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

Course Code: 21CS4117
Project Title: PHISHING DETECTION SYSTEM
Semester & Year: 7th SEM, 4th YEAR
Student's Name & Id: AMAN KUMAR GUPTA (11021210074), ANSH KUMAR NIMBORIA (11021210089)
Name of Supervisor: MR. SATISH SINGH MEKALE
Name of Co-Supervisor (if any):

SRM University, Sonepat
Final Year Project
Computer Science & Engineering Department

Session: 2024-25
Program: Data Science and Artificial Intelligence
Department: Computer Science & Engineering
Project Title: Phishing Detection System

Project Abstract:
Phishing attacks are a growing threat to digital security, targeting individuals and organizations by deceiving users into revealing sensitive information, often leading to financial loss and identity theft. This project investigates the effectiveness of various machine learning and deep learning algorithms in detecting phishing websites. We explore advanced techniques like recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and pre-trained language models like BERT, alongside traditional methods such as random forests and isolation forests. The dataset consists of phishing website records, focusing on URLs and phishing types, categorized as benign, defacement, malware, and phishing. By analysing these algorithms, the project aims to identify the strengths and weaknesses of each approach, contributing to improved detection methods and stronger cybersecurity measures to combat phishing threats.

Goals:
• Develop an Efficient Phishing Detection System: Create a robust machine learning model capable of identifying phishing URLs and emails with high accuracy.
• Real-Time Detection: Implement a real-time detection mechanism that can quickly classify and alert users about phishing attempts without noticeable delays.
• Adaptability to Evolving Threats: Ensure the detection system can evolve and adapt to new phishing techniques as attackers constantly change their strategies.

Objectives:
• Data Collection: Gather a large dataset of legitimate and phishing URLs, emails, and other attack vectors for training and testing purposes.
• Feature Engineering: Identify key features (e.g., URL length, domain age, presence of suspicious keywords, etc.) that are indicative of phishing attempts.
• Model Selection: Use advanced machine learning models such as Random Forests, Gradient Boosting, or deep learning models like BERT for textual analysis.

Project Scope:
The purpose of this project is to evaluate the effectiveness of various machine learning and deep learning algorithms in detecting phishing websites to improve cybersecurity. Phishing attacks are a growing threat, often leading to financial loss and identity theft by tricking users into revealing sensitive information. This study will analyse both advanced techniques like Recurrent Neural Networks (RNNs), Long Short-Term Memory Networks (LSTMs), and pre-trained models like BERT, as well as traditional methods like random forests and isolation forests, to compare their strengths and weaknesses in detecting phishing websites. The project utilizes a phishing website dataset that includes URLs and classifications into categories like benign, defacement, malware, and phishing. By analysing these types, the study seeks to determine how well each algorithm can detect and classify phishing threats. The goal is to provide actionable insights to enhance phishing detection systems, ultimately contributing to the development of stronger cybersecurity defences to protect users and organizations from phishing attacks.

Tools / Technologies to be used in Project:
• Programming Languages:
Python: Python is a flexible, high-level programming language that is well known for being readable and simple to use.
It supports several programming paradigms, such as functional, object-oriented, and procedural programming. Python suits a wide range of activities, including web development, data analysis, machine learning, and automation, because of its dynamic typing, extensive libraries, and vibrant community. Its extensive ecosystem and simple syntax have made it one of the most widely used languages for both novices and experts in a variety of industries.

• Machine Learning and Deep Learning Frameworks:
TensorFlow/Keras: TensorFlow is an open-source machine learning framework created by Google that makes it easy for programmers to create and deploy machine learning models. It offers resources for large-scale machine learning challenges, deep learning, and neural networks. Keras is a high-level API that runs on top of TensorFlow and is meant to make deep learning model creation easier. For both novice and seasoned developers, it provides an easy-to-use interface for creating and refining neural networks. TensorFlow and Keras work well together to create AI applications.

PyTorch: PyTorch is an open-source machine learning framework created by Facebook's AI Research lab. It is extensively used for artificial intelligence and deep learning applications, especially in research, because its dynamic computation graph allows more flexible and natural model creation. PyTorch is well liked for applications such as computer vision, reinforcement learning, and natural language processing because it offers robust support for GPU acceleration. It is a favourite among researchers and developers because of its simple Pythonic interface and the ability to debug models with normal Python tools.

Scikit-Learn: Scikit-learn is a popular open-source machine learning library for Python used for data mining and analysis. For a variety of applications, such as dimensionality reduction, clustering, regression, and classification, it offers straightforward and effective techniques.
Known for its user-friendliness and thoroughly documented API, Scikit-learn is built on top of NumPy, SciPy, and Matplotlib. It is highly regarded by both novices and specialists in data science since it is well suited to using machine learning algorithms in real-world settings as well as for research and teaching.

• Pre-trained Language Models:
BERT (Bidirectional Encoder Representations from Transformers): BERT is a widely used natural language processing (NLP) model developed by Google. It is based on the Transformer architecture and processes text bidirectionally, meaning it looks at the context of a word from both the left and right sides in a sentence. This enables BERT to capture the nuanced meaning of words in context more effectively than earlier unidirectional models. BERT is pre-trained on large text datasets and can be fine-tuned for specific tasks like text classification, question answering, and named entity recognition. It has significantly advanced NLP applications, improving the accuracy of models in various language understanding tasks.

• Natural Language Processing (NLP) Libraries:
NLTK/spaCy: NLTK (Natural Language Toolkit) and spaCy are two popular Python libraries for Natural Language Processing (NLP), each with distinct use cases:
NLTK: A comprehensive, academic-focused library that provides tools for text analysis, tokenization, stemming, parsing, and more. It includes a wide variety of linguistic resources like corpora and lexical tools, making it well suited to research, education, and experimentation. However, it can be slower and less efficient for large-scale applications.
spaCy: A fast, production-oriented library designed for efficient handling of large text data. It excels at tasks like tokenization, named entity recognition, dependency parsing, and part-of-speech tagging.
spaCy also integrates smoothly with deep learning frameworks, making it better suited for industrial NLP applications.

Ph (Res): | Office: | Mobile: 9131539920

Group Students:
No. | Uni. ID | Name | E-mail | Ph (Mob) | Signature
1 | 11021210074 | Aman Kumar Gupta | amangupta8168@gmail.com | 8168037501 |
2 | 11021210089 | Ansh Kumar Nimboria | ansh4303@gmail.com | 9717379445 |
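
Illustrative sketch for the Feature Engineering objective: the synopsis names URL length and the presence of suspicious keywords as candidate features but does not specify how they are computed. The following is a minimal sketch of one possible approach, not the project's actual implementation. The function name `extract_url_features`, the keyword list, and the extra features (dots in the host, IP-address host, HTTPS flag) are all assumptions made for illustration. Domain age is left out because it needs an external WHOIS lookup.

```python
import re
from urllib.parse import urlparse

# Hypothetical keyword list chosen for illustration; not taken from the synopsis.
SUSPICIOUS_KEYWORDS = ("login", "verify", "update", "secure", "account", "bank")

# Matches hosts written as a dotted IPv4 address, e.g. 192.168.0.1
_IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def extract_url_features(url: str) -> dict:
    """Return a few simple lexical features for a single URL."""
    # urlparse only fills in the host when a scheme is present,
    # so assume http:// for bare URLs such as "example.com/path".
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": int("@" in url),
        "host_is_ipv4": int(bool(_IPV4_RE.match(host))),
        "uses_https": int(parsed.scheme == "https"),
        "suspicious_keyword_count": sum(k in lowered for k in SUSPICIOUS_KEYWORDS),
    }


if __name__ == "__main__":
    for example in (
        "https://www.example.com/",
        "http://192.168.0.1/secure-login/verify.php",
    ):
        print(example, extract_url_features(example))
```

Each dictionary can be turned into one row of a numeric feature table, which is the kind of input that tree-based classifiers such as the Random Forests named under Model Selection typically expect. Text-based models like BERT would instead work from the raw URL or email text.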
