0% found this document useful (0 votes)

34 views7 pages

IJCRT23A5429

The document discusses using machine learning techniques to predict spam emails. It begins by introducing the problem of spam emails and how machine learning can help detect spam. It then describes preprocessing a dataset from Kaggle by translating it to Urdu and extracting word tokens. Finally, it outlines the methodology, including importing and preprocessing the data, using Count Vectorization to analyze word frequencies, and developing a model in Python using the Vowpal Wabbit algorithm on the Azure machine learning platform. The goal is to classify emails as spam or ham using these machine learning methods.

Uploaded by

abhiram2003pgd

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

34 views7 pages

IJCRT23A5429

Uploaded by

abhiram2003pgd

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 7

www.ijcrt.

SPAM MAIL PREDICTION USING MACHINE

LEARNING
1
B. Uday Reddy, 2S. Nagasai Tej , 3Md. Shoheb, 4Dr. Krishna Samalla, 5Y. Sreenivasulu
1
Student, 2Student, 3Student, 4Professor, 5Associate Professor
1,2,3,4,5
Department of Electronics and Communication
1,2,3,4,5
Sreenidhi Institute of Science and Technology, Ghatkesar, Hyderabad, India

Abstract: Online reviews are often the primary factor in a customer’s decision to purchase a product or service, and are a valuable
source of information that can be used to determine public opinion on these products or services. Because of their impact,
manufacturers and retailers are highly concerned with customer feedback and reviews. This practice is known as Opinion (Review)
Spam, where spammers manipulate and poison reviews (i.e., making fake, untruthful, or deceptive reviews) for profit or gain. Since
not all online reviews are truthful and trustworthy, it is important to develop techniques for detecting review spam. By extracting
meaningful features from the text using Natural Language Processing (NLP), it is possible to conduct review spam detection using
various machine learning techniques. The majority of current research has focused on supervised learning methods, which require
labelled data, a scarcity when it comes to online review spam. Research on methods for Big Data are of interest, since there are
millions of online reviews, with many more being generated daily.

Keywords : Spam, Natural Processing Language (NLP), Ham, Big Data

I. INTRODUCTION :
Email is the worldwide use of communication applications. It is because of the ease of use and faster than other communication
applications. However, its inability to detect whether the mail content is either spam or ham degrades its performance. Nowadays, a
lot of cases have been reported regarding stealing of personal information or phishing activities via email from the user. This project
will discuss how machine learning helps in spam detection. Machine learning is an artificial intelligence application that provides the
ability to automatically learn and improve data without being explicitly programmed. Binary classifier will be used to classify the text
into two different categories; spam and ham.

II. LITERATURE SURVEY

Today, spam has become a big internet issue. In 2017, the statistics showed spam accounted for 55% of all e-mail messages, same
as during the previous year. Spam which is also known as unsolicited bulk email has led to the increasing use of email as email
provides the perfect ways to send the unwanted advertisement or junk newsgroup posting at no cost for the sender. This chance has
been extensively exploited by irresponsible organizations and resulting to clutter the mail boxes of millions of people all around
the world. Evolving from a minor to major concern, given the high offensive content of messages, spam is a waste of time. It also
consumed a lot of storage space and communication bandwidth. End user is at risk of deleting legitimate mail by mistake. Moreover,
spam also impacted the economy which led some countries to adopt legislation. Text classification is used to determine the path of
incoming mail/message either into the inbox or straight to the spam folder. It is the process of assigning categories to text according
to its content. It is used to organize, structure and categorize text. It can be done either manually or automatically. Machine learning
automatically classifies the text in a much faster way than manual technique. Machine learning uses pre-labelled text to learn the
different associations between pieces of text and its output. It used feature extraction to transform each text to numerical
representation in the form of a vector which represents the frequency of word in a predefined dictionary

PROBLEM STATEMENT:
“To study on how to use machine learning techniques for spam detection, modify machine learning algorithms in computer system
settings, leverage modified machine learning algorithms in knowledge analysis software and to test the machine learning algorithm
real data from a machine learning data repository"

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M114

www.ijcrt.org © 2023 IJCRT | Volume 11, Issue 5 May 2023 | ISSN: 2320-2882
III. PROPOSED SYSTEM
Data Processing and Methodology
a. As for data model, it refers to the documenting a complex system and data flow between different data elements and design as
an easily understood diagram using text and symbols. The data flow below shows how the data flow of these projects in order
to detect the spam messages and classify them into two separate types which are spam and ham message. The data model flow
is essential to this project to show the structure of the project on how it should be built and how the process is related to each
other. It helps to organize the process of the project smoothly and clearly. Based on the framework, Azure ML Studio is used as
the platform to develop the project. First, study and discover all the functionality on Azure ML to make sure the project can
achieve its objectives. After that, make sure to download and use the real existing dataset from the machine learning data
repository as training and testing data. Preprocessing data starts with reformatting the dataset into 2 separate files which are
training.csv and testing.csv format. Then, upload the formatted dataset into Azure ML under dataset function/menu and drag
them onto the workspace to visualize the data. Choose any desired filters to clean the raw data such as “remove numbers'' filter.
In feature extraction, we will transform the data so that it can be used to train the classifier by using the Vowpal Wabbit
algorithm. First, use the feature hashing step to change the message hashing bit size. This step is important to extract all pairs
of bigrams, compute an 8 bit hash for each bigram and create a new column for each hast in the output dataset. 22 Model training
step include 2 steps which is picking a classifier and scoring the classifier. Two-Class Logistic Regression is used to predict the
probability of spam detection either it is spam or ham. After the data have been trained, the model needs to be tested to evaluate
its accuracy and overstrain the model so that it memorizes the data. Web service is set up so that the model can be used. First,
select only the message column by using Select Columns in the Dataset step so that the data can be tested in the web service.
b. The data model flow is essential to this project to show the structure of the project on how it should be built and how
the process is related to each other. It helps to organize the process of a project smoothly and clearly. Based on the
framework, Azure ML Studio is used as the platform to develop the project. First, study and discover all the
functionality on Azure ML to make sure the project can achieve its objectives.

The first stage is to import the dataset, which is downloaded from ‘Kaggle’ and then converted to CSV format in Urdu. The dataset
containing 5000 emails were already classified as spam and ham. The data was obtained while being written in the English Language.
As explained before, unlike the usual approach, we have translated the data set into Urdu, to achieve our goal of Urdu spam e-mail
detection. We created our own dataset, because Roman Urdu is written using English alphabets, and Urdu script is based on Arabic
alphabets. Urdu scripts are distinct from Roman Urdu scripts.
Being a critical phase of preprocessing, in this step, all the words from emails are gathered, and the number of times each word
appears and location of appearance are counted. With the aid of Count Vectorizer, we were able to find the repetition of words in
our dataset. Each word is given a unique number, and hence, they are called tokens, also depicting their occurrences and quantity of
occurrences. The token includes one of a kind feature values that will later help in the creation of feature vectors. In a tokenization
phase, every word is assigned a unique token.

IV . METHODOLOGY
Process model is a series of steps, concise description and decisions involved in order to complete the project implementation.
In order to finish the project within the time given, the flows of the project need to be followed. The framework below shows
the overall flow of this project in order to separate between a spam and ham message.
Methodology is an important role as a guide for this project to make sure it is on the right path and working as well as plan.
There are different types of methodology used in order to do spam detection and filtering. So, it is important to choose the right
and suitable methodology thus it is necessary to understand the application functionality itself.
IMPLEMENTATION AND CODING PHASE
This project is developed by using Python Language and combining with the Vowpal Wabbit algorithm. Azure machine learning
studio is the platform to develop the project. It contains an important function for preprocessing the dataset. Then, the dataset is
going to be used to train and test whether the model of machine learning will achieve the objectives.
3.3 PROJECT REQUIREMENT AND SPECIFICATION
System requirement is needed in order to accomplish the project goals and objectives and to assist in development of the project
that involves the usage of hardware and software. Each of these requirements is related to each other to make sure that system
can be done smoothly.

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M115

www.ijcrt.org © 2023 IJCRT | Volume 11, Issue 5 May 2023 | ISSN: 2320-2882
Step 1. Pick a random mail from the collection for testing purposes.
Step 2. The e-mail in question is in its unprocessed state. E-mail must be preprocessed before the feature extraction and classification
procedure can begin. Tokenization, stemming, and stop word elimination are all steps in the preprocessing process:
(1)To begin, split down the e-mail into distinct words and tokenize it. Tokenization separates each word into its own token.
(2)Eliminate all punctuation marks from the characters you obtained through tokenization.
(3)Stemming is done with the tokens earned in the previous stage. The stemming process decreases the size of a word to its base word.
For stemming, a predetermined range of available words is examined, as well as the irrespective stem words.
(4)For stemming, a list of suffixes keywords is maintained in an array with their base words.
(5)Check to see whether there are any tokens available in the base input text.
(6)Stem the phrase to the proper base word from the array list if the test token’s suffixes are true.
(7)Otherwise, stemming is unnecessary. Word has already been converted to its root word format. Therefore, proceed to the next
token.
Step3. To use the feature extraction technique, select suitable attribute words from the validation set. Just the set of features that is
most nearly connected to the category is selected.
Step4. Use extracted features and created tokens to train ML and DL models. That model can easily distinguish between spam and
ham emails.
Step5. Tokens are classified as spam or ham based on their feature similarity as ML models determine.
Step 6. Finally, the likelihood of spam or ham tokens in a sentence is evaluated for final classification:
(1)The mail is regarded spam if the significance level of spam tokens is higher than zero
(2)Otherwise, e-mail is regarded as ham e-mail
Step7. Mark the e-mail as spam or ham and proceed with the rest of the emails.

ALGORITHM EMPLOYED :
To get started, first, run the code below:

spam = pd.read_csv('spam.csv')
In the code above, we created a spam.csv file, which we’ll turn into a data frame and save to our folder spam. A data
frame is a structure that aligns data in a tabular fashion in rows and columns, like the one seen in the following image.

Run the command below:

z = spam['EmailText']
y = spam["Label"]
z_train, z_test,y_train, y_test = train_test_split(z,y,test_size = 0.2)
z = spam['EmailText'] assigns the column EmailText from spam to z. It contains the data that we’ll run through the
model. y = spam["Label"] assigns the column Label from spam to y, telling the model to correct the answer. You can see a
screenshot of the raw dataset below.
The function z_train, z_test,y_train, y_test = train_test_split(z,y,test_size = 0.2) divides columns z and y into z_train for
training inputs, y_train for training labels, z_test for testing inputs, and y_test for testing labels.
test_size=0.2 sets the testing set to 20 percent of z and y. You can see an example of this in the screenshot below, where
the ham label indicates non-spam emails, and spam represents known spam emails:

For example, in the image above, the word corresponding to 1841 is used twice in email number 0.
Now, our machine learning model will be able to predict spam emails based on the number of occurrences of certain
words that are common in spam emails.
IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M116
www.ijcrt.org © 2023 IJCRT | Volume 11, Issue 5 May 2023 | ISSN: 2320-2882
Building the model
SVM, the support vector machine algorithm, is a linear model for classification and regression. The idea of SVM is
simple, the algorithm creates a line, or a hyperplane, which separates the data into classes. SVM can solve both linear
and non-linear problems
Let’s create an SVM model with the code below:
model = svm.SVC()
model.fit(features,y_train)
model = svm.SVC() assigns svm.SVC()
In the model.fit(features,y_train) function, model.fit trains the model with features and y_train. Then, it checks the
prediction against the y_train label and adjusts its parameters until it reaches the highest possible accuracy.
Testing our email spam detector
Now, to ensure accuracy, let’s test our application. Run the code below:
features_test = cv.transform(z_test)
print("Accuracy:{}".format(model.score(features_test,y_test)))
The features_test = cv.transform(z_test) function makes predictions from z_test that will go through count
vectorization. It saves the results to the features_test file.
print(model.score(features_test,y_test)) function, mode.score() scores the prediction of features_test against the
actual labels in y_test.
The full script for this project is below:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn import svm

spam=pd.read_csv('C:\\Users\\nethm\\Downloads\\spam.csv')
z = spam['EmailText']
y = spam["Label"]
z_train, z_test,y_train, y_test = train_test_split(z,y,test_size = 0.2)
cv = CountVectorizer()
features = cv.fit_transform(z_train)

model = svm.SVC()
model.fit(features,y_train)

features_test = cv.transform(z_test)
print(model.score(features_test,y_test))

Implementation and Results:

In this section, the programming language, libraries, implementation platform along with the data modeling and the observations
and results obtained from it are discussed.

Implementation Platform and Language :

ML models (i.e., SVM and Naive Bayes) are used to calculate evaluation parameters such as precision, recall, and f-measures,
which are described in Table 5. In terms of recall and f-measures, we found that Naive Bayes is more successful and produces
better results; however, SVM produces the highest precision percentage when compared to Naive Bayes. The results of the
comparative analysis provided in Table 5 show that Naive Bayes achieves better results in terms of recall and f-measure, while
SVM achieves better results with respect to precision.

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M117

In the mentioned table we have compared the accuracy of four different ML and DL models. We can see that the DL model
(LSTM) is the most accurate among all the models, but it takes a long time to train. ML models like SVM and Naive Bayes are
around the same accuracy percentage lower than LSTM/CNN, which is also a DL model and has the lowest accuracy
percentage. Below figure shows accuracy comparison of ML and DL models.

The following graphs (Figures 15 and 16) demonstrate the model loss for CNN and LSTM for each epoch. The graph line is
obviously decreasing as the epochs increase, as can be seen. When the number of epochs is increased, the model loss rate
decreases.

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M118

This depicts that DL models are particularly good at detecting and classifying Urdu spam emails, as they produce more accurate
and precise detection.
In this study, we used existing models for detection of Urdu spam emails, and more training and better detection were also
explained for SVM, Naive Bayes, CNN, and LSTM. Furthermore, the accuracy of each model was calculated, and evaluation
measures such as precision, recall, and f-measure for SVM and Naive Bayes and for CNN and LSTM as well as the measures
ROC-AUC and model loss were used for comparative evaluation. According to the findings, the LSTM model obtained higher
accuracy than the other models with a score of 98.4%.

Conclusion and Future Scope:

From this project, it can be concluded that Microsoft Azure Machine Learning Studio is a cloud collaborative tool which has
capabilities to predict analytics solutions on particular data. This research has leveraged the Azure Machine learning by
modifying Vowpal Wabbit algorithm in order to detect spam. The classification model and score weights based on words used
will determine the spam.

The performance of a classification technique is affected by the quality of data source. Irrelevant and redundant features of data
not only increase the elapsed time, but also may reduce the accuracy of detection. Each algorithm has its own advantages and
disadvantages. As stated before, supervised ML is able to separate messages and classify the correct categories efficiently. It is
also able to score the model and weigh them successfully. For instance, Gmail’s interface is using the algorithm based on a
machine learning program to keep their users’ inbox free of spam messages. During the implementation, only text (messages)
can be classified and scored instead of domain name and email address. This project only focuses on filtering, analysing and
classifying messages and does not block them. Hence, the proposed methodology may be adopted to overcome the flaws of the
existing spam detection.

With the increased usage of emails, this study focuses on using automated ways to detect spam emails written in Urdu. The study
uses various machine learning and deep learning algorithms to detect them. In the study, a translated emails dataset including
spam and ham emails is generated from Kaggle, which is preprocessed for various approaches. Accuracy, precision, recall, F-
measure, ROC-AUC, and model loss are used as comparative measures to examine performance. The study concludes that deep
learning models are more successful in classifying Urdu spam emails. Comparatively, the LSTM algorithm has a high accuracy
rate of around 98% with a low model loss rate of 5%. Even though LSTM takes a little longer to train than CNN, SVM, or Naive
Bayes, its efficiency and accuracy rate are far better than those of the other approaches. The creation of an actual dataset of Urdu
emails can be considered as a viable future task. In addition, more recent artificial intelligent approaches may also be considered
to detect spams.

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M119

1.N. Kumar, S. Sonowal, and Nishant, “Email spam detection using machine learning algorithms,” in Proceedings of the 2020
Second International Conference on Inventive Research in Computing Applications (ICIRCA), pp. 108–113, IEEE,
Coimbatore, India, July 2020.

2. G. Jain, M. Sharma, and B. Agarwal, “Optimizing semantic lstm for spam detection,” International Journal of Information
Technology, vol. 11, no. 2, pp. 239–250, 2019.

3. F. Masood, G. Ammad, A. Almogren et al., “Spammer detection and fake user identification on social networks,” IEEE
Access, vol. 7, pp. 68140–68152, 2019.

4. A. Akhtar, G. R. Tahir, and K. Shakeel, “A mechanism to detect Urdu spam emails,” in Proceedings of the 2017 IEEE 8th
Annual Ubiquitous Computing, Electronics and Mobile Communication Conference (UEMCON), pp. 168–172, IEEE, New
York, NY, USA, Oct 2017.

5. H. Drucker, D. Donghui Wu, and V. N. Vapnik, “Support vector machines for spam categorization,” IEEE Transactions on
Neural Networks, vol. 10, no. 5, pp. 1048–1054, 1999.

6. H. Afzal and K. Mehmood, “Spam filtering of bi-lingual tweets using machine learning,” in Proceedings of the 2016 18th
International Conference on Advanced Communication Technology (ICACT), pp. 710–714, IEEE, PyeongChang, Korea
(South), Feb 2016.

7. S. K. Tuteja and N. Bogiri, “Email spam filtering using bpnn classification algorithm,” in Proceedings of the 2016
International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT), pp. 915–919, IEEE,
Pune, India, Sep 2016.

8. M. Mohamad and A. Selamat, “An evaluation on the efficiency of hybrid feature selection in spam email classification,” in
Proceedings of the 2015 International Conference on Computer, Communications, and Control Technology (I4CT), pp.
227–231, IEEE, Kuching, Malaysia, Apr 2015.

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M120

Cohesity ServiceNow Integration-User Guide
No ratings yet
Cohesity ServiceNow Integration-User Guide
29 pages
TM-1812 AVEVA™ E3D Design (2.1) Structural Modelling Rev 4.0
100% (2)
TM-1812 AVEVA™ E3D Design (2.1) Structural Modelling Rev 4.0
208 pages
Lift Traffic Design Spreadsheet - All Peaks
No ratings yet
Lift Traffic Design Spreadsheet - All Peaks
8 pages
Selenium Java Interview Questions
100% (5)
Selenium Java Interview Questions
22 pages
Kriti - Report FINAL
No ratings yet
Kriti - Report FINAL
11 pages
Report (1) 1
No ratings yet
Report (1) 1
35 pages
Aryan Blackbook 1
No ratings yet
Aryan Blackbook 1
29 pages
Final Report Spam Classifier
No ratings yet
Final Report Spam Classifier
24 pages
Second Progress Report
No ratings yet
Second Progress Report
17 pages
Spam Filter - Machine Learning
No ratings yet
Spam Filter - Machine Learning
25 pages
Spam Email Classifier
No ratings yet
Spam Email Classifier
17 pages
Presentation 3
No ratings yet
Presentation 3
13 pages
0 - Spam Mail Prediction
No ratings yet
0 - Spam Mail Prediction
29 pages
EMAIL+SPAM+DETECTION Final Fishries++ (2658+to+2664) - 1
No ratings yet
EMAIL+SPAM+DETECTION Final Fishries++ (2658+to+2664) - 1
7 pages
Email Spam Detection PPT Github
No ratings yet
Email Spam Detection PPT Github
11 pages
Real Time Spam Detection
No ratings yet
Real Time Spam Detection
65 pages
Final Report (Saie)
No ratings yet
Final Report (Saie)
38 pages
Spam Detection Synopsis
No ratings yet
Spam Detection Synopsis
8 pages
Zoom
No ratings yet
Zoom
20 pages
ML Lab
No ratings yet
ML Lab
13 pages
FICE Project Report Spam
No ratings yet
FICE Project Report Spam
14 pages
Email Spam Detection
No ratings yet
Email Spam Detection
8 pages
E-Mail Spam Detection
No ratings yet
E-Mail Spam Detection
8 pages
VBK23 Cse 041
No ratings yet
VBK23 Cse 041
6 pages
IJRPR8167
No ratings yet
IJRPR8167
7 pages
Email Spam Detection Using Machine Learning
No ratings yet
Email Spam Detection Using Machine Learning
2 pages
Synopsis Email Spam
No ratings yet
Synopsis Email Spam
9 pages
Major-Final Research Paper
No ratings yet
Major-Final Research Paper
3 pages
Pruthviraj Micor Foml
No ratings yet
Pruthviraj Micor Foml
26 pages
1822 B Deleted Merged Cropped
No ratings yet
1822 B Deleted Merged Cropped
40 pages
Email Classification Using Machine Learning
No ratings yet
Email Classification Using Machine Learning
22 pages
Review 2
100% (1)
Review 2
29 pages
Spam Mail Classifier
No ratings yet
Spam Mail Classifier
8 pages
$RVJ44FQ
No ratings yet
$RVJ44FQ
13 pages
Vishal FOML Micro Project Vishal & Milan
No ratings yet
Vishal FOML Micro Project Vishal & Milan
26 pages
Spam Detection & Classification Final
No ratings yet
Spam Detection & Classification Final
38 pages
Amrit Science Campus: Submitted by
No ratings yet
Amrit Science Campus: Submitted by
35 pages
Final PPT
No ratings yet
Final PPT
18 pages
B. Flowchart of The Model: Esult
No ratings yet
B. Flowchart of The Model: Esult
3 pages
Ijirt156181 Paper
No ratings yet
Ijirt156181 Paper
5 pages
44 Decision Tree Model For Email Classification
No ratings yet
44 Decision Tree Model For Email Classification
4 pages
Anti Spam
No ratings yet
Anti Spam
26 pages
Spam Mail Detection Using Machine Learning
No ratings yet
Spam Mail Detection Using Machine Learning
5 pages
2020CSEPID63 - Spam Alert System Synopsis Final
No ratings yet
2020CSEPID63 - Spam Alert System Synopsis Final
12 pages
Research Article On The Forensic
No ratings yet
Research Article On The Forensic
14 pages
NLP Report
No ratings yet
NLP Report
19 pages
Data Science Report
No ratings yet
Data Science Report
33 pages
Spam Filter Project Report Logistic Regression
No ratings yet
Spam Filter Project Report Logistic Regression
10 pages
Abhishek Mini Proj . File
No ratings yet
Abhishek Mini Proj . File
19 pages
Maths Answers
No ratings yet
Maths Answers
4 pages
Spam Detection
No ratings yet
Spam Detection
39 pages
Ass 3
No ratings yet
Ass 3
2 pages
Irjet V9i11154
No ratings yet
Irjet V9i11154
4 pages
Project Report Emaildetection 4 44
No ratings yet
Project Report Emaildetection 4 44
41 pages
Decision Tree Model For Email Classification: Ivana Čavor
No ratings yet
Decision Tree Model For Email Classification: Ivana Čavor
4 pages
Spam Email Classifier - Ramsanjay
No ratings yet
Spam Email Classifier - Ramsanjay
2 pages
Spam Message Classification: RTRP Review-1
No ratings yet
Spam Message Classification: RTRP Review-1
12 pages
Email Spam Filtering Using Machine Learning.1
No ratings yet
Email Spam Filtering Using Machine Learning.1
16 pages
E-Mail Spam Detection by Using NLP and Naïve Bayes Classification Through Machine Learning
No ratings yet
E-Mail Spam Detection by Using NLP and Naïve Bayes Classification Through Machine Learning
5 pages
Enhancing Email Security With Naïve Bayes Spam Detection - Docx Fully Edited
No ratings yet
Enhancing Email Security With Naïve Bayes Spam Detection - Docx Fully Edited
64 pages
Spam Email. Classifier
No ratings yet
Spam Email. Classifier
16 pages
Email Spam Detection Edited
No ratings yet
Email Spam Detection Edited
30 pages
Evaluation and Comparison of Machine Learning Models For Ham and Spam Email Classification
No ratings yet
Evaluation and Comparison of Machine Learning Models For Ham and Spam Email Classification
13 pages
Active Machine Learning with Python: Refine and elevate data quality over quantity with active learning
From Everand
Active Machine Learning with Python: Refine and elevate data quality over quantity with active learning
Margaux Masson-Forsythe
No ratings yet
BIAS FX 2 Mobile Product Manual
No ratings yet
BIAS FX 2 Mobile Product Manual
42 pages
AkarshKA MScASA-1
No ratings yet
AkarshKA MScASA-1
1 page
AD8232 Heart Rate Monitor Hookup Guide: Available Online at
No ratings yet
AD8232 Heart Rate Monitor Hookup Guide: Available Online at
15 pages
Software Requirements Specification
No ratings yet
Software Requirements Specification
5 pages
Capabilidade 501220 - (Doosan5024)
No ratings yet
Capabilidade 501220 - (Doosan5024)
6 pages
Option - 1 Option - 2 Option - 3 Option - 4 Correct Answer Marks
No ratings yet
Option - 1 Option - 2 Option - 3 Option - 4 Correct Answer Marks
4 pages
CO COMP9313 1 2024 Term2 T2 InPerson Standard Kensington
No ratings yet
CO COMP9313 1 2024 Term2 T2 InPerson Standard Kensington
10 pages
Civil Site Design V 1700
No ratings yet
Civil Site Design V 1700
6 pages
Iota Bits Mid-II
No ratings yet
Iota Bits Mid-II
18 pages
Alvin
No ratings yet
Alvin
25 pages
Beginners Simple Enhancement For SE38: Applies To
No ratings yet
Beginners Simple Enhancement For SE38: Applies To
16 pages
770 Resource Scheduler
No ratings yet
770 Resource Scheduler
8 pages
Loan Eligibility Service: Important Instructions
No ratings yet
Loan Eligibility Service: Important Instructions
4 pages
ALV in ABAP Using OOPS Concept
No ratings yet
ALV in ABAP Using OOPS Concept
11 pages
GRX 3
No ratings yet
GRX 3
2 pages
Sujan CV
No ratings yet
Sujan CV
4 pages
Holistic App Testing Strategies Ebook
No ratings yet
Holistic App Testing Strategies Ebook
12 pages
Drawing Realistic Ipad2 - Photoshop Tutorial: Step 1: Ipad Basic Shape
No ratings yet
Drawing Realistic Ipad2 - Photoshop Tutorial: Step 1: Ipad Basic Shape
54 pages
A Reinforcement Learning Approach To Job-Shop Scheduling: Wei Zhang Thomas G. Dietterich
No ratings yet
A Reinforcement Learning Approach To Job-Shop Scheduling: Wei Zhang Thomas G. Dietterich
7 pages
II Sem
No ratings yet
II Sem
11 pages
Storage Lab
No ratings yet
Storage Lab
4 pages
Zach Leach Portfolio
No ratings yet
Zach Leach Portfolio
11 pages
Midterm Project Report
No ratings yet
Midterm Project Report
39 pages
Step by Step Manual Upgrade Oracle Database From 12c To 19c - DBsGuru
100% (1)
Step by Step Manual Upgrade Oracle Database From 12c To 19c - DBsGuru
47 pages
WAF Product Brief
No ratings yet
WAF Product Brief
4 pages
Billdesk PG Interface Specs-Libord Brokerage Private Limited
No ratings yet
Billdesk PG Interface Specs-Libord Brokerage Private Limited
11 pages

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

IJCRT23A5429

Uploaded by

IJCRT23A5429

Uploaded by

www.ijcrt.

SPAM MAIL PREDICTION USING MACHINE

Keywords : Spam, Natural Processing Language (NLP), Ham, Big Data

II. LITERATURE SURVEY

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M114

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M115

Run the command below:

Implementation and Results:

Implementation Platform and Language :

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M117

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M118

Conclusion and Future Scope:

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M119

IJCRT23A5429 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org M120

You might also like

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.