Email Spam Filtering
Caution!
“SPAM” (Spiced Ham) is a popular
American canned meat brand…
12/18/24 Email Spam Filtering - Muthiyalu Jothir 3
Problem
With a tiny investment, a spammer can send over
100,000 bulk emails per hour.
Information about
interested customers
• etc.
complaints
Spammers kicked off
Disadvantage
Disguised Spammers.
email headers
Filters that Fight Back (FFB)
The majority of spam messages contain links to web pages.
Spam filters could automatically retrieve those URLs and crawl the
linked pages, which would increase the load on the spammer's server.
If all spam recipients did this at the same time, the server
might crash, and so the cost of spamming would increase.
Caution!
Disadvantage
This technique is rude
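The first step of the idea above, collecting the links found in a message, can be sketched as follows. This is only an illustration under stated assumptions: the regular expression and the function name `extract_links` are hypothetical, and the sketch deliberately stops at extraction and does not fetch anything.

```python
import re

# Hypothetical pattern: an http/https link runs until whitespace or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>]+", re.IGNORECASE)


def extract_links(message_body: str) -> list:
    """Return the unique links in a message body, in order of first appearance."""
    seen = set()
    links = []
    for match in URL_PATTERN.findall(message_body):
        # Drop punctuation that belongs to the sentence rather than the URL.
        url = match.rstrip(".,;:!?)")
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


body = "Buy now at http://spam.example/offer! Or https://spam.example/x, http://spam.example/offer"
print(extract_links(body))
```

Keeping extraction separate from any retrieval step makes it easy to apply rate limits or to skip retrieval entirely, given the caution noted above.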
Use blacklists and whitelists for first-level filtering
(before applying content checks), not as the only
tool for making a decision.
Disadvantage
Prone to misconfiguration: a legitimate server that was listed
incorrectly may be unable to get itself removed from the list.
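A minimal sketch of list-based first-level filtering, assuming the lists hold lowercase sender addresses or domains. The function name `first_level_filter` and the three return labels are hypothetical; anything not on either list is passed on to content checks rather than decided here.

```python
def first_level_filter(sender: str, whitelist: set, blacklist: set) -> str:
    """Return 'accept', 'reject', or 'check_content' for a sender address."""
    address = sender.strip().lower()
    domain = address.rsplit("@", 1)[-1]
    # Whitelist is consulted first so a trusted sender is never rejected by the lists.
    if address in whitelist or domain in whitelist:
        return "accept"
    if address in blacklist or domain in blacklist:
        return "reject"
    # The lists alone do not decide: unknown senders go to content checks.
    return "check_content"


whitelist = {"colleague@example.org"}
blacklist = {"bulkmail.example"}
print(first_level_filter("Colleague@Example.org", whitelist, blacklist))
print(first_level_filter("promo@bulkmail.example", whitelist, blacklist))
print(first_level_filter("stranger@example.net", whitelist, blacklist))
```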
Rules could be
• words and phrases
• lots of uppercase characters
• exclamation points
• special characters
• Web links
• HTML messages
• background colors
• crazy Subject lines etc.
Example rules
• header __LOCAL_FROM_NEWS From =~ /news@example\.com/i
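The rule categories listed above can be sketched as a small weighted rule set. Everything specific here is an assumption for illustration: the rule names, weights, phrases, and thresholds are hypothetical and are not taken from any particular filter.

```python
import re


def uppercase_ratio(text: str) -> float:
    """Fraction of alphabetic characters that are uppercase."""
    letters = [c for c in text if c.isalpha()]
    return sum(c.isupper() for c in letters) / len(letters) if letters else 0.0


# Each rule: (name, weight, test taking the headers dict and the body text).
RULES = [
    # Header rule mirroring the example above; recorded but not scored in this sketch.
    ("LOCAL_FROM_NEWS", 0.0,
     lambda h, b: re.search(r"news@example\.com", h.get("From", ""), re.I) is not None),
    ("SPAM_PHRASE", 2.0,
     lambda h, b: re.search(r"free money|act now", b, re.I) is not None),
    ("MANY_UPPERCASE", 1.0, lambda h, b: uppercase_ratio(b) > 0.5),
    ("MANY_EXCLAMATIONS", 1.0, lambda h, b: b.count("!") >= 3),
    ("HAS_WEB_LINK", 0.5, lambda h, b: re.search(r"https?://", b, re.I) is not None),
]


def score_message(headers: dict, body: str):
    """Return (total score, names of matched rules) for one message."""
    hits = [(name, weight) for name, weight, test in RULES if test(headers, body)]
    return sum(w for _, w in hits), [n for n, _ in hits]


score, matched = score_message({"From": "News@Example.com"}, "FREE MONEY!!! ACT NOW")
print(score, matched)
```

A message would then be flagged when its score exceeds some chosen threshold; picking that threshold and the weights is exactly where static rules become hard to maintain.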
Disadvantage
Static rules are too general
Spammers find new ways to evade the
rules
Validation
Implementation
Pr (words)
Caution!
Take care not to overtrain the network.
E.g.
Original Filter = {“Share Market”, “Higher Studies”}
Received filter = {“Share Market”, “Job Alerts”}
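One way to compare the two filters above is a set-overlap score. The slides do not say which similarity measure is used, so the Jaccard index below is a swapped-in assumption chosen only to make the example concrete; the function name is hypothetical.

```python
def jaccard_similarity(filter_a, filter_b) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(filter_a), set(filter_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty filters count as identical
    return len(a & b) / len(a | b)


original_filter = {"Share Market", "Higher Studies"}
received_filter = {"Share Market", "Job Alerts"}
print(jaccard_similarity(original_filter, received_filter))
```

The two filters share one topic out of three distinct topics, so the score is one third. Exact string matching treats related topics as unrelated, which is one reason better similarity measures are worth exploring.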
Future work:
Learning techniques for Filter Selector
Better Similarity measures