Spammer Detection and Fake User Identification On Social Networks
(UGC Care Group I Listed Journal) Vol-10 Issue-5 No. 6 May 2020
N.Kesava Rao
Associate Professor
Department of Computer Science & Engineering, Narayana Engineering College, Gudur
kesavarn@gmail.com
V.Srinath Reddy
Department of Computer Science & Engineering, Narayana Engineering College, Gudur
srinathr389@gmail.com
K.Nikhil
Department of Computer Science & Engineering, Narayana Engineering College, Gudur
nikhil70977@gmail.com
ABSTRACT
Social networking sites have millions of users worldwide. Users' interaction with these social
networking sites, such as Twitter and Facebook, has a profound and sometimes negative
effect on daily life. Large social networking sites have become a platform for spammers to
disseminate huge amounts of misinformation. For example, Twitter has become one of the most
commonly used platforms and therefore attracts an unreasonable amount of spam. Fake
users send unsolicited tweets to promote services or websites, which not only affects legitimate users
but also interferes with their use of the service. In this paper, Twitter spam detection
techniques are classified based on their ability to detect: (i) fake content, (ii) spam based on URLs, (iii) spam in
trending topics, and (iv) fake users. The presented techniques are compared based on a variety of features,
such as user features, content features, graph features, structure features, and time features. We hope
that this study will be a valuable resource for researchers looking for highlights of recent
developments in Twitter spam detection.
Keywords: classification, fake user detection, online social network, spammer identification.
1. INTRODUCTION
It has become easy to obtain any kind of information from any source worldwide using the
Internet. The increasing demand for social networking sites allows users to gather large amounts of information and
data about other users. The sheer amount of information available on these sites also attracts the attention of
fraudulent users [5]. Twitter has quickly become an online source of real-time information about users.
Twitter is an online social network (OSN) where users can share anything and everything, such as
news, ideas, and experiences. Debates can take place on a variety of topics, such as politics, current
affairs, and important events. When a user tweets something, it is immediately conveyed to his or her
followers, allowing them to spread the information more widely [2]. Most people who do not have
much experience with OSNs can be easily deceived by fraudsters. Identification of fake news [8] on
social media is an issue that needs to be addressed at both the individual and the collective level [1] because of its
negative effects.
An earlier survey presents new methods and techniques for detecting Twitter spam, along with a
comparative study of current approaches. On the other hand, the author of [10] conducted a survey of the
various behaviors exhibited by spammers on Twitter. That study also provides a literature review
that recognizes the presence of spammers on the Twitter social network. Despite all the existing studies,
there is still a gap in the literature. To close this gap, we carry out a thorough review of spammer
detection and fake user identification on Twitter. In addition, this survey presents a taxonomy of
Twitter spam detection approaches and attempts to provide a detailed description of recent developments in
the domain. The purpose of this paper is to identify the different approaches to spam detection on Twitter,
classify them into several categories, and present a taxonomy. For classification, we have
identified four means of reporting spammers that may be helpful in identifying fake user identities.
Spammers can be identified based on: (i) fake content, (ii) URL-based spam detection, (iii) spam
detection in trending topics, and (iv) fake user identification.
3. METHODOLOGY
SPAMMER DETECTION ON TWITTER
In this article, we define a classification of spammer detection techniques. Fig. 1 shows the taxonomy
for identification of spammers on Twitter. The proposed taxonomy is categorized into four main classes,
namely, (i) fake content, (ii) URL-based spam detection, (iii) detecting spam in trending topics, and
(iv) fake user identification.
Each category of identification methods relies on a specific model, technique, and detection algorithm.
The first category (fake content) includes various techniques, such as a regression prediction model, a
malware alerting system, and the Lfun scheme approach. In the second category (URL-based spam
detection), spam is identified in URLs through different machine learning algorithms. In the third
category (spam in trending topics), spam is identified through a Naïve Bayes classifier and language model
divergence. The last category (fake user identification) is based on detecting fake users through hybrid
techniques. Techniques related to each of the spammer identification categories are discussed in the
following subsections.
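To make the idea of language model divergence more concrete, the following is a minimal sketch rather than the method of any cited work. It assumes that each text is modeled as a Laplace-smoothed unigram distribution and that divergence is measured with the Kullback-Leibler (KL) divergence; the function names, the smoothing constant, and the sample texts are all hypothetical.

```python
import math
from collections import Counter

def unigram_model(text, vocab, alpha=1.0):
    """Laplace-smoothed unigram distribution over a fixed vocabulary (assumed model)."""
    counts = Counter(w for w in text.lower().split() if w in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

# Hypothetical sample texts: one trending topic and two tweets posted under it.
topic_text = "election results vote count election debate"
on_topic = "vote count in the election debate"
off_topic = "cheap pills click here cheap offer"

vocab = set(" ".join([topic_text, on_topic, off_topic]).lower().split())
topic_lm = unigram_model(topic_text, vocab)

d_on = kl_divergence(unigram_model(on_topic, vocab), topic_lm)
d_off = kl_divergence(unigram_model(off_topic, vocab), topic_lm)
print(f"on-topic divergence:  {d_on:.3f}")
print(f"off-topic divergence: {d_off:.3f}")
```

In this sketch, a tweet whose wording diverges strongly from the language of the trending topic it is posted under receives a higher score, which could serve as one input feature for a spam classifier.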
(i) retweets,
(ii) hashtags,
(iii) user mentions, and
(iv) number of URLs.
The evaluation results show that changing feature distributions reduced performance, whereas no
differences were observed in training dataset distributions.
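The four quantities listed above (retweets, hashtags, user mentions, and number of URLs) can be computed per tweet. The sketch below is only an illustration under stated assumptions: the regular expressions are simple heuristics, the retweet count is assumed to be supplied by the platform rather than parsed from the text, and the name `tweet_features` is hypothetical.

```python
import re

# Heuristic patterns (assumptions, not an official tweet grammar).
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")

def tweet_features(text, retweet_count=0):
    """Return the four count features for a single tweet."""
    urls = URL_RE.findall(text)
    # Remove URLs first so that '#' or '@' inside a link is not counted.
    text_without_urls = URL_RE.sub(" ", text)
    return {
        "retweets": retweet_count,
        "hashtags": len(HASHTAG_RE.findall(text_without_urls)),
        "mentions": len(MENTION_RE.findall(text_without_urls)),
        "urls": len(urls),
    }

sample = "Win a prize! http://spam.example/a#x @alice @bob #free #win"
print(tweet_features(sample, retweet_count=3))
```

Feature vectors of this kind can then be passed to any classifier; the sketch makes no claim about which features are most discriminative.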
Collection of tweets related to trending topics on Twitter. The tweets are stored in a particular file
format and subsequently analyzed.
Labeling of spam is performed by checking all available datasets to detect malicious URLs.
Feature extraction constructs features based on language models, which use the language
as a tool and help determine whether tweets are fake or not.
Classification of the dataset is performed by shortlisting a set of tweets, described by the set of features
provided to the classifier, to train the model and acquire the knowledge needed to detect spam.
Spam detection uses the classification technique to accept tweets as input and classify them as spam
or non-spam.
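The steps above (collect, label, extract features, train a classifier, detect) can be sketched with a small Naïve Bayes classifier. This is a minimal illustration under stated assumptions and not the implementation used in the surveyed work: the features are plain word counts, the labeled tweets are invented, and the class name `NaiveBayes` is hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Feature extraction step: lowercase bag of words (assumed feature set)."""
    return text.lower().split()

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict(self, text):
        n_docs = sum(self.doc_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            total = sum(self.word_counts[c].values()) + len(self.vocab)
            score = math.log(self.doc_counts[c] / n_docs)  # class prior
            for w in tokenize(text):
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[c][w] + 1) / total)
            if score > best_score:
                best_class, best_score = c, score
        return best_class

# Collection and labeling steps: hypothetical hand-labeled tweets.
labeled = [
    ("win free money now click here", "spam"),
    ("free followers click this link", "spam"),
    ("great match tonight in the final", "non-spam"),
    ("watching the final with friends tonight", "non-spam"),
]
model = NaiveBayes().fit([t for t, _ in labeled], [y for _, y in labeled])

# Detection step: classify unseen tweets.
print(model.predict("click here for free money"))
print(model.predict("the final tonight"))
```

A real deployment would train on a much larger labeled corpus and would likely add the URL, hashtag, and language-model features discussed in this section.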
Graph-based features [9] are used to counter the evasion strategies operated by spammers. Spammers use
various techniques to avoid detection. They can buy fake followers from different third-party websites
and exchange followers with other users in order to look like legitimate users.
Fig-2.Home Page
Modules:
This project contains two modules:
1. User
2. Tweet Server (Admin)
1. User:
If a user already has an account, he can log in with those credentials; otherwise, he has to register a
new account.
A user can register a new account by providing all the details shown in Fig. 3. All the fields are
mandatory in the registration process.
Once the user has registered, he has to wait until the admin (tweet server) authorizes the new account.
Fig-3.Registration Page
After the admin authorizes the user, he can log in with his credentials, as shown in Fig. 4.
Fig-4.Login Page
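The register, authorize, and log in flow described above can be sketched as follows. This is only an illustrative model under stated assumptions and not the project's actual code: the class name `TweetServer`, the field names in `REQUIRED_FIELDS`, and the use of salted PBKDF2 password hashing are all hypothetical choices.

```python
import hashlib
import hmac
import os

REQUIRED_FIELDS = ("username", "password", "email")  # hypothetical field names

def _hash_password(password, salt):
    # Assumed storage scheme: salted PBKDF2-HMAC-SHA256.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

class TweetServer:
    def __init__(self):
        self.accounts = {}  # username -> account record

    def register(self, **details):
        """Create a pending account; every mandatory field must be filled in."""
        missing = [f for f in REQUIRED_FIELDS if not details.get(f)]
        if missing:
            raise ValueError(f"missing mandatory fields: {missing}")
        if details["username"] in self.accounts:
            raise ValueError("username already registered")
        salt = os.urandom(16)
        self.accounts[details["username"]] = {
            "salt": salt,
            "hash": _hash_password(details["password"], salt),
            "email": details["email"],
            "authorized": False,  # stays False until the admin approves
        }

    def authorize(self, username):
        """Admin action: approve a pending account."""
        self.accounts[username]["authorized"] = True

    def login(self, username, password):
        """Succeeds only for authorized accounts with a matching password."""
        account = self.accounts.get(username)
        if account is None or not account["authorized"]:
            return False
        candidate = _hash_password(password, account["salt"])
        return hmac.compare_digest(account["hash"], candidate)

server = TweetServer()
server.register(username="alice", password="s3cret", email="alice@example.com")
print(server.login("alice", "s3cret"))  # False: not yet authorized
server.authorize("alice")
print(server.login("alice", "s3cret"))  # True: authorized, password matches
```

The point of the sketch is the ordering constraint: a correct password alone is not enough until the admin has authorized the account.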
Once the user has successfully logged in, he can perform different operations, such as viewing his profile, searching for
friends, creating tweets, and viewing friends, as shown in Fig. 5.
The user can create tweets by clicking on create tweet, and he can search for friends by name.
The user can search for a friend and send him a request. He then has to wait for the friend's response (accept
or decline).
The user can view his profile by clicking “My Profile”, as shown in Fig. 6.
Fig-6.User profile
Fig-7.Search Friends
Fig-8.Server Login
The server can monitor all the tweets posted by the users, as shown in Fig. 8.
It displays the tweet image, tweet name, tweet description, and the date and time the tweet was posted, as shown
in Fig. 9.
Fig-9.Tweets Posted
The server can authorize user accounts and can see all fake and authorized users.
The admin can filter normal users and fake users based on tweet content. If a tweet contains
any illegal or offensive words, it automatically falls into the offensive section and the user
is marked as suspicious. The fake user identification results and fake tweet
results can also be viewed in the form of bar charts. The chart plots the number of retweets posted against
the user, as shown in Fig. 10.
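The content-based filtering described above can be sketched as a simple word blocklist check. This is a minimal sketch under stated assumptions: the words in `OFFENSIVE_WORDS` are placeholders, the function names are hypothetical, and the actual word list and matching rules used by the project are not given in this paper.

```python
import re

OFFENSIVE_WORDS = {"scam", "fraud", "hate"}  # hypothetical placeholder list

def is_offensive(tweet_text, blocklist=OFFENSIVE_WORDS):
    """True if any whole word of the tweet appears in the blocklist."""
    words = set(re.findall(r"[a-z0-9']+", tweet_text.lower()))
    return not words.isdisjoint(blocklist)

def flag_suspicious_users(tweets, blocklist=OFFENSIVE_WORDS):
    """tweets: iterable of (username, text) pairs. Returns the flagged usernames."""
    return {user for user, text in tweets if is_offensive(text, blocklist)}

posted = [
    ("alice", "Lovely weather today"),
    ("bob", "This is a SCAM, click now"),
]
print(flag_suspicious_users(posted))
```

Matching on whole words avoids flagging harmless words that merely contain a blocked substring, although a keyword list alone will still miss obfuscated spellings.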
4. CONCLUSION
In this paper, we reviewed the techniques used to detect spammers on Twitter. In addition, we
presented a taxonomy of Twitter spam detection approaches and classified them as fake content
detection, URL-based spam detection, spam detection in trending topics, and fake user identification techniques. We
also compared the presented techniques based on several features, such as user features, content features, graph
features, structure features, and time features. In addition, the techniques were compared in terms of their stated
objectives and datasets. It is expected that this review will help researchers find information on ways
to detect spam on Twitter.
REFERENCES
[1] M. Babcock, R. A. V. Cox, S. Kumar, "Diffusion of pro- and anti-false information tweets: The black
panther movie case", Comput. Math. Org. Theory, vol. 25, no. 1, pp. 72-84, Mar. 2019.
[3] C. Chen, Y. Wang, J. Zhang, Y. Xiang, W. Zhou, G. Min, "Statistical features-based real-time detection
of drifted Twitter spam", IEEE Trans. Inf. Forensics Security, vol. 12, pp. 914-925, Apr.
[4] C. Chen, J. Zhang, Y. Xie, Y. Xiang, W. Zhou, M. M. Hassan, A. AlElaiwi, M. Alrubaian, "A
performance evaluation of machine learning-based streaming spam tweets detection", IEEE Trans. Comput.
Social Syst., vol. 2, no. 3, pp. 65-76, Sep. 2015.
[5] B. Erşahin, Ö. Aktaş, D. Kilinç, C. Akyol, "Twitter fake account detection", Proc. Int. Conf. Comput.
Sci. Eng. (UBMK), pp. 388-392, Oct. 2017.
[6] S. Gharge, M. Chavan, "An integrated approach for malicious tweets detection using NLP", Proc. Int.
Conf. Inventive Commun. Comput. Technol. (ICICCT), pp. 435-438, Mar. 2017.
[7] A. Gupta, H. Lamba, P. Kumaraguru, "$1.00 per RT #BostonMarathon #prayforboston: Analyzing fake
content on Twitter", Proc. eCrime Researchers Summit (eCRS), pp. 1-12, 2013.
[8] M. U. S. Khan, M. Ali, A. Abbas, S. U. Khan, A. Y. Zomaya, "Segregating spammers and unsolicited
bloggers from genuine experts on Twitter", IEEE Trans. Dependable Secure Comput., vol. 15, no. 4, pp.
551-560, Jul./Aug. 2018.
[9] M. Mateen, M. A. Iqbal, M. Aleem, M. A. Islam, "A hybrid approach for spam detection for
Twitter", Proc. 14th Int. Bhurban Conf. Appl. Sci. Technol. (IBCAST), pp. 466-471, Jan. 2017.
[10] S. J. Soman, "A survey on behaviors exhibited by spammers in popular social media networks", Proc.
Int. Conf. Circuit Power Comput. Technol. (ICCPCT), pp. 1-6, Mar. 2016.