Proceedings of the Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV 2021).

IEEE Xplore Part Number: CFP21ONG-ART; 978-0-7381-1183-4

Real-Time Hashtag based Event Detection Model with Sentiment Analysis for Recommending user Tweets

2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV) | 978-1-6654-1960-4/20/$31.00 ©2021 IEEE | DOI: 10.1109/ICICV50876.2021.9388426

P M Ashok Kumar1, K.Guru Charan2, G.B V Sai Kumar3, K.Amith4 and K.Sai Krishna5
Department of Computer Science and Engineering
KL Deemed to be University, Vaddeswaram
Guntur District, Andhra Pradesh-522 502, India
profpmashok@gmail.com1, charan.kakaraparthi@gmail.com2, amith.k000@gmail.com3,
bhanuvenkat097@gmail.com4, kittu.kothuru@gmail.com5

Abstract— Recently, the usage of Social Networking Sites (SNS) has increased tremendously. People use online social networking applications like Twitter to express their opinions and feelings about many topics, producing an enormous amount of data every day. With this volume of data, it takes significant effort to read the tweets relevant to a user's preferences. In this paper, we propose a model, “Real Time Hash tag based Event Detection for Tweet Recommendation”, that works at the level of individual posts. The model recommends events at various levels, from the most abstract to the most specific. First, pre-processing techniques such as stop word removal, frequent word removal, stemming, lemmatization, and lower-case conversion are applied. The proposed HashTag Event Detection (HTED) algorithm is then applied to detect the events corresponding to the tweets and to find the sentiment polarity of each user interested in a particular event, which indirectly represents the interest of that user. Finally, the Singular Value Decomposition (SVD) technique is applied to predict the user preference of events in social media streams using hashtags in posts collected from Twitter. Experiments are performed on our dataset of 1 lakh (100,000) tweets extracted from different users. The performance of the HTED system is evaluated in terms of the Receiver Operating Characteristic (ROC) curve and the mean average recall curve. It is observed that the proposed HTED model performs better than the traditional Random Recommendation and Popularity Based Recommendation for users using hashtags.

Keywords- Real-Time Event Detection, Sentiment Analysis, Personalized Recommender System

I. INTRODUCTION

Online sources like Twitter have become an essential part of the social life of many users. In the past few years, they have attracted the attention of business websites, political parties, news agencies, and so on, because of their ability to collect and spread information, to gather user feedback, and to promote and recommend new products, live sports commentary, and so on. This data is useful for several forms of data exploration, such as real-time event detection, sentiment analysis, and recommending hashtags to users. Twitter in particular stands out as a public stage because of its distinctive way of letting users convey numerous events through user postings and user interactions. User posts, i.e., tweets, are often associated with events such as social events. Often, Twitter is ahead of the newswire and reaches users as soon as events unfold. The majority of events typically encompass a sequence of major moments that attract the attention of users.

The abundance and real-time availability of Twitter data have proved beneficial in detecting events in various domains such as emergencies, public health, crime detection, and so on. Nevertheless, two critical challenges occur while detecting events using social media data. First, there is uncertainty in capturing the events of posted tweets, which results from the limited contextual information available in short tweets. Second, event detection has a high computation cost [3].

Twitter offers plenty of data, such as time-stamps, live location, and counts of followers, likes, and dislikes of posts, that has been used over the years to produce application-oriented outcomes. With the rise in real-time data, real-time event detection approaches have become the foremost way of detecting abnormal patterns happening in numerous groups ahead of standard media. In Twitter, hashtags are used within tweets. Hashtags let users form groups of individuals interested in the same events by making it effortless for them to discover and share relevant data. For instance, if a user shares tips and tricks to lower sugar levels, this may be beneficial for patients with heart disease; however, it would not be found by searching for a "Heart Attack" entity. If the user had used the hashtag #Healthytip for the same post, fellow Twitter users would quickly find it [3].

The underlying problem is identifying topics in short text, especially text obtained from microblogging websites. Classification and clustering methods can be used to assign tweets to their relevant events, but this process takes a long time in the training phase because of the volume of data, and if the model is not trained properly, irrelevant events may be assigned to tweets. This is an existing approach, and many researchers still use these methods both to categorize tweets and to increase the accuracy of the algorithm. Since this approach is already well established, tweets are instead classified by event here with a new classification model intended as a remedy for the limitations of the existing one. So, to replace the existing model, a labeling method “Efficient Hashtag Event Labelling Model for Tweet Classification” is developed which

978-1-6654-1960-4/21/$31.00 ©2021 IEEE 1437

Authorized licensed use limited to: San Francisco State Univ. Downloaded on June 18,2021 at 18:09:48 UTC from IEEE Xplore. Restrictions apply.

focuses mainly on the hashtags in a tweet. The proposed model picks and extracts only the hashtags, discarding the unigrams in the tweet; each extracted hashtag is treated as an event and is assigned to the tweet. With this approach, events can be detected in less time, which makes the scheme computationally more feasible [5].

After the relevant events are detected, the next step is sentiment analysis. It supports opinion mining, especially on social networking sites. Sentiment analysis on Twitter posts helps us understand how people react to a specific event, i.e., whether a posted tweet is positive, neutral, or negative, and how opinions can change if something unusual happens [7]. After this module, we know both the events each user is interested in and the polarity score of each posted tweet. Finally, a personalized recommendation system is created for recommending hashtags to active users based on their events of interest and polarity scores [11].

As tagging culture becomes widely adopted, the development of hashtag recommendation systems has gained researchers’ attention. Some recent studies have proposed to recommend predefined hashtags or general topics hidden in each tweet [15]. Though these systems are beneficial in encouraging and assisting users to get into the tagging habit, they may not be sufficient for information seekers who wish to find newly emerging hashtags. In contrast, recommending the most popular hashtags does reflect timely topics, but it often includes heavily used general hashtags, and the suggestions are not personalized. Other studies have proposed recommending hashtags based on similar tweets, but an appropriate hashtag recommender system is still needed to assist users in choosing relevant hashtags for their tweets. Therefore, a hashtag recommender method is presented that encodes the tweet vector-based representation by using SVD.

II. RELATED SURVEY AND WORK

This section gives a brief explanation of the research done in this domain and a brief survey of the range of methods that have been employed in studying the problem. Various papers in the literature focus on event detection models. First, existing methods that detect events from social media data are reviewed, and then our own method, aimed at effective performance, is proposed and framed.

Surekha et al. [2] introduced a Rough Set Theory (RST) based approach for handling inconsistencies in data; its performance was tested by feeding the pre-processed data to a tree-based classifier. The experimental results revealed the significance of data pre-processing.

Shih-Feng Yang and Julia Taylor Rayz [5] proposed an event prediction method that uses hashtags in tweets. They employed the feature extraction in HASHSTREAM for K-Means. To the best of our understanding, this is the first analysis to improve the HASHSTREAM framework by employing various clustering techniques to strengthen the initial HASHSTREAM. They conducted research on their own gathered tweets related to the Paris attack from Nov-13 to Nov-17, 2015 as their dataset. Their experiments offered improved methods for event prediction in social media streams by applying HASHSTREAM and K-Means. As a final outcome, the K-Means approach performed better than HASHSTREAM.

Harshal Kapase et al. [6] proposed “Sentiment analysis on the Twitter data”, which aims to determine whether a tweet is positive, negative, or neutral. This helped them judge whether the situation in a disaster area is critical, so that government bodies and social charities can use the results to take decisions for the sake of the people. Their proposed system brings in Dual Sentiment Analysis (DSA), which supplies their large dataset; as a result, the accuracy of their model, their main concern, increases. Their experiments showed that the model can determine whether a tweet is positive or negative: if most of the tweets about a disaster location are “negative”, the situation in that area is taken to be critical, and government charities can be informed directly about the major problems.

Mayuri D Malas and Madhav V. Vaidya [7] introduced a system focused on mining top and trending Twitter events, investigating user sentiments, and producing an event overview. Their system groups tweets into categories around a specific event hashtag and gives a concise overview for every category. They further determined the user sentiments associated with the events. The user sentiments are divided into positive, neutral, and negative categories, each with its own purpose. Their experiments reported the total number of positive, neutral, and negative tweets, along with the final sentiment analysis result for each event.

Rabia Batool [8] proposed a way to examine tweets for the classification of large data and to investigate tweet sentiments more specifically. Informative data is extracted from tweets using entity-based feature extraction, and the extracted entities are used for tweet classification and sentiment analysis. The framework was tested on a corpus of 40K tweets and performed better than the existing systems. With the entity-based feature enhancer, they obtained an increase in information gain in the range of 0.1% to 55%, which enabled their work to summarize user sentiment with respect to an entity for a specific group.

Shubham [9] proposed a framework to organize tweets into several categories such as sports, news, etc. For the classification of live tweets into numerous categories, they employed the Support Vector Machine and Naïve Bayes algorithms. Their work states that the SVM classifier is more accurate than Naïve Bayes.

Yibo Ren [12] presented a model based on the observation that, with the advancement of e-commerce systems, the numbers of users and items grow quickly, which results in extreme sparsity of the user rating dataset. Conventional similarity measures work poorly in this circumstance, leaving the quality of the recommendation system


diminished significantly. Sparsity of user ratings is the major reason for the low quality. To address this issue, a collaborative filtering recommendation algorithm based on singular value decomposition (SVD) smoothing is introduced. This method uses SVD to predict ratings for the items that users have not rated, then uses Pearson correlation similarity to find the target users' neighbors, and finally generates the recommendations. The collaborative filtering recommendation algorithm based on SVD smoothing can ease the sparsity issues of the user-item rating dataset and can give better recommendations than conventional collaborative filtering algorithms.

Wu Yuan-hong [13] used Singular Value Decomposition (SVD) in combination with hybrid collaborative filtering (CF), which proved to be an excellent answer to the sparsity problem. SVD is used to reduce the size of the user-page view matrix obtained from web usage mining. Both low-rank matrices are then employed to obtain item-based and user-based predictions. A system for building automatic webpage recommendations on real-time platforms is designed. The recommendation engine, which runs in the online phase, receives the user’s request and provides the recommended links in real time, which is beneficial in real-world use.

III. PROPOSED MODEL

This section gives an in-depth explanation of the steps involved in the proposed method. The proposed approach consists of six modules: (1) Extracting the live Tweets (2) Primary Exploratory Data Analysis (EDA) (3) Essential Pre-Processing and Cleaning (4) Classification of Events, i.e., Event Detection (5) Sentiment Analysis of Tweets and (6) Recommender System. The block diagram of the proposed model is given in Figure 1. First, a corpus of tweets is extracted through the Twitter Public API and pre-processed by applying cleaning methods, including conversion of upper case to lower case. Our model is then applied to the data to pick and extract the hashtags within each tweet, and sentiment analysis is performed on the tweets. The events of interest to each user and the polarity scores are then used to recommend hashtag-based events to active users.

Fig 01: Proposed Architecture for Event Detection

Fig 02: System flow for Extracting Live Tweets

A. Merits of Proposed Model:
1) The introduced model is simple to use.
2) The introduced model is scalable to a variety of social networking platforms, provided the data is openly available.
3) The data redundancy of the proposed model is very low, which also increases the efficiency of the proposed system. Hence, the model also manages time effectively.

B. Data Modelling : Extracting Live Tweets
A large collection of data is extracted using the Twitter API. Once the flow of this data is established, further data mining checks can be run. Tweets related to the chosen events are extracted by specifying the result type as ‘very recent’ and using the hashtag as the formal query. Several methods have shown that using hashtags is the most effective way to gather tweets related to a certain hashtag. About 1 lakh tweets were gathered from several users, including repeated tweets, on real-time events. Extraction is done on a per-hashtag basis. Note that the chosen events are diverse, including both human-caused and natural disasters that have taken place in


numerous regions of the world. Hence, the languages of the tweets can also be expected to vary [10].

C. Exploratory Data Analysis and Data Visualization
EDA is a step in the data analysis process in which a variety of techniques are used to better understand the dataset being employed. It is a way of visualizing, summarizing, and interpreting the information that is hidden in the row and column format. Once EDA is complete and insights are drawn, its features may be used for supervised and unsupervised models. Several EDA techniques also remedy common issues that are present in every dataset. The EDA techniques employed in this work are: detecting data shape, distinguishing attributes, analytical statistics, detecting missing values, and tweet/retweet featuring [1].

TABLE 01: Number of Tweets Collected for Each Hashtag
S.No | Hashtag | Tweets collected
1 | Climatechange | 20,512
2 | Globalwarming | 10,537
3 | Football | 5,989
4 | Ipl2020 | 7,543
5 | Attack | 2,607
6 | Makeinindia | 9,962
7 | Nep2020 | 14,073
8 | Fdp | 10,117
9 | Entertainment | 8,690
10 | Politics | 9,970
Total Extracted Tweets | 1 lakh (100,000)

D. Data Pre-Processing and Cleaning Techniques
Data pre-processing is a data mining strategy used to transform the source data into a meaningful format. Twitter data in its source format may contain a great deal of unclean data, which introduces noise throughout the process. To raise the quality of the data for further processing, the dataset is first pre-processed. This step is meant to remove characters and words that would add extraneous information to our analysis [2].

1) Steps incorporated in Data Pre-Processing
a) Cleaning Text: The data can have several unrelated and null values. To deal with these, data cleaning is required. It involves handling missing data, etc.
b) Removing Stop words: Stop words are commonly occurring words that, for some computational processes, carry very little information or in some cases introduce unnecessary noise, and thus must be removed. All terms that are made up of stop words are deleted.
c) Case Sensitivity: Uppercase letters are converted to lowercase letters.

E. Real-Time Hashtag Based Event Detection
Earlier researchers used ML algorithms for classification, but these do not give a high accuracy rate. So, a novel model is designed for classification which selects and picks only the hashtags, removing the unigrams within each tweet. A feature definition of the hashtag was adopted, and an in-house model was developed and named “Efficient Hybrid Event Labeling Model for Tweet Classification”. It is quite similar to the RAKE (Rapid Automatic Keyword Extraction) algorithm, but the operating model and working principles differ: RAKE extracts a bundle of keywords from a tweet, whereas the proposed model focuses only on hashtags, i.e., it extracts only the hashtags. This model helps us classify the tweets according to their respective events, and the scheme is computationally more feasible.

TABLE 2: Hashtag (h) Algorithm
Input: Extracted Tweet Set E(x)
Output: Hashtag(T)
Begin
1. Hashtag Extraction() File
2. While not the end of lines do
3.   Extraction of Hashtag:
4.   For each ‘line’ Extract H(T) = #[Tagging_Text]
5.     Append the extracted hashtag to the equivalent tweet
6.     Ensure that each row has a hashtag that relates to the event tagged in the tweet
7.   End for loop and terminate the loop after the last item
End

We assume that every tweet contains at least one hashtag, i.e., the number of hashtags is ≥ 1 for every T_i, where T_i is the tweet in the i-th position of the data. The Hashtag (h) Algorithm (Table 02) gives an in-depth description and working analysis of the hashtag extraction. The model appends the hashtag (event) to the tweet so that it can be treated as “Tweet (T) belongs to the particular and relevant Event (E)” [5].

F. Effective Sentiment Analysis Using VADER
Sentiment analysis is used to find the sentiment of a user's tweet. Sentiment analysis on Twitter posts helps us understand how people react to a specific event, i.e., whether a posted tweet is positive, neutral, or negative. Sentiment analysis for social networks is one of the most popular and mature research issues in information retrieval. Several corporations already offer services that discover and summarize the opinions of social network users on different topics. Sentiment analysis gives valuable insights from social networking platforms by finding customer opinions or user emotions in a large amount of data that exists in an unorganized format. A technique is proposed that gives statistically significant estimates of sentiment category proportions in social media analytics, particularly in microblog retrieval, i.e., for the result set of tweets on a topic. We have shown that sentiment estimation can be conducted on real-time tweets. User sentiment is evaluated using the Valence Aware Dictionary and sEntiment Reasoner (VADER) package [7]. The entire process is shown in Table 3.
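The cleaning steps of Section III-D (null handling, stop word removal, and lower-casing) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `preprocess`, the tiny stop-word set, and the choice to keep the `#` character so that hashtags survive cleaning are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "for"}


def preprocess(tweet):
    """Return a cleaned tweet string, or None if the tweet is missing or empty."""
    if tweet is None or not tweet.strip():
        return None                                  # a) drop null / empty values
    text = tweet.lower()                             # c) lower-case conversion
    text = re.sub(r"https?://\S+", " ", text)        # strip URLs (assumed cleaning step)
    text = re.sub(r"[^\w#\s]", " ", text)            # drop punctuation but keep '#'
    tokens = [t for t in text.split() if t not in STOP_WORDS]   # b) stop words
    return " ".join(tokens) if tokens else None


print(preprocess("#IPL2020 is Live!"))
```

Keeping `#` during punctuation removal matters here because the later event-detection step depends on hashtags still being recognizable.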

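The hashtag algorithm of Table 2 can be sketched in a few lines. The sketch below assumes a hashtag is `#` followed by word characters, lower-cases the tags, and simply skips tweets without any hashtag; these choices, and the name `label_events`, are illustrative assumptions. It does not merge semantically similar tags.

```python
import re

# Assumed hashtag pattern: '#' followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#(\w+)")


def label_events(tweets):
    """Pair each tweet with the events (hashtags) extracted from it."""
    labelled = []
    for tweet in tweets:
        tags = HASHTAG.findall(tweet)        # step 4: extract H(T) = #[Tagging_Text]
        if not tags:
            continue                         # step 6: every kept row must carry a hashtag
        events = [t.lower() for t in tags]   # lower-casing is an assumption, not from the paper
        labelled.append((tweet, events))     # step 5: append the hashtag(s) to the tweet
    return labelled


for tweet, events in label_events(["#IPL2020 is Live", "no tag here"]):
    print(tweet, "->", events)
```

Because only a single regular-expression pass per tweet is needed and no model is trained, this step is cheap compared with classification or clustering, which is the efficiency argument made in the text.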
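The three-way rule used for the sentiment categories (score above zero is positive, equal to zero is neutral, below zero is negative) can be sketched as below. The function `sentiment_category` is a hypothetical name. The optional helper `vader_compound` assumes the third-party `vaderSentiment` package is installed and uses its compound score as the polarity score; the paper does not state which VADER output it thresholds, so that choice is an assumption.

```python
def sentiment_category(score):
    """Map a polarity score to a category using the >0 / =0 / <0 rule."""
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


def vader_compound(text):
    """Optional helper: polarity score from VADER (requires `pip install vaderSentiment`)."""
    # Imported lazily so the rest of the sketch runs without the package.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer().polarity_scores(text)["compound"]


# Works on any precomputed polarity scores.
for score in (0.6, 0.0, -0.3):
    print(score, sentiment_category(score))
```

Separating the thresholding from the scoring keeps the category rule testable on its own and makes it easy to swap in the NLTK distribution of VADER instead.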

TABLE 3: Sentiment Analysis using VADER Algorithm
Input: File [“Extracted Tweets”]
Output: “Sentiment Polarity Score & Sentiment Category”
Begin
1. Sentiment Analysis() File
2. While not the end of lines do
3.   Calculate Sentiment Polarity Score:
4.   for each ‘line’
5.     If Sentiment Polarity Score(line) > 0 then
6.       Sentiment Positive
7.     else if Sentiment Polarity Score(line) = 0 then
8.       Sentiment Neutral
9.     else if Sentiment Polarity Score(line) < 0 then
10.      Sentiment Negative
11.  end for loop
End

G. Personalized Recommender System
In this module, a simple function is created for the recommendation system. It takes events, users, and sentiment polarity values as inputs and recommends events according to user interest. For instance, if a user has posted tweets related to IPL, the function will recommend tweets related to IPL (Sports) to that user [13].

Fig 03: System flow for Recommender System

There are several common ways of building a recommendation system. The methods considered here are popularity-based, random-based, and Collaborative Filtering (CF).

1) Popularity-Based Recommendation System
This type of recommendation system works on the principle of popularity, i.e., anything that is trending. These systems check which tweets are trending or most popular among users and recommend those directly. They do not suffer from cold start problems, and they can recommend hashtag-based events under various filters. There is no need for the user's historical data. Examples: YouTube trending videos, and Google News filtered by trending and most popular news [11].

2) HTED Recommender System Model
HTED stands for Hashtag Based Event Detection. It is also a recommender system, developed in-house. It works on the similarity between different users and between events that are widely used on Twitter: it checks the tastes of similar users and recommends accordingly. The similarity is not restricted to the taste of the user; the similarity between different events can also be considered. The system gives more efficient recommendations if there is a large volume of information about users and events, so that users have plenty of options for accessing new content. The hashtag recommender system uses a collaborative filtering technique based on Singular Value Decomposition (SVD) and matrix factorization.

The main aim of this technique is to match users' interests with content. For hashtag recommendations in this work, the features User, Events, and Polarity Scores are used to find the similarity between users.

Singular value decomposition, also known as the SVD algorithm, is used as a collaborative filtering method in recommendation systems. SVD is a matrix factorization method that reduces the features in the data by reducing the dimensions from N to K, where K < N.

For the recommendation, the only part that needs to be handled is the matrix factorization of the user-item rating matrix. Matrix factorization finds two matrices whose product is the original matrix. Vectors q_i and p_u are used to represent item i and user u such that their dot product is the expected rating:

Expected rating: $\hat{r}_{ui} = q_i^{T} p_u$

Here q_i and p_u are calculated so that the squared error between the dot product of the user and item vectors and the original ratings in the user-item matrix is minimized [12][14]. The whole process is summarized in Table 4.

TABLE 4: HTED Recommender System Using SVD
Input: File [“Contains: User, Events, Polarity Score”]
Output: “Recommending Hashtag Based Events”
Begin
1. Create a sparse matrix from the given input data
2. Fill null entries with values suitable for the method
3. Apply SVD to the created sparse matrix
4. Compute the energy to find the rank K of the low-rank matrix
5. If K is the optimal solution
6.   Acquire the expected rating (r) from r_K
7. Else repeat from step 4 until an optimal solution is obtained
8. Recommend hashtag-based events based on the K-value
End

3) Random Recommender System
In this kind of recommender system, the frequency of each event in the tweet dataset is calculated, and probability weights are assigned to the discrete events. The traditional inversion technique is used: the cumulative probability of the events is computed, a random number ‘r’ is sampled in the range [0,1], and the inverse image of ‘r’ gives the random event.

IV. EXPERIMENTATION AND RESULTS

This section provides the experimental outcomes of the proposed method. First, the details of the live extracted dataset are provided. Then, the performance of the proposed models is evaluated and compared with that of other state-of-the-art ML techniques.
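Returning to the matrix factorization described for the HTED recommender (expected rating as the dot product of an item vector q_i and a user vector p_u, fitted by minimizing squared error), the idea can be sketched as follows. This is not the authors' SVD procedure: instead of a full SVD with energy-based rank selection, the sketch fits the factor vectors by stochastic gradient descent on the observed entries only, a common stand-in. The function names, hyperparameters, and the use of polarity scores as ratings are assumptions for illustration.

```python
import random


def factorize(observations, n_users, n_events, k=2, lr=0.05, reg=0.02,
              epochs=2000, seed=0):
    """Fit user vectors P and event vectors Q so that dot(P[u], Q[i]) ~ rating."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_events)]
    for _ in range(epochs):
        for u, i, r in observations:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with a small L2 penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Expected rating: dot product of the event vector and the user vector."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


# Hypothetical (user index, event index, polarity score) triples.
observations = [(0, 0, 0.90), (0, 1, 0.60), (1, 0, 0.72),
                (1, 2, -0.32), (2, 1, -0.30), (2, 2, 0.20)]
P, Q = factorize(observations, n_users=3, n_events=3)
# Score for a (user, event) pair that was never observed.
print(round(predict(P, Q, 0, 2), 2))
```

Ranking the unobserved events of a user by `predict` and taking the top entries gives a recommendation list; the rank `k` plays the role of the reduced dimension K in the text.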

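The inversion technique used by the random recommender (event frequencies, cumulative probabilities, then the inverse image of a uniform random number r) can be sketched with the standard library. The names `build_cdf`, `inverse_cdf`, and `random_event` are hypothetical, and the sample event list is made up for the example.

```python
import bisect
import itertools
import random
from collections import Counter


def build_cdf(events):
    """Return event names and their cumulative probabilities by frequency."""
    counts = Counter(events)
    names = sorted(counts)
    total = sum(counts.values())
    cumulative = list(itertools.accumulate(counts[n] / total for n in names))
    return names, cumulative


def inverse_cdf(names, cumulative, r):
    """Return the first event whose cumulative probability exceeds r."""
    idx = bisect.bisect_right(cumulative, r)
    return names[min(idx, len(names) - 1)]   # guard against float rounding at the top


def random_event(names, cumulative, rng):
    return inverse_cdf(names, cumulative, rng.random())   # r drawn from [0, 1)


names, cumulative = build_cdf(["ipl", "ipl", "ipl", "fdp"])
rng = random.Random(1)
print([random_event(names, cumulative, rng) for _ in range(5)])
```

Events are drawn in proportion to how often they appear in the dataset, without looking at the individual user, which is why this serves only as a baseline.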

A. Overview of Dataset:
In this work, the Twitter API is used to create a tweet dataset related to the latest events, with all the additional attributes (as shown in Table 05). This Twitter dataset is processed to remove stop words, null values, punctuation, and case differences.

TABLE 05: DESCRIPTIVE INFORMATION OF DATASET
S.No | Column Name | Data Type
1 | Tweet_ID | Int 64
2 | User Name | Object
3 | User Posted Tweet | Object
4 | Tweet Posted Region | Object
5 | 5.1 Likes, 5.2 Dislikes |
6 | Followers | Int 64
7 | Language of Tweet | Object

Fig 05: ROC of HTED Recommender System
Fig 06: ROC of Popularity Recommender System
Fig 07: ROC of Random Recommender System
Fig 08: Comparison of Applied Models

B. Real-Time Hashtag Based Event Detection
Hashtag-based event detection (as shown in Table 2) is applied to each selected tweet to find the events. To reduce the number of events, tags with semantically similar meanings are combined. The sample output is shown in Table 6.

TABLE 06: HASHTAG BASED EVENT DETECTION
User | Tweet | Event
Guru Charan | #IPL2020 is Live | IPL2020
Ashok Kumar | #FDP is Interesting | FDP2020

C. Effective Sentiment Analysis (Opinion Mining)
The next main objective of this work is to find the user preference for various events. In this work, an open source package named VADER, along with the NLTK package, is used to find the sentiment value of each tweet. This value in turn indicates the user preference for a particular tweet. The sample output is shown in Table 07.

TABLE 07: FINDING SENSE OF EACH TWEET
User | Tweet | Event | Sentiment
Guru Charan | IPL2020 is Live | IPL2020 | Positive
Ashok Kumar | FDP is Interesting | FDP2020 | Positive

D. Various Proposed Recommender Systems
The proposed HTED algorithm (as shown in Table 4) is executed with the user name as input and a set of recommended events as output. The output events are ranked by the predicted score for all the events specified. For comparison purposes, the whole process of tweet processing and sentiment analysis is performed, followed by the implementation of recommender algorithms using the popularity-based technique and the random event generating technique. The sample output of all three algorithms (HTED, Popularity, and Random) is shown in Table 8. In the samples shown, the proposed HTED recommends the tweet event better than the other techniques.
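One way to compare ranked recommendation lists against a user's actual events is recall at k, which underlies mean recall curves. The sketch below is illustrative only: the function names are hypothetical, matching is case-insensitive by assumption, and the example lists are the visible prefixes of the rows reported for user Charan in Table 08 (the table truncates them with an ellipsis).

```python
def recall_at_k(recommended, actual, k):
    """Fraction of the user's actual events found in the top-k recommendations."""
    wanted = {e.lower() for e in actual}
    if not wanted:
        return 0.0
    top_k = {e.lower() for e in recommended[:k]}
    return len(wanted & top_k) / len(wanted)


def mean_recall_at_k(recommended_lists, actual_lists, k):
    """Average recall at k over users."""
    scores = [recall_at_k(r, a, k) for r, a in zip(recommended_lists, actual_lists)]
    return sum(scores) / len(scores) if scores else 0.0


actual = ["IPL2020"]
hted = ["IPL2020", "climatechange", "globalwarming", "ipl", "football"]
popularity = ["attack", "climatechange", "entertainment", "ipl", "fdp"]
print(recall_at_k(hted, actual, 3), recall_at_k(popularity, actual, 3))
```

Plotting the mean of this value over users for increasing k gives a mean recall curve for each recommender.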

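The ROC curves of Figs. 05 to 08 are commonly summarized by the area under the curve (AUC). The sketch below computes AUC through the equivalent pairwise-ranking formulation and maps it to the quality bands quoted in the model performance evaluation section ([.90-1] Excellent, [.80-.90] Good, [.70-.80] Fair). The function names, the "Below Fair" label for lower values, and the example scores are assumptions; the paper does not report how its curves were computed.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive example outscores a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5          # ties count as half a win
    return wins / (len(pos) * len(neg))


def grade(auc):
    """Map an AUC value to the bands quoted in the text."""
    if auc >= 0.90:
        return "Excellent"
    if auc >= 0.80:
        return "Good"
    if auc >= 0.70:
        return "Fair"
    return "Below Fair"              # hypothetical label; the text gives no band below .70


# Hypothetical predicted scores and relevance labels (1 = relevant event).
auc = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
print(auc, grade(auc))
```

The pairwise form is quadratic in the number of examples, which is fine for a sketch; a sort-based computation would be preferable on the full dataset.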

TABLE 08: COMPARISON OF DIFFERENT RECOMMENDER SYSTEMS
User     Type         Recommended Events
Charan   Actual       [IPL2020]
Charan   HTED         [IPL2020, climatechange, globalwarming, ipl, football, …]
Charan   Popularity   [attack, climatechange, entertainment, ipl, fdp, …]
Charan   Random       [attack, entertainment, climatechange, football, …]
Ashok    Actual       [FDP2020]
Ashok    HTED         [FDP2020, climatechange, globalwarming, ipl, football, …]
Ashok    Popularity   [attack, climatechange, entertainment, ipl, fdp, …]
Ashok    Random       [nep2020, makeinindia, climatechange, attack, …]

E. Model Performance Evaluation
Performance evaluation metrics are categorized [16, 17] into statistical-accuracy and decision-support metrics. Receiver Operating Characteristic (ROC) sensitivity is employed from the decision-support metrics. ROC is used to measure the dependence of two variables without assumptions.
1) Procedure for finding a good model using ROC:
• An ML model is built using a single attribute and the class variable.
• Rank the features according to ROC.
• Pick the features with a high score.
• [.90-1] = Excellent, [.80-.90] = Good, [.70-.80] = Fair.

The Receiver Operating Characteristic (ROC) curves are drawn for all three methods at various thresholds. The area under the curve (AUC) of the proposed HTED method is 0.913 (as shown in Figure 5), indicating that it predicts user tweets more accurately, with fewer irrelevant tweets. The AUCs for Popularity-Based Recommendation and Random Recommendation are 0.8035 (Figure 6) and 0.7078 (Figure 7), respectively. The ROC plots of the three applied models show that the proposed HTED model outperforms the traditional Popularity-Based Recommendation and Random Recommendation for all users.

In recall@K, K is the number of suggestions, and each suggestion is a list of items, some of which are correct and some of which are not. In Figure 08 the three applied models are compared. All three models start at K=1. The Mean Average Recall increases up to K=2 and stays constant from K=2 onward, i.e., K=2 is the turning point. The Random and Popularity models are inferior to the proposed HTED model in terms of MAR@K. Note: if a higher percentage of events is recommended, then recall is higher. If a small percentage of events is recommended, then precision decreases automatically.

V. CONCLUSION
This research work presents a system that recommends tweets matching user preferences by extracting hashtags from live tweets and then classifying the tweets by the hashtags present in them. Classifying hashtags, and predicting hashtags through classification, may help in grasping the user's semantic-word representations. However, reaching 100% accuracy is a challenging task, as it relies on massive data. Sentiment analysis is performed on each tweet. Sentiment analysis conveys people's perspectives towards various topics and provides additional depth of understanding regarding the views of individuals associated with a particular event. The tweets are categorized as positive, neutral, and negative. Hashtag-based events are recommended to active users by using sentiment polarity scores. Experiments were carried out by extracting real-time Twitter data, and the analysis compared the ROC curves and mean recall curves of the proposed HTED with the Random and Popularity-based recommender systems. All the analyses suggest that the proposed method performs better than the other two methods. The work can be extended to make it zone-centric and generate region-wise events. By using location-dependent live events, more related events can be recommended to active users, which will improve the quality of user engagement with the predicted recommendations.

REFERENCES
[1] Ryan Hafen, Terence Critchlow, “EDA and ML – A Perfect Pair for Large-Scale Data Analysis”, IEEE 27th International Symposium on Parallel & Distributed Processing Workshops and PhD Forum, May 20-24, 2013, Cambridge, MA, Pages: 1894-1898.
[2] Surekha Samsan, “An RST based Efficient Preprocessing Technique for Handling Inconsistent Data”, 2016 IEEE International Conference on Computational Intelligence and Computing Research, Pages: 1-8.
[3] Aarzoo Dhiman & Durga Toshniwal, “An Approximate Model for Event Detection From Twitter Data”, IEEE, Volume: 8, July-2020, Pages: 122168-122184.
[4] Deepa Nagalavi and M. Hanumanthappa, “N-gram Word Prediction Language Models to Identify the Sequence of Article Blocks in English E-Newspapers”, IEEE 2016 International Conference on Computational Systems and Information Systems for Sustainable Solutions, Pages: 307-311.
[5] Shih-Feng Yang and Julia Taylor Rayz, “An Event Detection Approach Based on Twitter Hashtags”, 18th International Conference on Computational Linguistics and Intelligent Text Processing, 2018.
[6] Harshal Kapase, Kalyani Galande, Tanmay Sonna, Deepali Pawar, Dipmala Salunke, “Sentiment polarity analysis on Twitter data from different Events”, Volume: 05, Issue: 03, Mar-2018, Pages: 1479-1482.
[7] Mayuri D Malas and Madhav V. Vaidya, “Real-Time Progressive Event Summarization and Sentiment Analysis on Evolutionary Tweet Stream”, ICICCS 2017, Pages: 388-393.
[8] Rabia Batool, Asad Masood Khattak, Jahanzeb Maqbool and Sungyoung Lee, “Precise Tweet Classification and Sentiment Analysis”, IEEE 2016, Pages: 1-6.
[9] Shubham, Shashank Kumar, Sunanda Dixit, Piyush Kumar, “Classification of tweets into various categories using classification methods”, 2018, IEEE, Volume: 04, Issue: 03, Pages: 937-939.
[10] Zeynep Zengin Alp, Sule Gunduz, “Extracting Topical Information of Tweets Using Hashtags”, IEEE 14th International Conference on Machine Learning and Applications (ICMLA), 2015.
[11] Supriya Singh, Abhishek Kesharwani, “Recommender System using Sentiment Analysis”, International Journal of Advanced Research in Science, Engineering and Technology, Volume: 05, Issue: 12, Dec-2018, Pages: 7615-7619.
[12] YiBo Ren, SongJie Gong, “A Collaborative Filtering Recommendation Algorithm Based on SVD”, Third International Symposium on Intelligent Information Technology Application, 2009, Pages: 530-532.
[13] WU Yuan-hong, TAN Xiao-qiu, “A Real-time Recommender System Based on hybrid collaborative filtering”, Fifth International Conference on Computer Science & Education, August-2010, Pages: 1909-1912.


[14] Zeinab Sharifi, Mansoor Rezghi, Mahdi Nasiri, “New Algorithm for Recommender Systems based on Singular Value Decomposition Method”, Third International Conference on Computer and Knowledge Engineering (ICCKE), 2013.
[15] Eriko Otsuka, Scott A. Wallace, David Chiu, “A hashtag recommendation system for twitter data streams”, Computational Social Networks, 2016.
[16] Huang qin-hua, Ouyang wei-min, “Fuzzy collaborative filtering with multiple agents”, Journal of Shanghai University, 2007, 11(3): 290-295.
[17] Gao Fengrong, Xing Chunxiao, Du Xiaoyong, Wang Shan, “Personalized Service System Based on Hybrid Filtering for Digital Library”, Tsinghua Science and Technology, February 2007, 1-8.

AUTHORS PROFILE

Dr. P M ASHOK KUMAR is currently working as an Associate Professor in the Department of Computer Science and Engineering at KL Deemed to be University (KLEF), Vaddeswaram, Guntur, Andhra Pradesh.
Mail-id:

Mr. K GURU CHARAN is a student currently pursuing the Final Year of B.Tech (CSE) at Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh. He is doing his research work in Data Science and Big Data Analytics.
Mail-id: charan.kakaraparthi@gmail.com

Mr. K AMITH is a student currently pursuing the Final Year of B.Tech (CSE) at Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh. He is doing his research work in Data Science and Big Data Analytics.
Mail-id:

Mr. G.B.V SAI KUMAR is a student currently pursuing the Final Year of B.Tech (CSE) at Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh. He is doing his research work in Data Science and Big Data Analytics.
Mail-id:

Mr. K SAI KRISHNA is a student currently pursuing the Final Year of B.Tech (CSE) at Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh. He is doing his research work in Data Science and Big Data Analytics.
Mail-id: kittu.kothuru@gmail.com
