
Journal of Computational Science 36 (2019) 101003


Emotion and sentiment analysis from Twitter text


Kashfia Sailunaz a,b,∗, Reda Alhajj a,b
a Department of Computer Science, University of Calgary, Alberta, Canada
b Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey

Article history:
Received 4 October 2018
Received in revised form 14 February 2019
Accepted 29 May 2019
Available online 2 July 2019

Keywords:
Emotion
Sentiment
Text
Emotion models
Emotion detection
Sentiment detection
Emotion analysis
Sentiment analysis

Abstract

Online social networks have emerged as a new platform that provides an arena for people to share their views and perspectives on different issues and subjects with their friends, family, relatives, etc. We can share our thoughts, mental state, moments, and stand on specific social, national and international issues through text, photos, audio and video messages and posts. Indeed, despite the availability of other forms of communication, text is still one of the most common ways of communication in a social network. The target of the work described in this paper is to detect and analyze sentiment and emotion expressed by people from text in their twitter posts and use them for generating recommendations. We collected tweets and replies on a few specific topics and created a dataset with text, user, emotion, sentiment information, etc. We used the dataset to detect sentiment and emotion from tweets and their replies and measured the influence scores of users based on various user-based and tweet-based parameters. Finally, we used the latter information to generate generalized and personalized recommendations for users based on their twitter activity. The method we used in this paper includes some interesting novelties such as: (i) including replies to tweets in the dataset and measurements, (ii) introducing the agreement score, sentiment score and emotion score of replies in the influence score calculation, and (iii) generating general and personalized recommendations containing lists of users who agreed on the same topic and expressed similar emotions and sentiments towards that particular topic.

© 2019 Elsevier B.V. All rights reserved.

∗ Corresponding author at: Department of Computer Science, University of Calgary, Alberta, Canada.
E-mail addresses: kashfia.sailunaz@ucalgary.ca (K. Sailunaz), alhajj@ucalgary.ca (R. Alhajj).
https://doi.org/10.1016/j.jocs.2019.05.009
1877-7503/© 2019 Elsevier B.V. All rights reserved.

1. Introduction

Emotion is a very complicated, multidimensional characteristic which reflects the personality and behavioral traits of humans. In their daily life, people express their emotions on different issues, events, persons, the environment, and even every little thing surrounding them. They use various communication methods to convey their emotions to others. The most common way to reveal emotions to other people is through speech and facial expressions. But, with the progress of technology and social networks, people express their emotions to their friends using different types of social network posts. Although the advancement in technology allows users of social networking platforms to demonstrate their emotions by 'audios' and 'videos', 'text' is still the most common form of communication in social networks. People profess their emotions via their social network posts (i.e., status updates, comments, blogs, microblogs, etc.).

Researchers in various fields (i.e., psychology, business, computer science, affective computing, artificial intelligence, etc.) have been working on detecting and analyzing emotions from a piece of text. Various methods have been applied to detect the correct emotion from text. Due to the complexity of human emotions, detecting the correct emotion from text has lots of challenges yet to be addressed and overcome. Emotion recognition from text becomes more difficult when multiple emotions are expressed through a single piece of text. Sometimes emotion in a text is so implicit that it makes automatic emotion detection nearly impossible. Lots of sarcastic texts are often difficult even for other humans to recognize, let alone be detected correctly by a machine.

'Emotion' was defined in the Oxford Dictionary as 'A strong feeling deriving from one's circumstances, mood, or relationships with others' [1]. The American Psychological Association defined 'emotion' as 'A complex pattern of changes, including physiological arousal, feelings, cognitive processes, and behavioral reactions, made in response to a situation perceived to be personally significant' [2]. In general, 'emotion' is the feeling or reaction that people have to a certain event. 'Happy', 'Sad', 'Angry' and 'Fear' are a few examples of emotions which someone could express. 'Emotions' and 'sentiments' are often considered as replaceable terms, but sentiments represent a more general idea – polarity of emotion (i.e., positive, negative

or neutral) [3]. For example, if someone writes 'I am happy', then it is understood that the emotion of the person is 'happy' and the sentiment behind the emotion is 'positive'.

Twitter [4] data is a popular choice for text analysis tasks because of the limited number of characters (140) allowed and the global use of Twitter to express opinions on different issues among people of all ages, races, cultures, genders, etc. In this paper, we analyzed a Twitter network for emotion and sentiment detection and analysis. We detected the emotions and sentiments from tweets and their replies and formed an emotion network based on texts posted by users. From the emotion network, we detected influential people for both positive and negative emotions. Afterwards, we showed how influential people in an emotion network contribute to changes in emotion in the overall network. Finally, we computed a trust network based on emotion similarities and influences by generating recommendations.

Fig. 1 shows an example of a sample tweet and various types of replies to the tweet. The author of the tweet expressed his surprise on an issue and some people replied agreeing to it, like the second and third replies. Some people disagreed with the tweet and showed disgust or anger, like the fourth reply. Some replies, like the first one, tried to answer the tweet with logic, and some other replies expressed no particular emotions at all.

As no existing Twitter datasets were found containing both tweets and their replies, we collected text from tweets and replies on specific recent topics to create our customized dataset. We collected both text-based and user-based data for our dataset. After passing them through a few preprocessing and cleaning phases, we generated a structured dataset. We annotated our text according to their emotions and sentiments. The annotated datasets were then used to train and test a Naïve Bayes classifier for supervised classification of emotions and sentiments. Text-based parameters were then merged with user-based parameters to detect influential users of emotion and sentiment networks. Our recommender system used k-means clustering to create clusters of users based on their emotions and sentiments. Finally, the recommender system used influence scores of users to provide generalized and personalized recommendations for users.

Researchers have been using the Twitter network for various calculations and analyses for a long time. Emotions and sentiments of tweets, influential user detection (using retweets, posts, favorites, etc.), recommendation generation (based on Twitter posts) and user influence have been investigated by different researchers who have applied various methods. In this paper, we introduced some new ideas and incorporated them with existing ones. Inclusion of replies to tweets and reply-based parameters are the major novelties of this paper. We have included emotion and sentiment expressed in replies and also considered the agreement score (i.e., whether the reply agreed with the original tweet or not), the sentiment score (i.e., whether the reply sentiment was the same as the original tweet sentiment or not) and the emotion score (i.e., whether the reply emotion was similar to the original tweet emotion or not). The equations used in this paper include these new parameters with some existing parameters to calculate influence scores of users, which later propagated to recommendation generation.

The rest of this paper is organized as follows. Section 2 mentions some existing research achievements on emotion detection and analysis. Section 3 discusses the methodologies used for emotion and sentiment analysis from texts. Section 4 explains the experiments conducted and describes the results achieved. Section 5 presents conclusions and future research directions.

2. Related works

Researchers of religion, psychology and philosophy have been discussing emotions since the early stages of those research domains [5,6]. Darwin experimented and commented on the relation between emotion and biological evolution in 1872 [7]. Afterwards, neuroscientists discovered that emotions may be anticipated as the outcomes of some functions of the neural system [8]. With the advancement in technology, emotion analysis became a research field for information technology. Since human emotion is one of the most important contributors to human behavior, it became necessary to find a way to recognize emotion automatically using machines. Current technology is associated with lots of tools and applications to detect human emotion from speech, facial

Fig. 1. Sample tweet and comments.



Fig. 2. Dataset creation workflow.

Fig. 3. Sample tweet dataset.

Table 1
Tweet attributes.

created_at, id, id_str, text, source, truncated, in_reply_to_status_id, in_reply_to_status_id_str, in_reply_to_user_id, in_reply_to_user_id_str, in_reply_to_screen_name, user, coordinates, place, quoted_status_id_str, is_quote_status, quoted_status, retweeted_status, quote_count, reply_count, retweet_count, favorite_count, entities, extended_entities, favorited, retweeted, possibly_sensitive, filter_level, lang
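For illustration, the Table 1 fields can be pulled out of the JSON object a crawler receives for each tweet. The field subset kept here and the sample payload are assumptions for the sketch, not values from the paper:

```python
import json

# A subset of the Table 1 tweet attributes used downstream (illustrative choice).
TWEET_FIELDS = ["created_at", "id_str", "text", "in_reply_to_status_id_str",
                "retweet_count", "reply_count", "favorite_count", "lang"]

def extract_tweet(payload):
    """Keep only the selected Table 1 attributes from a raw tweet JSON string."""
    obj = json.loads(payload)
    return {f: obj.get(f) for f in TWEET_FIELDS}

# Hypothetical sample payload, shaped like a Table 1 tweet object.
sample = json.dumps({"created_at": "Sun Feb 25 12:00:00 +0000 2018",
                     "id_str": "1", "text": "#WomensDay equality now",
                     "retweet_count": 3, "favorite_count": 5, "lang": "en"})
row = extract_tweet(sample)
```

Missing fields (here, the reply attributes of an original tweet) come back as None, which keeps the dataset rows uniform.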

Fig. 4. Sample reply dataset.

Fig. 5. Sample retweet dataset.

Fig. 6. Sample user dataset.

Table 2
User attributes.

id, id_str, name, screen_name, location, url, description, derived, protected, verified, followers_count, friends_count, listed_count, favourites_count, statuses_count, created_at, utc_offset, time_zone, geo_enabled, lang

Fig. 7. Preprocessing workflow.

Fig. 8. Classification & influence score calculation workflow.

expressions, gestures, writing and other forms of communication. In this paper, we will concentrate on text emotion analysis.

Emotion detection from text like paragraphs, books, song lyrics, play scripts, blogs, microblogs, Facebook posts, tweets and product reviews has been evaluated by affective computing researchers for a long time. Some existing surveys on text emotion detection and analysis summarize the works done in this field over different time-lines, describing existing emotion detection methods, approaches, datasets, experiments and outcomes [9–12]. A detailed survey on emotion analysis from text, covering the evolution of emotions, emotion models, emotion detection methods, existing datasets and challenges of emotion analysis, was presented in [13]. Some researchers proposed their own approaches together with a detailed survey on textual emotion [14–17]. Different approaches were applied for text emotion detection by researchers. The most common methods described in the literature are 'Keyword-based', 'Lexicon-based', 'Machine-learning' and 'Hybrid' methods. A few researchers tried to detect emotion with linguistic rule-based methods [18], natural language processing [19], case-based reasoning [15] and some unique approaches. Keyword-based methods detect emotion by looking for a match between words in a piece of text and emotion keywords [14]. Lexicon-based methods use an emotion lexicon to detect the correct emotion in a piece of text [20]. Machine-learning methods use both supervised [21,22] and unsupervised [23,24] learning for emotion detection using various existing classification and clustering methods. Hybrid methods merge more than one of the above techniques and apply them to recognize text emotion [14,25–27].

Emotion is generally defined and described by different emotion models. Researchers have mentioned different dimensions of human emotions from various perspectives. All existing emotion models can be divided into 'Categorical' and 'Dimensional' models [28]. Categorical emotion models like Ekman's model [29], Shaver's model [30] and Oatley's model [31] categorize all human emotions into a few major classes (i.e., Anger, Disgust, Fear, Joy, Love, etc.). On the other hand, dimensional emotion models like Plutchik's model [32], the Circumplex model [33], the OCC (Ortony, Clore, and Collins) model [34] and Lövheim's model [35] classify emotions in detail using multiple dimensions (i.e., valence, arousal, dominance, etc.) and intensities (i.e., basic, mild, intense, etc.) of emotion. The data structures underlying these models are mainly based on lists, trees, wheels, cubes, etc.

In recent textual emotion analysis studies, social network posts are being used for text emotion analysis due to the huge number of participants and posts. Almost 2.46 billion people are active in various social networks and they are members of one or more social networking platforms like Facebook, Twitter, Instagram, YouTube, etc. [36]. Finding influential people and groups in any social network can be significant to detect and control information flow. Haewoon et al. [37] analyzed twitter and generated the first quan-

Fig. 9. Connections between a tweeter and retweeters and commenters.

Fig. 10. Sentiments of user and commenter nodes.

Fig. 11. Emotions of user and commenter nodes.



Fig. 12. Suggestions of connections between similar emotion nodes.
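The idea behind Fig. 12 — suggesting connections between users who are not yet linked but expressed the same emotion — can be sketched with a plain adjacency set. The users, emotion labels and existing edges below are hypothetical:

```python
# Hypothetical emotion labels: U is the tweeter, R* retweeters, C* commenters.
emotion = {"U": "surprise", "R1": "surprise", "C1": "joy",
           "C2": "surprise", "C3": "anger"}
# Existing tweet/retweet/reply links in the network.
edges = {("U", "R1"), ("U", "C1"), ("U", "C2"), ("U", "C3")}

def suggest_links(emotion, edges):
    """Suggest connections between unlinked users who share the same emotion."""
    users = sorted(emotion)
    out = set()
    for i, a in enumerate(users):
        for b in users[i + 1:]:
            linked = (a, b) in edges or (b, a) in edges
            if not linked and emotion[a] == emotion[b]:
                out.add((a, b))
    return out

suggested = suggest_links(emotion, edges)
```

Here the only suggestion is a link between the retweeter and the commenter who both expressed surprise, since each is already linked to the tweeter.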

Table 3
Sentiment and emotion classification accuracy for the Naïve Bayes classifier (in percentage – %).

k-fold    Full text sentiment   NAVA sentiment   Full text emotion   NAVA emotion   ISEAR [48] full text emotion   ISEAR [48] NAVA emotion
3-fold    62.69                 55.96            44.37               40.45          52.34                          53.37
5-fold    63.71                 58.33            46.32               41.29          54.01                          54.82
10-fold   66.86                 61.15            47.34               43.24          55.11                          56.51
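The accuracies in Table 3 come from a word-feature Naïve Bayes classifier under k-fold cross validation. A minimal multinomial Naïve Bayes with Laplace smoothing and a contiguous k-fold split can be sketched as follows; the toy documents and labels stand in for the annotated tweets and are invented for illustration:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Multinomial Naive Bayes with word-count features."""
    prior = Counter(labels)
    counts = defaultdict(Counter)   # class -> word -> count
    vocab = set()
    for toks, y in zip(docs, labels):
        counts[y].update(toks)
        vocab.update(toks)
    return prior, counts, vocab

def predict_nb(model, toks):
    prior, counts, vocab = model
    n = sum(prior.values())
    def logpost(y):
        denom = sum(counts[y].values()) + len(vocab)   # Laplace smoothing
        return math.log(prior[y] / n) + sum(
            math.log((counts[y][t] + 1) / denom) for t in toks)
    return max(prior, key=logpost)

def kfold_accuracy(docs, labels, k):
    """Average accuracy over k contiguous folds (a simple split for the sketch)."""
    size = len(docs) // k
    accs = []
    for i in range(k):
        lo, hi = i * size, (i + 1) * size
        model = train_nb(docs[:lo] + docs[hi:], labels[:lo] + labels[hi:])
        test = list(zip(docs[lo:hi], labels[lo:hi]))
        accs.append(sum(predict_nb(model, d) == y for d, y in test) / len(test))
    return sum(accs) / k

# Toy data standing in for annotated tweets (invented for illustration).
docs = [["love", "this"], ["great", "day"], ["hate", "this"], ["awful", "day"]] * 3
labels = ["positive", "positive", "negative", "negative"] * 3
model = train_nb(docs, labels)
```

The same trainer serves both tasks in the paper: one run with sentiment labels and one with emotion labels.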

Table 4
Sentiment and emotion classification accuracy comparison (in percentage – %).

Text type   Naïve Bayes (Sentiment)   SVM (Sentiment)   Random Forest (Sentiment)   Naïve Bayes (Emotion)   SVM (Emotion)   Random Forest (Emotion)
Full text   66.86                     23.32             55.23                       47.34                   14.48           35.66
NAVA text   61.15                     23.32             52.01                       43.24                   14.48           37.26

Fig. 13. Pairwise comparison scale in AHP.

titative study on twitter. They crawled twitter in July 2009 and got information for more than 41 million users along with the following/follower relations. They computed the distributions of users, their followers, the people they were following, recent trends, separation, homophily and diffusion of information in twitter. Their analysis showed noticeable differences between twitter and other social networks. Because of the one-directional relation between users in twitter, it defied the power law and the distributions were not very structured. The list of top-ranked users also differed according to the method of computation: number of followers, PageRank, number of retweets, etc. They also mentioned and calculated the active period of trends, which was less than a week in most cases. Their retweet trees and temporal analysis of retweets showed the diffusion of information and its difference from other media.

Riquelme et al. [38] conducted a detailed survey on twitter activities, influences and popularity of users. They showed a classification of different measures of twitter data based on various metrics and content. The relationship between users and tweets was discussed in detail. Twitter metrics and their time complexity were explained with a detailed anatomy of twitter. They mentioned existing methods of finding influential users and experiments done on twitter data. Finally, some open research problems were stated for researchers to analyze. T-PICE [39] is an influential community extraction system which identifies influential people by classifying people into personality classes based on their tweets on a specific topic. NavigTweet [40] is a visualization tool which finds influential people from the friend list of a user by analyzing content of tweets that are common between them. An analysis of tweets about two car companies to find influential users led to some hypotheses in [41]. It presented some interesting patterns of influential people in a social network and some textual characteristics like their use of words, hashtags, links, expressions, etc.

Joshi et al. [42] experimented on a Twitter dataset to classify sentiments of tweets by applying different machine-learning algorithms. They compared the results of simple positive-negative word counting, Naïve Bayes, and Maximum Entropy. The two machine learning techniques performed better than the baseline method in terms of accuracy for detecting correct sentiment. Unlike this approach, emoticons in Twitter text were used for emotion analysis in [43], where the author discussed sentiments and emotions in detail while mentioning existing emotion models. Symbol-based sentiment and emotion classification combined with lexicon-based unsupervised classification were discussed with possible future directions. Emotion contagion in Facebook was analyzed by Coviello et al. [44]. Facebook posts were analyzed using the instrumental variable regression method to detect positive and negative emotions of people on rainy days, together with the effect of one's emotion on his/her friends' emotions. Detecting influential people from a social network is a very lucrative research area for referral

Fig. 14. Recommender system workflow.

marketing to spread information related to any product and reach the highest possible number of nodes of a network [45].

3. Methodology

3.1. Data collection

Twitter is one of the most popular social networking platforms nowadays, with 330 million monthly active users [46]. People express their opinions about their daily lives, different social/national/international issues, etc. They share their views within 140 characters of text and sometimes also share audio/video files. Posts are called tweets and they are public. Other people can like posts, comment on them, or retweet them. People can follow each other or can be friends with each other in twitter. Unlike most other social networking platforms, Twitter allows one-directional links, which means one user can follow another user without the latter user reciprocating the communication. These interactions lead to a network of communication.

The dataset we used for our experiments contains a collection of tweets, comments, retweets and their user information. A number of text datasets for emotion and sentiment analysis like 'Emotion in Text data set [47]', 'ISEAR [48]', 'SemEval [49]', 'EmoBank [50]', 'TREC [51]', etc. were used in related works. But the existing datasets of twitter couldn't be used for our work, because most existing datasets have either the tweets or the friend/follower connections between users. For our experiments, we needed an emotion network based on the text of the users, not just the information about who is following whom. We also needed the replies to those tweets and the repliers'/commenters' information. For our emotion network, we needed the connections between users based on their emotion on a specific issue. For our experiments, we picked a few recent events and issues to collect tweets with various emotions. Search keywords were: #Syria, #DonaldTrump, #SchoolShooting, #Christmas2017, #NewYear, #ValentinesDay2018, #Terrorism, #olympicgames2018, #WomensDay, #Oscars2018.

The dataset was prepared in a few steps: (i) collecting random tweets on a keyword, (ii) collecting user information (user ID, location, gender, number of posts, number of followers, number of followees, number of likes), (iii) collecting comments/replies on each tweet, (iv) collecting user information of the commenters, (v) collecting retweets on the tweet, (vi) collecting user information of each retweeter. These steps were repeated for all keywords.

We faced a few challenges while collecting data from twitter. These include: (i) Some tweets shared photos and videos and didn't mention much in the text. (ii) Even for tweets which were expressed as text in English, lots of comments were in other languages. (iii) Lots of comments had no text; they just shared photos or videos. (iv) In some cases, lots of comments were posted by a tweeter as replies to commenters. Some people replied to each comment on their tweet, making the number of comments on their tweets double. (v) Some comments on some tweets were not from a person, but from accounts of news channels or business persons. Those comments were basically advertisements of some news or product. For example, in the comments of #WomensDay tweets, a few advertisement comments were from some news media which work for gender equity, a few comments were from business

Fig. 15. Users and their influence scores.

Fig. 16. General recommendation according to sentiment.



Fig. 17. General recommendation according to emotion.
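The general recommendations in Figs. 16 and 17 group users who share similar sentiment and emotion; the paper uses k-means clustering for this step. A self-contained sketch is given below, where the two-dimensional feature encoding (sentiment polarity, emotion id) and the sample users are assumptions for illustration:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: assign each point to the nearest center, recompute centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Keep the old center when a cluster comes up empty.
        centers = [tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centers[j]
                   for j, cl in enumerate(clusters)]
    return clusters

# Hypothetical users encoded as (sentiment polarity, emotion id).
users = [(1.0, 3.0), (1.0, 3.5), (-1.0, 0.0), (-1.0, 0.5)]
clusters = kmeans(users, k=2)
```

Each resulting cluster corresponds to a group of users with similar sentiment and emotion on a topic, from which the recommendation lists are drawn.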

Table 5
AHP comparison matrix and weight vector for user-based parameters.

        TS_U   FS_U   OS_U   LS_U   Weight
TS_U     1     1/5     7      7     W_T = 0.292
FS_U     5      1      7      7     W_F = 0.586
OS_U    1/7    1/7     1      1     W_O = 0.061
LS_U    1/7    1/7     1      1     W_L = 0.061

Table 6
AHP comparison matrix and weight vector for tweet-based parameters.

        RS_T   CS_T   PS_T   AS_T   SS_T   ES_T   Weight
RS_T     1      7      3     1/3     7      7     W_R = 0.272
CS_T    1/7     1     1/7    1/7     1      1     W_C = 0.040
PS_T    1/3     7      1     1/3     7      7     W_P = 0.204
AS_T     3      7      3      1      7      7     W_A = 0.395
SS_T    1/7     1     1/7    1/7     1      3     W_S = 0.053
ES_T    1/7     1     1/7    1/7    1/3     1     W_E = 0.035
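The weight vectors in Tables 5 and 6 follow from the pairwise comparison matrices. The paper does not state which AHP prioritization method it used; the geometric-mean method below is one standard approximation of the principal eigenvector, and it reproduces the Table 5 weights to within a few hundredths:

```python
from math import prod

def ahp_weights(matrix):
    """Approximate AHP priority vector via the geometric-mean method:
    geometric mean of each row, normalized to sum to 1."""
    n = len(matrix)
    gmeans = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(gmeans)
    return [g / total for g in gmeans]

# Pairwise matrix for the user-based parameters (TS_U, FS_U, OS_U, LS_U), Table 5.
M = [[1,   1/5, 7, 7],
     [5,   1,   7, 7],
     [1/7, 1/7, 1, 1],
     [1/7, 1/7, 1, 1]]
w = ahp_weights(M)
```

The resulting vector is approximately (0.27, 0.61, 0.06, 0.06), close to the published (0.292, 0.586, 0.061, 0.061); the small gap is the usual difference between the geometric-mean approximation and the exact eigenvector.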

accounts with advertisement of their beauty products, and so on. (vi) Most users didn't share their location. (vii) A few people had thousands of tweets, but none of them were original; they just retweeted others' tweets. (viii) A few replies didn't have any text other than a few mentions of some accounts. (ix) A few replies just had the same hashtagged word as the original tweet and nothing else. (x) A few tweets and replies just stated some facts without expressing any sentiments or emotions. (xi) A few replies just had emojis and no text. (xii) Some repliers responded by asking questions which didn't express any emotion. (xiii) Some responses were totally random and out of context.

We used both the twitter API and web-page scraping for data collection. Table 1 shows the attributes of a tweet and Table 2 shows the attributes of a user object that can be extracted using the twitter API. There are a few profile images and background images available which were not mentioned in the tables. Due to the rate limit of the twitter API, 15 API calls are allowed every 15 min, and this restricted the amount of collected data. We collected and annotated 7246 tweets and replies from 25 February 2018 to 8 March 2018. According to the tweets and replies, we collected information for 3607 users. As we manually annotated each tweet and reply according to emotion, sentiment and agreement score, the dataset had a limited amount of data. The text was labeled using 'Agreed', 'Disagreed' and 'Random' as agreement value based on whether the reply text agreed with the original tweet or not. The text was labeled with proper sentiment using 'Positive', 'Negative' and 'Neutral' and with specific emotion 'Anger', 'Disgust', 'Fear', 'Joy', 'Sadness', 'Surprise' and 'Neutral'. The absence of any other reply data, and the fact that user sentiment, emotion and agreement were inferred by the annotator through reading and manual annotation, caused the lack of a proper distribution of data

Fig. 18. Personalized recommendation.

among all sentiments and emotions. In this paper, we tried to take the first step towards a personalized social network recommender by proposing a new approach for a Twitter emotion and sentiment network.

3.2. Pre-processing

Fig. 2 shows the work-flow of creating the dataset. First, we used a crawler to find original tweets on a specific topic. We filtered the tweets while collecting them and accepted only original tweets (not retweets) on the topic. The first crawling returned unique tweet IDs and tweet text, which were applied on another crawler to find the number of likes, retweets and the time of the tweet. Another crawler returned retweeters' user IDs. No direct functions exist for extracting replies to a tweet. Thus, we started collecting replies to a tweet using a different crawler. The latter crawler took the tweet ID and user ID and checked the text directed to the user since the time of the tweet. It collected only texts for which the parameter 'in_reply_to_status_id' was true. As this was not using any direct function, collecting the replies to a tweet took most of the time.

To enhance the speed of collecting replies to tweets, we used a Web page scraper where replies were collected by scraping the tweet page. For our experiments, we also had to collect user information. So we used a Web page scraper to extract users' data from their twitter Web page for original tweet users, retweeters and repliers. User attributes collected were the number of tweets, likes, followers, followees and location (when available). In our final dataset, we combined all user data and tweet data. Fig. 3, Fig. 4, Fig. 5 and Fig. 6 are samples of the tweet, reply, retweet and user datasets, respectively.

Data preprocessing included cleaning the collected data and annotating the data according to sentiments and emotions. Tweets and comments had lots of unnecessary symbols and noise. Fig. 7 shows the steps of preprocessing. The data cleansing process followed these steps: (i) all user mentions (i.e., @alice) were removed from the text and added as a connection between users in the network; (ii) all hashtags (only the # symbol) were removed; (iii) all emoticons were removed (i.e., :-), :-( etc.); (iv) all multiple occurrences of the same letter were removed (i.e., 'wowwwwwww' became 'wow'); (v) all non-alphanumeric characters (not in A–Z or a–z or 0–9) were removed (i.e., ?, ,, ., ;, ! etc.); (vi) all URLs were removed (i.e., http://a.com). Clean tweets and replies were then annotated.

All tweets and comments were annotated with positive, negative or neutral sentiments. For our experiments, we used Ekman's emotion model [29], which lists six basic human emotions – Anger, Disgust, Fear, Joy, Sadness and Surprise. All tweets and comments

Fig. 19. Retweet network based on degree.



Fig. 20. Sample retweet network.
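The cleansing steps (i)–(vi) listed in Section 3.2 can be sketched with regular expressions. The stop-word list here is a small illustrative sample, the emoticon pattern covers only a few common forms, and NAVA filtering would additionally need a POS tagger:

```python
import re

# Small sample stop-word list (the paper removes common English stop words).
STOPWORDS = {"am", "as", "the", "is", "a", "an", "to", "of"}

def clean_tweet(text):
    text = re.sub(r"@\w+", "", text)            # (i) drop user mentions
    text = text.replace("#", "")                # (ii) drop the # symbol, keep the tag word
    text = re.sub(r"https?://\S+", "", text)    # (vi) drop URLs before symbol stripping
    text = re.sub(r"[:;=]-?[)(DPp]", "", text)  # (iii) drop common emoticons like :-) :-(
    text = re.sub(r"(.)\1{2,}", r"\1", text)    # (iv) collapse repeats: wowwwwwww -> wow
    text = re.sub(r"[^A-Za-z0-9 ]", " ", text)  # (v) drop non-alphanumeric characters
    return re.sub(r"\s+", " ", text).strip()

def tokenize(text):
    """Clean, lowercase, split, and remove stop words (pre-POS-tagging stage)."""
    return [w for w in clean_tweet(text).lower().split() if w not in STOPWORDS]
```

For example, clean_tweet("@alice wowwwwwww #NewYear :-) http://a.com") yields "wow NewYear", matching steps (i)–(vi) above.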

were annotated using these six emotions. Replies on tweets were also annotated with agreed or disagreed based on their agreement or disagreement with the tweet. After annotating tweets and replies, we used Natural Language Processing (NLP) to process the text. Words of the sentences were tokenized. Then common English stop words (i.e., am, as, the, etc.) were removed from the text and words were tagged according to their corresponding Parts-of-Speech (POS) using a POS tagger. Among all words, only Noun, Adjective, Verb and Adverb (NAVA) words were collected, as they contribute most in a sentence. Finally, we extracted the preprocessed text. For comparison in the classification phase, we stored both the NAVA text and the cleaned complete text.

3.3. Sentiment and emotion detection

In previous studies of emotion detection from text, different methods were used by various researchers. In most recent research, machine learning was used with both supervised [52,53] and unsupervised [24,54] classifiers. Machine learning is a better option for a larger dataset, and training a classifier is a more generalized approach than creating emotion word dictionaries. Naïve Bayes worked better than other machine learning approaches described in the literature for sentiment and emotion detection from text. Therefore, for our experiments, we chose the Naïve Bayes algorithm to classify tweets into their corresponding sentiment and emotion classes.

Fig. 8 shows a work flow of the classification of preprocessed text and the calculation of influence scores to generate recommendations. The preprocessed text (both tweets and replies) was divided into training and test sets for emotion and sentiment classifications. Then, emotion and sentiment were fed into the calculation of the "user influence score", which also used the agreement score for replies (i.e., whether the reply agreed with the original tweet or not). The influence score used all these values along with the numbers of tweets, retweets, followers and followees to generate the final influence score. The recommender system took the score and generated a recommendation as a list of users who shared similar emotion and sentiment on a specific topic.

We applied the Naïve Bayes algorithm with 3-fold, 5-fold and 10-fold cross validation on our preprocessed text (tweets and replies). We applied the classification on both the NAVA text and the complete clean text. The feature set used for Naïve Bayes includes words from each tweet and reply. We applied the classifier on the same datasets twice: (i) for classifying them according to their sentiments and (ii) for classifying them according to their corresponding emotion.

3.4. Influential users detection

In most of the previous works, researchers counted the number of followers to find influential users on twitter. The idea is based on the common statement that "the more followers a person has, the more influential he/she is". Some researchers ranked users based on their total number of tweets [55] or the summation of the total number of tweets and the number of retweets of those tweets [56]. To find the most influential users on a specific topic, the number of tweets, the number of replies by the author, and the number of retweets on a specific topic were used and normalized with the total number of tweets on that topic [57]. Follower and following counts were also used to find influential users. The ratio of the number of followers to followees [58], and the ratio of followers to the summation of followers and followees [55,38], were used to compute how influential a user is in twitter. In [59], the reciprocal relationship between users was also counted in case the user had fewer followers than the number of people he/she followed.

The Acquaintance Affinity Identification (AAI) score was calculated using the number of followers, the number of users mentioning the author, the number of users who replied to the author's tweets, and the number of users who

retweeted the author's tweets [60]. Various common centrality measures, such as betweenness and closeness centrality, were used to find influential users in a network [61], while a few researchers devised their own centrality indexes, such as an H-index that considers a maximum H-value over replies, retweets and likes [62,63]. Google's PageRank algorithm was modified as TunkRank [64] and UserRank [65], using followers, followees and tweets to rank Twitter users. Lahuerta-Otero et al. [41] used a regression model with lexical diversity in tweets, average number of characters per tweet, hashtags, user mentions, URLs, followees, and positive and negative sentiments to extract influential users from Twitter. The exponential of the number of followers was used in [66] to find popular users. Users were ranked by both tweet-based and user-based scores by Francalanci et al. [40], who used favorites, followers, following, listed and tweets for the user-based score, and favorites, retweets, URLs, hashtags and mentions for the tweet-based score. Each parameter was assigned a weight based on the Analytic Hierarchy Process (AHP).

Fig. 21. Sample reply network.

Fig. 22. Tweet sentiment network for #Oscars2018.

In this paper, we considered a few additional factors in our calculations. One person can influence another both positively and negatively. Normally, when someone likes or retweets another person's tweet, he/she agrees with that tweet. But a person can also comment on another person's tweet and express either support or disagreement. We accounted for this by basing the influence calculation on the following factors: (i) number of retweets, (ii) number of likes, (iii) number of comments that agree with the tweet or show similar sentiment/emotion, (iv) number of comments that disagree with the tweet or show opposite sentiment/emotion, and (v) number of people who follow the person.

Fig. 9 represents sample connections between the author of the original tweet (U), retweeters (Ri) and commenters (Ci) who replied to the original tweet of user U, with i = 1, . . ., 5. Retweeters are mostly influenced by the tweeter, so U influences all Ri nodes. In real-life scenarios it is not unusual for someone to retweet a tweet without necessarily agreeing with it, but in most cases retweeting implicitly signals agreement, which means that the original tweeter has directly or indirectly influenced the retweeter. The commenters, on the other hand, can post agreement, disagreement or unrelated text as replies. Fig. 10 shows the sentiments of the comments as positive, negative or neutral by assigning an individual color to each sentiment. For example, if user U expressed negative sentiment in his/her tweet, then some commenters like C2 and C4 would express the same negative sentiment in their replies, whereas commenters like C1 and C3 would express positive sentiment. There would also be commenters like C5 who express no particular sentiment in their replies. One thing to remember here is that even when nodes express the same sentiment on a certain issue, they can still have different emotions. For example, about some horrific incident one person can express anger whereas another expresses disgust, though anger and disgust are both negative sentiments. Fig. 11 represents the emotions of the commenter nodes in this example. Here, let us assume that the original tweet by user U expressed 'Anger' as an emotion and that commenter C4 expressed exactly the same emotion. It is quite possible that commenter C2 would express a similar sentiment but a different emotion, such as 'Disgust'. Commenters C1 and C3 would express 'Happiness', and commenter C5 would not express any specific emotion at all. Fig. 12 represents possible networks within the Twitter network and suggestions that let users know about people who are not directly connected to them but share the same emotion on some social, national or international issue. For example, for any specific topic, after analyzing the tweets and the replies to those tweets, we can generate smaller emotion and sentiment networks within the whole network. If, for a topic, users U1, U5, U7 and commenter C4 expressed 'Anger', then each of them would have the others in their recommended users list, since they expressed exactly the same emotion on that topic, and they would be considered nodes of a hidden emotion network within the complete network. Similarly, U3, U8 and C2 would form a 'Disgust' network; C1, C3 and C5 would create a 'Happiness' network; and U2 and C9 would form a 'Fear' network. These smaller hidden networks generate the recommendation lists based on the influence scores of the users and commenters.

Due to the variety of parameters and metrics of Twitter data, there is no single standard equation for measuring user influence on Twitter. Researchers have used different measurements depending on the perspectives and statements of their research problems. We modified existing measurements according to our problem statement and measured two types of influence score (IS) for a user, using the equations below with various user-based and tweet-based parameters. For the "Agreement Score", as mentioned in the equation, we assigned either 1 or −1. In the manual annotation process, each reply was labeled 'Agree', 'Disagree' or 'Random' and assigned a score of 1, −1 or 0, respectively: if the reply text agreed with the original tweet text, the label is 'Agree'; if it did not, the label is 'Disagree'; and if it was irrelevant, the label is 'Random'. Original tweets always carry the label 'Agree', because a tweeter would normally agree with his/her own tweet while tweeting. Similarly, if a tweet and a reply expressed the same sentiment/emotion the score is 1, and if they differed the score is −1. The W values are the weights assigned to each parameter, calculated using the Analytic Hierarchy Process (AHP) [67] in the following steps. (i) Parameters were divided into two groups: (a) user-based parameters (number of tweets, followers, followees and likes) and (b) tweet-based parameters (number of retweets, comments and likes, plus the agreement, sentiment and emotion scores). (ii) For each group, we made a pairwise comparison between every combination of two parameters using the scale shown in Fig. 13, where P1 and P2 are any two parameters and the values from 1 to 9 measure their relative importance. (iii) We created a separate comparison matrix for each group. (iv) After computing the eigenvectors of both matrices, the normalized eigenvectors were used as priority matrices for the weight vectors. We calculated a general influence score IS_U for each user based on the user-based parameters; another, topic-specific influence score IS_T was calculated by merging the user-based and tweet-based parameters.

User-based parameters:

  TS_U (Tweet Score)    = W_T × #Tweets
  FS_U (Follower Score) = W_F × #Followers
  OS_U (Followee Score) = W_O × #Followees
  LS_U (Like Score)     = W_L × #Likes

Tweet-based parameters:

  RS_T (Retweet Score)   = W_R × #Retweets
  CS_T (Comment Score)   = W_C × #Comments
  PS_T (Like Score)      = W_P × #Likes
  AS_T (Agreement Score) = W_A × AgreementValue
  SS_T (Sentiment Score) = W_S × SentimentValue
  ES_T (Emotion Score)   = W_E × EmotionValue

  AgreementValue = 1 if the reply agrees with the tweet; −1 if the reply disagrees with the tweet
  SentimentValue = 1 for similar sentiment; −1 for different sentiment
  EmotionValue   = 1 for similar emotion; −1 for different emotion

The two influence scores (IS) are:

  IS_U = TS_U + FS_U + OS_U + LS_U

  IS_T = IS_U + RS_T + CS_T + PS_T + AS_T + SS_T + ES_T

To normalize the IS values, we converted them into percentages using the following formula, where MIN and MAX are the minimum and maximum values of IS among all users in the dataset:

  Normalized IS = (IS − MIN) / (MAX − MIN) × 100

3.5. Recommendation

Our recommender system provided two different types of recommendation, on both emotion and sentiment. (i) General recommendation: a common recommendation for a specific topic that lists the top ten user IDs for each sentiment and emotion. (ii) Personalized recommendation: for each user, a recommendation of the top ten user IDs who share a similar emotion and sentiment on a specific topic. The general recommender found users who interacted (tweeted, replied, retweeted) on a specific topic. Each of those users expressed some emotion and sentiment regarding that topic, so we applied k-means clustering separately for sentiment and for emotion. For sentiment, we formed three clusters (positive, negative and neutral). Each user was assigned to a cluster according to his/her sentiment on the topic; then all users of each cluster were sorted by influence score and the ten users with the highest scores were recommended. We repeated the same procedure for emotion, with seven clusters: anger, disgust, fear, joy, sadness, surprise and neutral. Fig. 14 shows the workflow of the recommender system.

4. Experimental results

The experiments were conducted in multiple steps. After collecting the data, we cleaned and preprocessed the text. Then we applied a Naïve Bayes classifier to assign each text its corresponding sentiment and emotion class, using 3-fold, 5-fold and 10-fold cross validation. We ran the sentiment and emotion classifier on the cleaned full text and on the NAVA words separately to measure classification accuracy. Table 3 shows the accuracy results (in percent) for sentiment and emotion classification, for both the full text and the NAVA-only text. The table also shows the emotion classification accuracy of Naïve Bayes on the benchmark ISEAR dataset [48]. The emotion classes of the new dataset and the ISEAR dataset have five emotions in common; instead of 'surprise', ISEAR includes 'shame' and 'guilt'. The ISEAR accuracy is included to show how the results of the same classifier vary across two datasets with slightly different emotion sets.

According to the results, using the full text for classification increased accuracy by at least 5% over using only NAVA words. We also observed that classification accuracy increased gradually as we increased the number of folds in k-fold cross validation. Another interesting observation concerns the difference between sentiment and emotion classification accuracy. Sentiment classification has 3 classes (positive, negative, neutral), whereas emotion has 7 (anger, disgust, fear, joy, sadness, surprise and neutral); as the number of classes increased, the emotion classifier's accuracy dropped slightly compared with the sentiment classifier. As expected from a text-based classifier, accuracy levels were not very high. The reason lies in the structure of the data: for Twitter data (tweets and replies), finding proper, grammatically correct sentences with appropriate parts of speech was difficult. On the other hand, for the benchmark ISEAR dataset, whose texts are properly formed, meaningful sentences, classification accuracy was higher than for the Twitter data. The distribution of emotions in the dataset was another reason for the lower accuracy: since tweets and replies were collected randomly, the emotions were not well distributed in the dataset, which lowered classification accuracy.

Table 4 compares the accuracy of three classifiers: Naïve Bayes, Support Vector Machine (SVM) and Random Forest. Although the existing works and surveys discussed in Section 2 already suggested that Naïve Bayes is one of the most commonly used classifiers for sentiment and emotion classification, we still needed to check whether that holds for our dataset. After applying these three supervised classifiers to our dataset and computing classification accuracy, we can see that Naïve Bayes gives the best results for both the full text and the NAVA words, for both sentiments and emotions. Classification accuracy decreased by at least 10% for Random Forest, and SVM performed worst on our dataset. This comparison justifies choosing the Naïve Bayes classifier for our experiments.

The next step was measuring the influence score of users. We used AHP to calculate the weight vector. After computing the comparison matrices and calculating their eigenvectors, the eigenvectors were normalized to obtain the priority matrices. The comparison matrices and the weight vectors calculated from them are shown in Table 5 and Table 6 for the user-based and tweet-based parameters, respectively.

The agreement score represents whether a reply agreed or disagreed with its tweet. To apply the influence score formula to the whole dataset (both tweets and replies), we assigned 1 (agreement) as the AgreementValue of original tweets. This inclusion is logical because a user who tweeted something obviously agrees with his/her own tweet. After the calculation, we normalized the influence scores to the range 0 to 100, since the raw scores varied from tens to millions depending on the numbers of tweets, followers, etc. Fig. 15 shows the user IDs with their original and normalized influence scores in descending order.

The influence scores were then used in the recommender system. The general recommender provided the top 10 most influential users for each sentiment and emotion. Fig. 16 and Fig. 17 show the general sentiment and emotion recommendations for #Oscars2018. Fig. 18 shows a personalized recommendation for one user; the file name shows the user ID for whom the recommendation was issued. Fig. 16 shows a generalized recommendation listing the users with the highest influence scores for each sentiment on #Oscars2018. The first ten user IDs belong to users in the network who expressed positive sentiment on #Oscars2018 in their tweets or replies, sorted in descending order of influence score. The next ten are influential users who expressed negative sentiment, and the ten after that expressed neutral sentiment. Similarly, Fig. 17 shows the general recommendation for each emotion on #Oscars2018. Fig. 18 shows an example of a personalized recommendation: for the user with ID '279785200', the ten top-ranked other users are listed in descending order of influence score. These ten users shared a similar emotion and sentiment on #Oscars2018 with user '279785200' in their tweets or replies, and were the most influential such users in the network.

Fig. 19 shows the retweet network for tweets on #Trump, #SchoolShooting, #Oscars2018 and #Syria. The small circles represent users. Larger circles represent clusters of users who retweeted a tweet of the user at the center of the cluster; the bigger the circle, the more retweets that tweet has, which is equivalent to the influence of the user. A few tweets were later retweeted by the author

Fig. 23. Tweet emotion network for #Oscars2018.

Table 7
Quantitative measurements for #Oscars2018. (The first #Neutral column counts neutral sentiment; the second counts neutral emotion.)

Type   #Texts  #Positive  #Negative  #Neutral  #Anger  #Disgust  #Fear  #Joy  #Sadness  #Surprise  #Neutral  #Agreement  #Disagreement  #Random
Tweet  193     112        26         55        5       11        1      95    3         19         59        193         0              0
Reply  482     317        52         113       14      17        2      251   4         68         126       328         49             105
Total  675     429        78         168       19      28        3      346   7         87         185       521         49             105
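The influence scores of Section 3.4 are weighted sums followed by a min–max rescaling to [0, 100]. A minimal sketch; the weights and the user's counts below are made-up examples, not the AHP-derived values reported in Tables 5 and 6.

```python
# Sketch of IS_U, IS_T and their min-max normalization.
# All weights (W values) and counts are fabricated examples.

def influence_scores(user, w):
    """Return (IS_U, IS_T) for one user.

    IS_U sums the weighted user-based parameters; IS_T adds the
    weighted tweet-based parameters, where the agreement,
    sentiment and emotion values are +1 or -1."""
    is_u = (w["T"] * user["tweets"] + w["F"] * user["followers"] +
            w["O"] * user["followees"] + w["L"] * user["likes"])
    is_t = is_u + (w["R"] * user["retweets"] + w["C"] * user["comments"] +
                   w["P"] * user["tweet_likes"] + w["A"] * user["agreement"] +
                   w["S"] * user["sentiment"] + w["E"] * user["emotion"])
    return is_u, is_t

def normalize(scores):
    """Min-max normalize raw influence scores into [0, 100]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) * 100 for s in scores]

weights = {"T": 0.2, "F": 0.5, "O": 0.1, "L": 0.2,   # user-based
           "R": 0.3, "C": 0.2, "P": 0.1, "A": 0.2,   # tweet-based
           "S": 0.1, "E": 0.1}
user = {"tweets": 40, "followers": 1500, "followees": 300, "likes": 90,
        "retweets": 12, "comments": 5, "tweet_likes": 30,
        "agreement": 1, "sentiment": 1, "emotion": -1}

is_u, is_t = influence_scores(user, weights)
print(normalize([10, 20, 30]))  # [0.0, 50.0, 100.0]
```

Applying `normalize` over the IS_T values of all users in a topic yields the 0–100 scores shown in Fig. 15.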

Fig. 24. Tweet-reply agreement network for #Oscars2018.
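The per-topic recommendation step of Section 3.5 can be sketched as grouping users by the emotion they expressed on a topic and ranking each group by influence score. This sketch groups by the predicted label directly rather than running the k-means step used in the paper, and the user IDs, emotions and scores are fabricated examples.

```python
# Sketch of the general recommendation step (Section 3.5):
# group users by expressed emotion, then return the top_k IDs
# per emotion by normalized influence score. Data is made up.
from collections import defaultdict

def recommend(users, top_k=10):
    """Return, for each emotion, the top_k user IDs ranked by
    influence score in descending order."""
    groups = defaultdict(list)
    for user_id, emotion, score in users:
        groups[emotion].append((score, user_id))
    return {
        emotion: [uid for _, uid in sorted(members, reverse=True)[:top_k]]
        for emotion, members in groups.items()
    }

sample = [
    ("u1", "anger", 87.5), ("u2", "joy", 91.0),
    ("u3", "anger", 95.2), ("u4", "joy", 40.3),
]
top = recommend(sample, top_k=2)
```

The same grouping applied to the three sentiment labels produces the sentiment lists of Fig. 16; restricting the candidate set to users sharing one target user's labels gives the personalized variant.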

of the tweet. This created a few loops in the network. Fig. 20 shows a sample retweet network and Fig. 21 shows a sample reply network without considering emotions or sentiments; here, a connection from one node to another means that the source node replied or retweeted to a tweet of the destination node. Table 7, Fig. 22, Fig. 23 and Fig. 24 show how most connected users expressed similar sentiment and emotion on a topic. Table 7 shows an example of the quantitative values for tweets and replies on #Oscars2018. We collected 675 tweets and replies on #Oscars2018 (193 original tweets and 482 replies). 328 replies agreed with the original tweets, whereas only 49 replies disagreed and 105 replies did not indicate agreement or disagreement. As for sentiment, 429 text blocks (112 original tweets, 317 replies) expressed positive sentiment, 78 text blocks (26 original tweets and 52 replies) expressed negative sentiment, and 168 text blocks (55 original tweets and 113 replies) did not express any sentiment. Similarly, for each emotion, the table shows the number of original tweets and replies that exhibited the emotion, and the total number of text blocks related to it. This information was used to generate the networks shown in Fig. 22, Fig. 23 and Fig. 24. Fig. 22 shows the sentiment of each user node according to the color assigned to each sentiment, whereas Fig. 23 shows the emotions of the user nodes. The networks show how connected users formed clusters among themselves, and in most clusters the majority of nodes share the same color (i.e., sentiment/emotion).

5. Conclusions and future work

The analysis of the different emotions and sentiments revealed some interesting human characteristics. All six emotions we analyzed can be characterized along multiple dimensions. For example, 'surprise' is an emotion that can express both positive and negative sentiment. Someone can be surprised by how stupid or disgusting some people are, which is the negative kind of surprise; on the other hand, someone can be surprised by how beautiful nature is, which is the positive kind. We saw this in our dataset several times. Sometimes a replier's emotion about a tweet was 'surprise' even though he/she disagreed with the tweet, because the replier was surprised by the stupidity of the tweet. For some tweets, a few comments expressed the opposite sentiment even though the repliers agreed with the tweet. For example, a tweet from a Syrian about their sufferings had most of its comments agreeing with the content; a few comments expressed love and support towards them, so those comments were classified as positive sentiment because of their positive words. This shows that even when a tweet and its comments display opposite sentiments, they can actually agree with each other.

Our goal was to analyze the Twitter network from a comparatively new perspective. We observed user behavior based on their text (tweets and replies) along with numerical scores such as the number of tweets, followers, etc., and combined those values with the sentiment and emotion users expressed on certain topics to measure their effect. Existing research focused on the text of the tweets, whereas in this work the text of both tweets and replies was considered, to study the communication between users in its human aspects. In most cases, we found that users who reply to tweets share a similar emotion or sentiment and agree with the tweet content, but there were cases where the reply opposed the emotion of the original tweet and expressed the commenter's disagreement very clearly. When we considered all these factors to find the influential users of a network, the result was more specific to a certain issue or topic. Our experiment provided a general recommendation showing the most influential users expressing a certain emotion or sentiment on an issue, and our personalized recommender provided customized suggestions of users from the network who share a similar sentiment or emotion on a topic with the user himself/herself.

Our proposed approach is a first step towards a personalized recommender system for social networks. The system is static, is limited to the range of data collected within a limited time period, and suffers from a lot of missing information about users. Moreover, our experiments and framework handled only simple text and extracted proper words from it; in reality, social network text includes many short-hands, abbreviations, emoticons and shouting, which sometimes contribute substantially to the emotion and sentiment of the text. A dynamic recommender system that provides suggestions on any topic on any social network, incorporating user emotion and sentiment from complete posts rather than only complete words picked from the posts, is a possible future work that could use this framework to recommend preferable topics to users.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] Oxford Dictionaries, Emotion. [Online]. Available: https://en.oxforddictionaries.com/definition/emotion.
[2] American Psychological Association, Glossary of psychological terms. [Online]. Available: http://www.apa.org/research/action/glossary.aspx?tab=5.
[3] M.D. Munezero, C.S. Montero, E. Sutinen, J. Pajunen, Are they different? Affect, feeling, emotion, sentiment, and opinion detection in text, IEEE Trans. Affect. Comput. 5 (2) (April 2014) 101–111.
[4] Twitter Inc., Twitter. [Online]. Available: https://twitter.com/.
[5] A.R. Manser, Sketch for a theory of the emotions, Anal. Philos. 4 (1) (January 1963) 27–28.
[6] C. Bell, Essays on the Anatomy of Expression in Painting. Essays on the Anatomy and Philosophy of Expression, 1824.
[7] C. Darwin, The Expression of the Emotions in Man and Animals, Oxford University Press, USA, 1998.
[8] J.E. LeDoux, Cognition and emotion, in: Handbook of Cognitive Neuroscience, Springer US, 1984, pp. 357–368.
[9] A. Yadollahi, A.G. Shahraki, O.R. Zaiane, Current state of text sentiment analysis from opinion to emotion mining, ACM Comput. Surv. (CSUR) 50 (2) (May 2017) 25:1–25:33.
[10] L. Canales, P. Martinez-Barco, Emotion detection from text: a survey, Processing in the 5th Information Systems Research Working Days (JISIC 2014) (2014) 37–43.
[11] C.R. Chopade, Text based emotion recognition: a survey, Int. J. Sci. Res. (IJSR) 4 (6) (June 2015) 409–414.
[12] V. Tripathi, A. Joshi, P. Bhattacharyya, Emotion analysis from text: a survey, 2016. http://www.cfilt.iitb.ac.in/resources/surveys/emotion-analysis-survey-2016-vaibhav.pdf.
[13] K. Sailunaz, M. Dhaliwal, J. Rokne, R. Alhajj, Emotion detection from text and speech: a survey, Soc. Netw. Anal. Mining 8 (1) (2018) 28.
[14] H. Binali, C. Wu, V. Potdar, Computational approaches for emotion detection in text, in: 2010 4th IEEE International Conference on Digital Ecosystems and Technologies (DEST), April, 2010, pp. 172–177.
[15] E.C.-C. Kao, C.-C. Liu, T.-H. Yang, C.-T. Hsieh, V.-W. Soo, Towards text-based emotion detection: a survey and possible improvements, in: International Conference on Information Management and Engineering, 2009. ICIME'09, IEEE, April 2009, pp. 70–74.
[16] S.N. Shivhare, S. Khethawat, Emotion detection from text, May 2012.
[17] V.K. Jain, S. Kumar, S.L. Fernandes, Extraction of emotions from multilingual text using intelligent text processing and computational linguistics, J. Comput. Sci. (February 2017).
[18] S.Y.M. Lee, Y. Chen, C.-R. Huang, A text-driven rule-based system for emotion cause detection, in: Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, Association for Computational Linguistics, June, 2010, pp. 45–53.
[19] B. Desmet, V. Hoste, Emotion detection in suicide notes, Expert Syst. Appl. 40 (16) (November 2013) 6351–6358.
[20] A. Joshi, V. Tripathi, R. Soni, P. Bhattacharyya, M.J. Carman, Emogram: an open-source time sequence-based emotion tracker and its innovative applications, in: AAAI Workshop: Knowledge Extraction from Text, March, 2016.
[21] S.M. Mohammad, S. Kiritchenko, Using hashtags to capture fine emotion categories from tweets, Comput. Intell. 31 (2) (May 2015) 301–326.
[22] M. Hasan, E. Rundensteiner, E. Agu, Emotex: detecting emotions in twitter messages, in: 2014 ASE BigData/SocialCom/CyberSecurity Conference, May, 2014.
[23] A. Agrawal, A. An, Unsupervised emotion detection from text using semantic and syntactic relations, in: Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology, vol. 1, IEEE Computer Society, December, 2012, pp. 346–353.
[24] M. Hajar, Using YouTube comments for text-based emotion recognition, Procedia Comput. Sci. 83 (December) (2016) 292–299.
[25] S.P. Tiwari, M. Vijaya Raju, G. Phonsa, D.K. Deepu, A novel approach for detecting emotion in text, Indian J. Sci. Technol. 9 (29) (August 2016).
[26] D. Ghazi, D. Inkpen, S. Szpakowicz, Prior and contextual emotion of words in sentential context, Comput. Speech Lang. 28 (1) (January 2014) 76–92.

[27] S. Grover, A. Verma, Design for emotion detection of Punjabi text using hybrid approach, in: International Conference on Inventive Computation Technologies (ICICT), vol. 2, IEEE, August, 2016, pp. 1–6.
[28] R.A. Calvo, S.M. Kim, Emotions in text: dimensional and categorical models, Comput. Intell. 29 (3) (August 2013) 527–543.
[29] P. Ekman, An argument for basic emotions, Cogn. Emot. 6 (3–4) (May 1992) 169–200.
[30] P. Shaver, J. Schwartz, D. Kirson, C. O'Connor, Emotion knowledge: further exploration of a prototype approach, J. Pers. Soc. Psychol. 52 (6) (June 1987) 1061–1086.
[31] K. Oatley, P.N. Johnson-Laird, Towards a cognitive theory of emotions, Cogn. Emot. 1 (1) (March 1987) 29–50.
[32] R. Plutchik, Emotion: A Psychoevolutionary Synthesis, Harper and Row, 1980.
[33] J.A. Russell, A circumplex model of affect, J. Pers. Soc. Psychol. 39 (6) (1980) 1161–1178.
[34] A. Ortony, G.L. Clore, A. Collins, The Cognitive Structure of Emotions, Cambridge University Press, 1988.
[35] H. Lovheim, A new three-dimensional model for emotions and monoamine neurotransmitters, Med. Hypotheses 78 (2) (February 2012) 341–348.
[36] Statista, Number of social media users worldwide from 2010 to 2021 (in billions). [Online]. Available: https://www.statista.com/statistics/14/number-of-worldwide-social-network-users/2784.
[37] H. Kwak, C. Lee, H. Park, S. Moon, What is twitter, a social network or a news media? in: Proceedings of the 19th International Conference on World Wide Web, ACM, April, 2010, pp. 591–600.
[38] F. Riquelme, P. Gonzalez-Cantergiani, Measuring user influence on twitter: a survey, Inform. Process. Manage. 52 (5) (September 2016) 949–975.
[39] E. Kafeza, A. Kanavos, C. Makris, P. Vikatos, T-pice: twitter personality based influential communities extraction system, in: 2014 IEEE International Congress on Big Data (BigData Congress), IEEE, June, 2014, pp. 212–219.
[40] C. Francalanci, A. Hussain, Influence-based twitter browsing with navigtweet, Inform. Syst. 64 (March 2017) 119–131.
[41] E. Lahuerta-Otero, R. Cordero-Gutierrez, Looking for the perfect tweet. The use of data mining techniques to find influencers on twitter, Comput. Hum. Behav. 64 (November 2016) 575–583.
[42] S. Joshi, D. Deshpande, Twitter sentiment analysis system, Int. J. Comput. Appl. 180 (47) (2018) 35–39.
[43] W. Wolny, Emotion analysis of twitter data that use emoticons and emoji ideograms, 25th International Conference on Information Systems Development (ISD2016 Poland) (2016).
[44] L. Coviello, Y. Sohn, A.D.I. Kramer, C. Marlow, M. Franceschetti, N.A. Christakis, J.H. Fowler, Detecting emotional contagion in massive social networks, PLoS One 9 (3) (March 2014) e90315.
[45] I. Roelens, P. Baecke, D.F. Benoit, Identifying influencers in a social network: the value of real referral data, Decis. Support Syst. 91 (November 2016) 25–36.
[46] Statista, Number of monthly active twitter users worldwide from 1st quarter 2010 to 4th quarter 2017 (in millions). [Online]. Available: https://www.statista.com/statistics/87/number-of-monthly-active-twitter-users/2820.
[47] CrowdFlower, The emotion in text data set. [Online]. Available: https://www.crowdflower.com/wp-content/uploads//07/text emotion.csv.
[48] AAAC Emotion Research, ISEAR Databank. [Online]. Available: http://emotion-research.net/toolbox/toolboxdatabase.2006-10-13.2581092615.
[49] CodaLab, Semeval. [Online]. Available: https://competitions.codalab.org/.
[50] JULIELab, Emobank. [Online]. Available: https://github.com/JULIELab/EmoBank.
[51] National Institute of Standards and Technology (NIST), Text REtrieval Conference (TREC), https://trec.nist.gov/. [Online]. Available: https://trec.nist.gov/data.html.
[52] U. Gupta, A. Chatterjee, R. Srikanth, P. Agrawal, A sentiment-and-semantics-based approach for emotion detection in textual conversations, in: Neu-IR: Workshop on Neural Information Retrieval, SIGIR
[58] C. Bigonha, T.N.C. Cardoso, M.M. Moro, V.A.F. Almeida, M.A. Gonçalves, Detecting evangelists and detractors on twitter, 18th Brazilian Symposium on Multimedia and the Web (2010) 107–114.
[59] D. Gayo-Avello, D.J. Brenes, D. Fernández-Fernández, M.E. Fernández-Menéndez, R. García-Suárez, De retibus socialibus et legibus momenti, EPL (Europhysics Letters) 94 (3) (April 2011) 38001.
[60] M.S. Srinivasan, S. Srinivasa, S. Thulasidasan, Exploring celebrity dynamics on twitter, in: Proceedings of the 5th IBM Collaborative Academia Research Exchange Workshop, ACM, October, 2017, p. 13.
[61] V. Latora, M. Marchiori, A measure of centrality based on network efficiency, New J. Phys. 9 (6) (June 2007).
[62] J.E. Hirsch, An index to quantify an individual's scientific research output that takes into account the effect of multiple coauthorship, Scientometrics 85 (3) (December 2010) 741–754.
[63] D.M. Romero, W. Galuba, S. Asur, B.A. Huberman, Influence and passivity in social media, in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, September, Springer, Berlin, Heidelberg, 2011, pp. 18–33.
[64] D. Tunkelang, A twitter analog to pagerank. [Online]. Available: http://thenoisychannel.com//01/13/a-twitter-analog-to-pagerank.
[65] T. Majer, M. Šimko, Leveraging microblogs for resource ranking, in: International Conference on Current Trends in Theory and Practice of Computer Science, January, Springer, Berlin, Heidelberg, 2012, pp. 518–529.
[66] A. Aleahmad, P. Karisani, M. Rahgozar, F. Oroumchian, Olfinder: finding opinion leaders in online social networks, J. Inform. Sci. 42 (5) (October 2016) 659–674.
[67] T.L. Saaty, How to make a decision: the analytic hierarchy process, Eur. J. Oper. Res. 48 (1) (September 1990) 9–26.

Kashfia Sailunaz is a PhD student in the Advanced Database Systems and Applications Lab under the supervision of Prof. Reda Alhajj of the Department of Computer Science at the University of Calgary, Calgary, Alberta, Canada. She obtained her Master in Science degree from the same lab at the University of Calgary. She received her Bachelor of Science degree from Military Institute of Science and Technology, Bangladesh University of Professionals, Dhaka, Bangladesh. She worked as a research assistant in Advanced Intelligent Multidisciplinary Systems (AIMS) Lab, Dhaka, Bangladesh for almost two years. Her current research interests focus on social network analysis, advanced database and data mining.

Reda Alhajj is currently a full professor at Medipol University, Istanbul, Turkey. He is also a tenured professor in the Department of Computer Science at the University of Calgary, Alberta, Canada. He published over 500 papers in refereed international journals, conferences and edited books. He served on the program committee of several international conferences. He is founding editor in chief of the Springer premier journal "Social Networks Analysis and Mining", founding editor-in-chief of Springer Series "Lecture Notes on Social Networks", founding editor-in-chief of Springer journal "Network Modeling Analysis in Health Informatics and Bioinformatics", founding co-editor-in-chief of Springer "Encyclopedia on Social Networks Analysis and Mining" (ranked top 3rd in most downloaded sources in computer science in 2018), founding steering chair of the flagship conference "IEEE/ACM International Conference on Advances in Social Network Analysis and Mining", and three accompanying symposiums FAB (for big data analysis), FOSINT-SI (for home-
2017, ACM, July 2017 arXiv: 1707.06996. land security and intelligence services) and HI-BI-BI (for health informatics and
[53] A. Sen, M. Sinha, S. Mannarswamy, S. Roy, Multi-task representation learning bioinformatics). He is member of the editorial board of the Journal of Information
for enhanced emotion categorization in short text, in: Pacific-Asia Conference Assurance and Security, Journal of Data Mining and Bioinformatics, Journal of Data
on Knowledge Discovery and Data Mining, Springer Cham, May 2017, pp. Mining, Modeling and Management; he has been guest editor of a number of special
324–336. issues and edited a number of conference proceedings. Dr. Alhajj’s primary work and
[54] A. Agrawal, A. An, Unsupervised emotion detection from text using semantic research interests focus on various aspects of data science and big data with empha-
and syntactic relations, in: Proceedings of the 2012 IEEE/WIC/ACM sis on areas like: (1) scalable techniques and structures for data management and
International Joint Conferences on Web Intelligence and Intelligent Agent mining, (2) social network analysis with applications in computational biology and
Technology, vol. 1, IEEE Computer Society, December, 2012, pp. 346–353. bioinformatics, homeland security, disaster management, etc., (3) sequence analy-
[55] R. Nagmoti, A. Teredesai, M. De Cock, Ranking approaches for microblog sis with emphasis on domains like financial, weather, traffic, energy, etc., (4) XML,
search, in: 2010 IEEE/WIC/ACM International Conference on Web Intelligence schema integration and re-engineering. He currently leads a large research group
and Intelligent Agent Technology (WI-IAT), vol. 1, August, 2010, pp. 153–157. of PhD and MSc candidates. He received best graduate supervision award and com-
[56] T. Noro, F. Ru, F. Xiao, T. Tokuda, Twitter user rank using keyword search, in: munity service award at the University of Calgary. He recently mentored a number
Information Modelling and Knowledge Bases XXIV. Frontiers in Artificial of successful teams, including SANO who ranked first in the Microsoft Imagine Cup
Intelligence and Applications, vol. 251, January 2013, pp. 31–48. Competition in Canada and received KFC Innovation Award in the World Finals held
[57] A. Pal, S. Counts, Identifying topical authorities in microblogs, in: Proceedings in Russia, TRAK who ranked in the top 15 teams in the open data analysis com-
of the Fourth ACM International Conference on Web Search and Data Mining, petition in Canada, Go2There who ranked first in the Imagine Camp competition
February, 2011, pp. 45–54. organized by Microsoft Canada, Funiverse who ranked first in Microsoft Imagine
Cup Competition in Canada.
