
Inferring Affective Meanings of Words from Word Embedding


Minglei Li, Qin Lu, Yunfei Long, and Lin Gui

Abstract—The affective lexicon is one of the most important resources in affective computing for text. Manually constructed affective lexicons have limited scale and thus only limited use in practical systems. In this work, we propose a regression-based method to automatically infer the multi-dimensional affective representation of words via their word embedding, based on a set of seed words. This method can make use of the rich semantic meanings obtained from word embedding to extract meanings in a specific semantic space. It is based on the assumption that different features in word embedding contribute differently to a particular affective dimension, and that a particular feature in word embedding contributes differently to different affective dimensions. Evaluation on various affective lexicons shows that our method outperforms the state-of-the-art methods on all the lexicons under different evaluation metrics by large margins. We also explore different regression models and conclude that the Ridge regression model, the Bayesian Ridge regression model, and Support Vector Regression with a linear kernel are the most suitable models. Compared to other state-of-the-art methods, our method also has a computational advantage. Experiments on a sentiment analysis task show that the lexicons extended by our method achieve better results than publicly available sentiment lexicons on eight sentiment corpora. The extended lexicons are publicly available for access.

Index Terms—Affective lexicon, sentiment, emotion, word embedding, regression

1 INTRODUCTION

M. Li, Q. Lu, and Y. Long are with the Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong. E-mail: {csmli, csluqin, csylong}@comp.polyu.edu.hk. L. Gui is with the College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian, China. E-mail: guilin.nlp@gmail.com. Manuscript received 19 Jan. 2017; revised 20 June 2017; accepted 28 June 2017. Date of publication 3 July 2017; date of current version 5 Dec. 2017. (Corresponding author: Minglei Li.) Recommended for acceptance by E. Cambria, A. Hussain, and A. Vinciarelli. For information on obtaining reprints of this article, please send e-mail to: reprints@ieee.org, and reference the Digital Object Identifier below. Digital Object Identifier no. 10.1109/TAFFC.2017.2723012

As the Internet and social media become ever more popular, web text is becoming one of the most important channels for people to express their opinions and mental states and to communicate with each other. Affective meaning refers to emotion, sentiment, personality, mood and attitude expressed through text [1]. In this work, we use the term affective to refer specifically to emotion and sentiment. Affective computing from text has many potential applications, such as the analysis of consumer opinions on a company's products [2], automatic recommendation systems for movies, books, music or pictures based on the current user's emotions [3], detection of people with potential suicide risks based on social media [4], stock market prediction based on public opinions [5], product aspect extraction [6], sarcasm detection [7], personality detection [8], and intelligent human-computer interaction systems that can express and detect the affective states of human beings [9], etc.

The most important resource for affective computing is a comprehensive affective lexicon, in which words are annotated with affective meanings. The affective meaning of a word can be represented using different methods. Earlier works represent affective meanings of words by discrete affective labels, such as positive, negative, happiness, sadness, anger [10], [11], [12], etc. Another method is to represent affective meaning by more comprehensive multi-dimensional representation models, such as the valence-arousal-dominance (VAD) model [13] and the evaluation-potency-activity (EPA) model [14]. Theoretically speaking, discrete affective labels can always be mapped to certain points in a multi-dimensional affective space [15]. Sentiment indicated by polarities can be viewed as a one-dimensional affective model; for example, it is equal to the valence dimension in VAD or the evaluation dimension in EPA.

Compared to discrete emotion labels or one-dimensional sentiment, a multi-dimensional affective representation is more comprehensive because it can capture more fine-grained information. According to the Affective Control Theory (ACT), each concept in an event has a transient affective meaning which is context dependent in addition to cultural, behavioral and other background information [16]. Multi-dimensional models allow for more interaction between a sequence of words so that more context information can be included in affective computing of text. For example, the same noun champion may have different affective states in two different events: The little boy defeated the champion and The champion defeated the little boy. The difference in the affective states cannot be inferred through a single sentiment dimension, but it can be distinguished through multi-dimensional EPA affective lexicons based on the ACT [16]. However, multi-dimensional affective lexicons as NLP resources are limited because most available ones are based on manual annotation, such as the ANEW lexicon of VAD based on manual
annotation [17], the extended ANEW lexicon based on crowdsourcing [18], the Chinese valence-arousal lexicon based on manual annotation [19], and the EPA lexicon based on manual annotation [14]. Obviously, manual annotation is not scalable, and this limits the use of multi-dimensional models in real applications. Only if automatic methods can be used to learn the representations of affective meanings of words can the more comprehensive multi-dimensional models have wider practical use. A word embedding based graph propagation method has been used as an automatic method to predict valence-arousal ratings from seed words [20]. However, word embedding is normally trained to obtain the general meaning of words, which can include denotative meaning, connotative meaning, social meaning, affective meaning, reflected meaning, collocative meaning and thematic meaning [21]. In other words, directly computing word similarity captures the general meanings of words rather than the affective meanings specifically. Words that have similar denotative meanings may be associated with different affective meanings. For example, "father" and "dad" have the same denotative meaning, yet they are associated with different affective meanings; "father" is more formal and detached whereas "dad" is more personal and dear affectively. Another type of automatic method uses regression models to extend affective lexicons. Specific methods include (1) a linear regression model based on manually defined features from a knowledge base [22], which is limited by the manually prepared features; and (2) linear regression weighted on the semantic similarity between a target word and the seed words [23], which is limited by the accuracy of the semantic similarity.

In this work, we propose a regression method to infer various affective meanings from word embedding based on the assumption that different features in word embedding may contribute differently to a particular affective dimension and one feature in word embedding may also contribute differently to different affective dimensions. The method treats word embedding as word features and learns meaning-specific weights for each feature when mapping embedding to different affective dimensions. Consequently, the method learns one regression model for each affective dimension based on the seed words to predict the affective meaning of a new word provided that its word embedding is available. We perform extensive experiments on inferring different affective meanings, including sentiment, valence, arousal, dominance, evaluation, potency, activity, imagery, and also other meanings including the perceptual senses and concreteness of words. Evaluations show that:

1) Our method achieves state-of-the-art performance, outperforming all the baseline methods on several affective lexicons in affective space and on lexicons in other semantic spaces.
2) Our method is rating scale insensitive, which means that it does not require the rating range to be bipolar and there is no need to transform unipolar ratings to bipolar ratings.
3) Our method is computationally more efficient than the baseline methods, especially compared to propagation based methods.
4) Several affective lexicons with about a million words are built, and one experiment using the built sentiment lexicon shows that lexicons based on word embedding perform better than previously available sentiment lexicons.

The rest of the paper is organized as follows: Section 2 describes related works, including affective models, lexicon generation methods, and word embedding models. Section 3 introduces our proposed method for inferring affective meanings. Section 4 performs extensive experiments on various affective lexicons to validate the effectiveness of our proposed method. Section 5 concludes this paper.

2 RELATED WORKS

2.1 Affective Model

Affective meaning includes emotion, sentiment, trait, mood, and attitudes, etc. Current research in affective computing mainly studies sentiment and emotion. Sentiment is measured by positive or negative polarities. Emotion can be considered as fine-grained sentiment. Affective meaning can be represented either by discrete categories or by a set of values on the continuous scales of some multi-dimensional model. For the former representation, different category sets have been proposed. Table 1 lists several proposed emotion categorizations.

TABLE 1
List of Popular Discrete Emotion Categorizations

Author         Num  Basic Emotions
Ekman [24]      6   anger, disgust, fear, joy, sadness, surprise
Parrot [25]     6   anger, fear, joy, love, sadness, surprise
Frijda [26]     6   desire, happiness, interest, sorrow, surprise, wonder
Plutchik [26]   8   acceptance, anger, anticipation, disgust, fear, joy, sadness, surprise
Tomkins [27]    9   anger, contempt, disgust, distress, fear, interest, joy, shame, surprise
Ortony [28]    22   fear, joy, distress, happy-for, gloating, hope, pity, pride, relief, resentment, satisfaction, etc.
Xu [29]         7   anger, disgust, fear, joy, like, sadness, surprise

Fig. 1. Two dimensional valence-arousal (VA) affective model.

There are several multi-dimensional models, including the valence-arousal (VA) model [13], as shown in Fig. 1; the evaluation-potency-activity model [30], as shown in Fig. 2; the hourglass model of emotion [31], which represents the affective state in four independent dimensions: pleasantness, attention, sensitivity and aptitude; the Pleasure-Arousal-Dominance
(PAD) model [32]; the two continuous dimensions of evaluation and activation [33]; the four dimensions of evaluation-pleasantness, potency-control, activation-arousal, and unpredictability [34]; and the three dimensions of serotonin, dopamine and noradrenaline based on neuroscience [35]. Compared to the discrete affective models, the dimensional models can capture more information and are more suitable for computation because the interaction information between different dimensions can be captured.

Fig. 2. Three dimensional evaluation-potency-activity (EPA) affective model.

2.2 Affective Lexicon Generation

Based on the affective models, affective lexicons are built using either a discrete affective model or a dimensional model. In this paper, we will only focus on dimension based lexicons. Since sentiment can be described by a one dimensional model, we also include methods for obtaining sentiment lexicons. Theoretically speaking, methods to obtain a sentiment lexicon can be extended to obtain other affective dimensions.

Affective lexicons can be obtained either by manual annotation or by automatic methods. Manual annotation can obtain high-quality lexicons. Manually annotated sentiment lexicons include the General Inquirer (GI) [10], MPQA [36], the twitter sentiment lexicon based on crowdsourcing [37], [38], and VADER based on crowdsourcing [39], etc. Manually annotated multi-dimensional affective lexicons include ANEW, CVAW, DAL, EPA and ANGST, among others. The ANEW lexicon, based on the VAD model [17], contains 1,034 English words. The extended ANEW lexicon contains about 13,965 English words annotated through crowdsourcing. The CVAW lexicon, based on the VAD model [19], contains 1,653 traditional Chinese words annotated in the valence and arousal dimensions. The Dictionary of Affect in Language (DAL) lexicon, annotated in the dimensions of pleasantness-activation-imagery, contains 8,742 terms [33]. The EPA lexicon, annotated in the evaluation-potency-activity dimensions [16], contains about 4,505 English terms. The ANGST lexicon, annotated in the valence-arousal-dominance-imageability-potency dimensions, contains 1,003 German words [40].

Automatic methods to obtain affective lexicons have focused mainly on the sentiment dimension because current research works are mostly on sentiment analysis [41], [42], [43]. In terms of methodology, there are mainly three approaches. The first approach uses statistical information between a target word and the seed words. For example, sentiment polarity intensities are calculated based on point-wise mutual information (PMI) between a target word and the positive seeds and negative seeds, respectively [37], [44]. Similarly, PMI is used to build a discrete emotion lexicon based on naturally annotated hashtags in twitter [45]. The second approach is based on the label propagation method, which first builds a word graph; label propagation is then performed to infer the affective values of unseen words from the seed words. For example, a graph can be built based on the semantic relationships in WordNet, and label propagation is performed to infer EPA values [46] and sentiment polarity [47]. A knowledge based graph is confined by the coverage of the knowledge base. A word graph can also be built from a text corpus based on the cosine similarity of words represented by their context words, with graph propagation then performed to infer the sentiment polarity of unseen words [48]. Word embedding is also used to compute the cosine similarity between words to build the word graph, and the PageRank algorithm is employed to infer the valence-arousal ratings of unseen words [20]. Similarly, a word graph constructed using the cosine similarity of word embedding is used to infer sentiment polarities [43]. The third approach represents a word as a vector and then maps this vector to some sentiment value or category based on a regression model or a classifier. This approach mainly includes (1) representing words by manually defined features based on some knowledge base and performing linear regression on the features [22]; (2) representing words as word embeddings obtained automatically and using a classifier [49] or linear regression [50] to obtain sentiment labels or scores; and (3) mapping word embedding into sentiment space through a transformation matrix that minimizes intra-group distance in each sentiment category and maximizes inter-group distance, without considering the actual values of the seed words [51].

2.3 Word Representation

In a conventional word representation, a word is first converted to a symbolic ID. Its feature set is then transformed into a vector using a one-hot representation. One-hot encoding is a high dimensional vector representation with only one dimension as 1 and all the other dimensions as 0 for one word, and the dimension size is the size of the vocabulary. This kind of representation cannot capture the semantic relations between different words. Another method is to represent a word using a low dimensional dense vector, also called word embedding, which can encode the semantic meaning of words so that comparisons can be made easily. For example, using word embedding, we can make the approximation vec(king) - vec(queen) ≈ vec(man) - vec(woman) [52].

Various approaches have been proposed to learn dense word vectors, which can be divided mainly into count based approaches and prediction based approaches [53], both of which are based on the distributional hypothesis that words occurring in similar contexts tend to have similar meanings [54]. A count based method constructs a word-context co-occurrence statistic matrix and then performs matrix factorization to obtain the final word embedding. Features used include point-wise mutual information, positive point-wise mutual information (PPMI), the log of co-occurrences, etc. Based on matrix factorization, various algorithms have been proposed, such as decomposition of the matrix into two low dimensional matrices [55], Singular Value Decomposition (SVD) [56], probabilistic matrix and tensor factorization [57], and low rank approximation [58]. The prediction based method directly predicts the context given the target word by maximizing the conditional probability of the context words given the target, or vice versa [52]. In more comprehensive studies, more kinds of contexts and knowledge bases are explored to improve word embedding, including the use of cross-lingual context [59], word definition context, knowledge base context [60], morphology context [61], and word embedding from multi-views or multi-resources [62], [63].

Fig. 3. The proposed regression method for affective representation learning based on word embedding.
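To make the prediction based family concrete, the per-pair objective of the SGNS model (formalized in Section 3.1) can be sketched in a few lines of plain Python. The vectors and the single negative sample below are toy values for illustration only, not the output of any trained model.

```python
import math

def sigmoid(x):
    # Logistic function sigma(x) = 1 / (1 + exp(-x)).
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sgns_pair_objective(w_vec, c_vec, neg_vecs):
    # Per-pair SGNS objective: log sigma(w.c) + sum_N log sigma(-w.c_N).
    # The sum over the k sampled negatives approximates
    # k * E_{c_N ~ P_D}[log sigma(-w.c_N)].
    pos = math.log(sigmoid(dot(w_vec, c_vec)))
    neg = sum(math.log(sigmoid(-dot(w_vec, n))) for n in neg_vecs)
    return pos + neg

# A matching word-context pair scores higher than a mismatched one.
w = [1.0, 0.0]
matching = sgns_pair_objective(w, [1.0, 0.0], [[0.0, 1.0]])
mismatch = sgns_pair_objective(w, [-1.0, 0.0], [[0.0, 1.0]])
```

Training would then adjust the word and context vectors to increase this score for observed pairs, which is what yields the dense embeddings used throughout this paper.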
3 PROPOSED METHOD
In this work, we want to make good use of the semantic meanings encoded in word embedding to predict the affective meanings of words. This will help to build valuable lexical resources for affective computing using more comprehensive affective models. The basic idea of our proposed approach is to use regression models to learn the affective meaning in each affective dimension. For a multi-dimensional affective model having m dimensions, the objective is to learn m regression models suited to the m affective dimensions. Our method is based on the assumption that word embedding has encoded the general semantic meaning into the dense vector and that a certain dimension in word embedding contributes differently to different affective meanings. We consider our approach a general learning method using word embedding and regression over a set of seed words, which will be referred to as the Regression on Word Embedding approach, labeled RoWE.

3.1 Distributed Word Embedding

The first step in our approach is to build a high-quality feature representation for words using a vector space model (VSM), which represents a word through a low dimensional vector, also called word embedding or word vector [64]. As introduced in Section 2.3, there are mainly count based and prediction based methods for obtaining word embedding. According to a comprehensive study done by [56], both methods can obtain similar information. In other words, they are basically equivalent, although fine tuning may be needed. However, the prediction based method has a lower computation cost because it does not need to perform matrix factorization over a large co-occurrence matrix. Thus, in this work, we only use the prediction based method to obtain word embedding.

The prediction based method is based on neural networks, and one of the most widely used models is Skip-Gram with Negative Sampling (SGNS) [52]. Given a corpus with vocabulary V and the extracted word-context pair set D, let p(D = 1 | w, c) be the probability that (w, c) comes from D and let p(D = 0 | w, c) be the probability that (w, c) does not. The basic assumption of SGNS is that the conditional probability p(D = 1 | w, c) should be high if c is the context of word w in a window and low otherwise. Let \vec{w} denote the vector representation of w, and \vec{c} denote the vector of c. Then p(D = 1 | w, c) is computed as

p(D = 1 \mid w, c) = \sigma(\vec{w} \cdot \vec{c}) = \frac{1}{1 + e^{-\vec{w} \cdot \vec{c}}},    (1)

where \vec{w} and \vec{c} are the word embedding and context embedding in our model, respectively. Both \vec{w} and \vec{c} are the model parameters to be learned. The basic idea behind this is that if word w and context c co-occur, their corresponding vectors should be closely correlated, modeled by \vec{w} \cdot \vec{c}. The objective of negative sampling is to minimize the conditional probability

p(D = 1 \mid w, c_N) = \sigma(\vec{w} \cdot \vec{c}_N),    (2)

where c_N denotes a negative context of w, namely a context that does not co-occur with word w. The method randomly samples negative contexts c_N of w from V_W. Let P_D be the empirical unigram distribution, where

P_D(c) = \frac{\#(c)}{|D|}.    (3)

Combining Formulas (1) and (2), the objective for each word-context pair can be translated into maximizing

\log\sigma(\vec{w} \cdot \vec{c}) + k \cdot \mathbb{E}_{c_N \sim P_D}[\log\sigma(-\vec{w} \cdot \vec{c}_N)],

where k is the number of negative samples. For a given training corpus with a set of words V_W, the final objective function for the whole corpus is

J = \sum_{w \in V_W} \sum_{c \in V_W} \#(w, c)\big(\log\sigma(\vec{w} \cdot \vec{c}) + k \cdot \mathbb{E}_{c_N \sim P_D}[\log\sigma(-\vec{w} \cdot \vec{c}_N)]\big).    (4)

The obtained \vec{w} and \vec{c} are the word embedding and context embedding, respectively. The performance of the embedding relies heavily on the hyper-parameters, as shown in [56]. Because finding the optimal word embedding is not our focus, we simply use the recommended settings from [56] for the SGNS model. Note that any kind of learning model for word embedding can be used in our framework, including matrix factorization based word embedding [55], ensemble based word embedding [65], etc.

3.2 Regression Method for Affective Meanings Prediction

Fig. 3 shows a general learning method of using linear regressions from word embedding to obtain affective meanings of words. In the training phase, each seed word s, as a training sample, has a known word embedding \vec{s}, which is a vector of size n, and its affective meaning defined in an m dimensional space. A word embedding and annotated affective meanings pair constitutes one training sample. Given sufficient such pairs, we can learn a regression model for every affective dimension A_j, where j is in the range [1 ... m]. Based on the regression model, we can then predict the affective value of a new word based on its word embedding.
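As an illustrative sketch of this training and prediction procedure, the ridge variant of the objective (R(a) = ||a||_2^2 with the identity mapping g, as defined below in Eq. (6)) has a closed-form solution. The paper itself uses scikit-learn's Ridge with default parameters (Section 4.1); the numpy version here is a minimal equivalent in which all embeddings and ratings are synthetic, and the bias term is omitted for brevity.

```python
import numpy as np

def fit_rowe(X, Y, alpha=1.0):
    """Closed-form ridge fit: minimize ||X A - Y||_2^2 + alpha * ||A||_2^2.

    X: (num_seeds, n) seed-word embeddings; Y: (num_seeds, m) seed ratings,
    one column per affective dimension (e.g. valence, arousal).
    Returns the (n, m) weight matrix A, i.e. one regression model per dimension.
    """
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n), X.T @ Y)

def predict_rowe(A, x):
    # Affective values of a new word from its embedding x (identity mapping g).
    return x @ A

# Toy illustration: made-up 4-dimensional "embeddings" and two affective
# dimensions; real seeds would come from a lexicon such as E-ANEW.
rng = np.random.default_rng(0)
X_seeds = rng.normal(size=(50, 4))
true_A = np.array([[1.0, 0.0], [0.5, -1.0], [0.0, 2.0], [-1.0, 0.5]])
Y_seeds = X_seeds @ true_A + 0.01 * rng.normal(size=(50, 2))
A_hat = fit_rowe(X_seeds, Y_seeds, alpha=0.1)
```

Once A_hat is fitted on the seed words, every word with an embedding receives predicted ratings, which is what allows the lexicon to grow to the size of the embedding vocabulary.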
Consequently, we can extend an existing affective lexicon automatically.

Given a seed s and its word embedding \vec{w}_s = [e_1^s, e_2^s, \ldots, e_n^s], we need to learn the mapping function f_j for the jth affective dimension

f_j(\vec{w}_s) = g_j(a_{j1} e_1^s + a_{j2} e_2^s + \cdots + a_{jn} e_n^s),    (5)

where a_{ji} is the weight of feature i and g_j is the mapping function. When f_j is a scalar value, g_j can be the identity function, and this model becomes a typical linear regression model. When f_j takes categorical labels, g_j can be a logistic function, and this model becomes a typical logistic regression model. f_j can be any kind of affective meaning, such as valence, arousal, or dominance in the VAD model; evaluation, potency, or activity in the EPA model; or a simple positive/negative label.

Let V denote the set of seeds. The objective function for the regression learning of each affective dimension j is then defined as follows:

\min_{\vec{a}} \sum_{s \in V} \|f_j(\vec{w}_s) - y_{sj}\|_2^2 + \alpha R(\vec{a}_j),    (6)

where R(\vec{a}_j) is the regularization term on the weight vector \vec{a}_j = [a_{j1}, a_{j2}, \ldots, a_{jn}] and \alpha is the regularization weight. When \alpha = 0, the model degrades to ordinary least squares linear regression. When \alpha \neq 0 and R(\vec{a}_j) = \|\vec{a}_j\|_2^2, the model becomes the Ridge regression model. When \alpha \neq 0 and R(\vec{a}_j) = \|\vec{a}_j\|_1, the model becomes the Lasso regression model. Different regression models are evaluated in the experiments.

This model can be trained on existing affective lexicons. Once the model is learned, given the embedding of a new word, its corresponding affective meanings in m dimensions can be predicted using the m regression models, respectively. The size of the constructed lexicon depends on the size of the available word embedding, which is in principle unlimited because of the large amount of available text corpora.

4 EXPERIMENTS AND ANALYSIS

In this section, we first perform a set of experiments to evaluate our method in inferring affective meanings under different affective models, including the sentiment, the valence-arousal-dominance, the evaluation-potency-activity, and the evaluation-activation-imagery (EAI) models. To further prove the generality of our proposed method, we also evaluate it in inferring other word meanings, including concreteness-abstractness and the perceptual strength in the five senses of hearing, seeing, touching, tasting and smelling. The second set of experiments evaluates the complexity of different methods. The third set evaluates the effects of the seed words on different methods. The fourth set evaluates the effects of the embedding dimension size. The fifth set looks at the performance of different regression models and also examines different embedding resources in terms of their predictability on an existing lexicon. The last set evaluates the performance of the sentiment lexicons obtained by our method on a downstream sentiment analysis task.

TABLE 2
Summary of Lexicons Used in the Experiments

Lexicon       Num     Overlap Num  std  Affective Meaning  Range
GI            3,626   2,942        N    Sentiment          {-1, 0, 1}
SemEval2015   1,515   751          N    Sentiment          [-1, 1]
VADER         7,502   3,124        Y    Sentiment          [-4, 4]
ANEW          1,034   958          Y    VAD                [1, 9]
E-ANEW        13,915  11,364       Y    VAD                [1, 9]
CVAW          1,647   1,309        Y    VA                 [1, 9]
EPA           4,505   2,901        Y    EPA                [-4, 4]
DAL           8,743   8,003        N    EAI                [1, 3]
Perceptual    1,001   826          Y    Five senses        [0, 5]
Concreteness  39,954  18,111       Y    Concreteness       [1, 5]

4.1 Inferring Affective Meanings

The first set of experiments is set up to explore the effectiveness of our proposed RoWE. The compared methods are listed below.

1) PMI [44]: This method learns the intensity value of a word through the pointwise mutual information with the seed words.
2) qwn-ppv [47]: This method automatically generates a set of positive and negative seed words over WordNet [66]. A word graph is then constructed from WordNet based on the relations in WordNet, and the PageRank algorithm is used to obtain the sentiment intensity of unseen words. Here we directly use the provided lexicons for comparison because the method is not affected by the corpus, as the lexicon is produced from WordNet.
3) Web GP [48]: This web-based graph propagation method constructs a weighted graph using the cosine similarity of words, each represented by a vector of co-occurrences with its context words. This method only keeps the 25 highest weighted edges for each node to reduce the effect of noise in the web data. The iteration number is set to 5.
4) Wt-Graph [20]: This method uses the cosine similarity of word embedding as the edge weights to construct a weighted word graph and then uses the PageRank algorithm to obtain the affective meanings (valence and arousal).
5) DENSIFIER [51]: This method learns an orthogonal transformation from the original embedding space to obtain task specific information in an ultradense space, such as the one dimensional sentiment polarity space.
6) SENTPROP [43]: Similar to Wt-Graph [20], this method also employs the cosine similarity of word embedding as the edge weights to construct a word graph, and uses random walk to obtain the affective meaning (sentiment polarity in their work).

All the above methods need to use some seed words to infer the affective meanings of unseen words. To compare fairly, all the methods in the evaluation use the same set of seed words, the same corpus, and the same test settings.

The gold answers used for this set of experiments come from a list of affective lexicons which are chosen because they are manually annotated and are thus considered to have high quality. A summary of the lexicons used as gold answers is given in Table 2.
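Baselines (3), (4) and (6) share a common propagation pattern: build a cosine-similarity word graph from vectors and spread seed scores over it. The sketch below is a simplified schematic of that shared pattern, not a faithful reimplementation of any cited system; the restart weight and iteration count are arbitrary illustrative choices.

```python
import numpy as np

def propagate_scores(emb, seed_scores, iters=5, restart=0.9):
    """Schematic seed-score propagation over a cosine-similarity word graph.

    emb: (num_words, dim) word vectors; seed_scores: {word_index: rating}.
    Each iteration mixes a word's score with its neighbours'; the restart
    term keeps seed words pinned near their annotated ratings.
    """
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)
    sim = np.clip(sim, 0.0, None)             # keep only positive-similarity edges
    W = sim / sim.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    s0 = np.zeros(len(emb))
    for i, v in seed_scores.items():
        s0[i] = v
    s = s0.copy()
    for _ in range(iters):
        s = restart * s0 + (1.0 - restart) * (W @ s)
    return s

# Two tiny clusters: words 0/1 near the positive seed, words 2/3 near the
# negative seed; the unlabeled words inherit their neighbours' polarity.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
scores = propagate_scores(emb, {0: 1.0, 2: -1.0})
```

In contrast, RoWE needs no graph at all: it fits one regression per dimension once and then scores any word in a single matrix-vector product, which is the source of its computational advantage reported later.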
The table lists the lexicon names (Lexicon), their sizes (Num), the number of words in the lexicon which also appear in the word embeddings (Overlap Num), whether the standard deviation of annotation is supplied or not (std), the affective model (Affective Meaning), and the annotation range (Range). GI [10] is a sentiment lexicon annotated with positive, neutral and negative. During prediction, we use class-mass normalization to give discrete labels, as done in [43]. VADER [39] and SemEval2015 [38] are sentiment lexicons annotated with intensity, and VADER also contains the standard deviation of the annotation process. ANEW [17] and E-ANEW [18] are manually annotated in the three dimensions of valence, arousal and dominance with values from 1 to 9, and E-ANEW is an extended version of ANEW through crowdsourcing. CVAW [19] is the Chinese version of ANEW but annotated only on the two dimensions of valence and arousal. EPA [14] is annotated in the two dimensions of evaluation and potency. DAL [33] (Dictionary of Affect in Language) is annotated in the three dimensions of evaluation, activation and imagery (EAI). Perceptual [67], [68] is annotated with the perceptual strength of a target word felt through five sensations. During annotation, each word is annotated through the question "To what extent do you experience something being WORD" (with "WORD" being the target word to be annotated). Underneath this question are five separate rating scales for each perceptual modality, labeled "by feeling through touch", "by hearing", "by seeing", "by smelling", and "by tasting". The participants were asked to rate the extent to which they would experience the five senses, from 0 (not at all) to 5 (greatly) [67], [68]. Concreteness [69] is annotated on the degree of concreteness or abstractness of a word through crowdsourcing. Among those lexicons, only CVAW is Chinese; the others are English. We include the Perceptual and Concreteness lexicons, which are actually not affective lexicons, to test the generalization ability of our method on inferring other word meanings.

Experiment Settings. For English lexicons, we train the 300 dimensional word embedding based on the Wikipedia August 2016 dump with 3.1 billion tokens. For Chinese, we train the 300 dimensional word embedding based on the Baidu Baike corpus with 1.8 billion tokens, after performing word segmentation using the HIT LTP tool. Both embeddings … mapped back to the annotation range. For the regression model in RoWE, Ridge regression is used in the scikit-learn tool with default parameters for the following experiments.

Evaluation Metrics. For the GI lexicon, which poses a ternary classification task, we use the area under curve (AUC) and macro F-score as the evaluation metrics, using the method in [43] to transform the predicted scalar values to sentiment labels. For the other lexicons, which pose continuous value prediction tasks, we use the following evaluation metrics:

1) Root mean squared error: \mathrm{RMSE} = \sqrt{\sum_{i=1}^{n} (A_i - P_i)^2 / n},
2) Mean absolute error: \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |A_i - P_i|,
3) Mean absolute percentage error: \mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n} \left|\frac{A_i - P_i}{A_i}\right| \times 100\%, and
4) Kendall rank correlation coefficient: \tau = \frac{C - D}{C + D},

where A_i is the gold standard value, P_i is the predicted value, n is the total number of test samples, \bar{A} and \bar{P} are the average values of A and P, C is the number of concordant pairs, and D is the number of discordant pairs. The lower the values of RMSE, MAPE and MAE, and the higher the value of \tau, the better the performance. Note that the MAPE evaluation metric suffers from the zero-division problem: we do not report the MAPE result if the gold values contain 0. So, for lexicons whose values contain zero (SemEval2015, EPA, DAL, Perceptual), we do not use the MAPE metric because MAPE is sensitive to zero. In addition, for the lexicons with a provided standard deviation on annotation, we also use a new evaluation metric defined as follows:

\mathrm{ac1\sigma} = \frac{1}{n}\sum_{i=1}^{n} g(\sigma_i - |A_i - P_i|),    (7)

where

g(x) = \begin{cases} 1, & x > 0, \\ 0, & \text{otherwise}, \end{cases}

and \sigma_i is the annotated standard deviation. ac1\sigma indicates the percentage of correctly predicted samples within 1 standard deviation of the gold answers.
The lexicons can be divided into three types: the sentiment
are trained using the SGNS model introduced in Section 3.1.
lexicons including GI, SemEval2015 and VADER, the multi-
The respective overlap sets between the embeddings and
dimensional affective lexicons including ANEW, E-ANEW,
the lexicons are randomly split equally to form the training
CVAW, EPA and DAL, and other word meanings rather
sets and the testing sets. For each experiment, we run five
than affective meaning including Concreteness and Percep-
times and report the average result with standard deviation
tual. The results are shown in the three sub-tables of Table 3.
in the parenthesis. In addition, we use the relative standard
Table 3 a is for the sentiment lexicons, Table 3 b is for the
deviation as a metric of the robustness of the methods. To
multi-dimensional affective lexicons, and Table 3 c is for the
satisfy the requirement of bipolar scale of some baselines
concreteness and perceptual lexicons. The first dimension
(PMI, Web-GP, DENSIFIER, SENTPROP), we transform the affec-
(valence or evaluation) of the multi-dimensional lexicons is
tive scales to bipolar scale if needed. For example, ANEW,
the same as sentiment. So we include qwn-ppv for compari-
E-ANEW, and CVAW are mapped from ½1; 9 to ½4; þ4 lin-
son on this dimension too. There is no result for qwn-ppv on
early, DAL is mapped from ½1; 3 to ½1; þ1, Perceptual is
CVAW because they are in different languages. To make the
mapped from ½0; 5 to ½2:5; 2:5 and Concreteness is
tables more readable, we only show the standard deviations
mapped from ½1; 5 to ½2; 2. The final predicted values are
of the five runs for the sentiment lexicons.

1. https://dumps.wikimedia.org/enwiki/latest/ Accessed May 17, 4. Though we can directly predict discrete labels using logistic
2017 regression on word embedding, the baseline methods can only produce
2. http://www.nlpcn.org/resource/list/2 Accessed May 17, 2017 scalar value. To be consistent with the baselines, we also predict the
3. http://www.ltp-cloud.com/ Accessed May 17, 2017 scalar value using a linear regression model.
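The metrics above are straightforward to compute directly. The following is a minimal Python sketch of RMSE, MAE, MAPE, the Kendall rank correlation coefficient, and the ac1σ metric of Eq. (7), exactly as defined above (the function names are ours, for illustration only):

```python
import math

def rmse(gold, pred):
    # Root mean squared error: sqrt((1/n) * sum((A_i - P_i)^2))
    n = len(gold)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(gold, pred)) / n)

def mae(gold, pred):
    # Mean absolute error: (1/n) * sum(|A_i - P_i|)
    return sum(abs(a - p) for a, p in zip(gold, pred)) / len(gold)

def mape(gold, pred):
    # Mean absolute percentage error; undefined when a gold value is 0,
    # which is why lexicons containing zeros are evaluated without MAPE.
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(gold, pred)) / len(gold)

def kendall_tau(gold, pred):
    # tau = (C - D) / (C + D), with C/D the concordant/discordant pair counts.
    c = d = 0
    n = len(gold)
    for i in range(n):
        for j in range(i + 1, n):
            s = (gold[i] - gold[j]) * (pred[i] - pred[j])
            if s > 0:
                c += 1
            elif s < 0:
                d += 1
    return (c - d) / (c + d)

def ac1sigma(gold, pred, sigma):
    # Eq. (7): fraction of predictions falling within one annotated
    # standard deviation sigma_i of the gold mean value.
    hits = sum(1 for a, p, s in zip(gold, pred, sigma) if s - abs(a - p) > 0)
    return hits / len(gold)
```

The O(n²) pair loop in `kendall_tau` mirrors the definition given above; library implementations such as `scipy.stats.kendalltau` additionally handle ties.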
TABLE 3
Result on Inferring Affective Meaning
RM for RMSE; MA for MAE; MP for MAPE.
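Before turning to the detailed results, the model being evaluated is easy to restate in code. Below is a minimal NumPy sketch of the core RoWE idea: ridge regression from embedding dimensions to an affective rating, fit on seed words and applied to the remaining vocabulary. The synthetic data here is ours for illustration; the paper's experiments use real 300-dimensional SGNS embeddings and scikit-learn's Ridge model with default parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for word embeddings: one row per word, one column per
# embedding dimension (synthetic data, assumed for illustration).
n_words, dim = 200, 10
X = rng.normal(size=(n_words, dim))
true_w = rng.normal(size=dim)
valence = X @ true_w + 0.01 * rng.normal(size=n_words)  # gold ratings

# Seed words form the training split; the rest plays the extended lexicon.
seeds, rest = slice(0, 100), slice(100, None)

def ridge_fit(X, y, alpha=1.0):
    # Closed-form ridge solution: w = (X'X + alpha*I)^-1 X'y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

w = ridge_fit(X[seeds], valence[seeds])
pred = X[rest] @ w  # inferred affective values for unseen words
print(float(np.sqrt(np.mean((pred - valence[rest]) ** 2))))  # held-out RMSE
```

The learned weight vector plays the same role as the weights visualized in Fig. 4: each embedding dimension contributes to the affective rating in proportion to its learned coefficient.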
Based on the results from these tables, we make five major observations. (1) RoWE outperforms the other methods by large margins on all the affective dimensions of all the lexicons under all the evaluation metrics. For example, on the GI lexicon, RoWE has a relative improvement of 1.2 percent on AUC and 1.3 percent on Macro-F1 over the state-of-the-art Wt-Graph method. On the ANEW lexicon, RoWE outperforms the state-of-the-art Wt-Graph method with relative improvements of 36.8, 47.1, 49.2, 14.2, and 51.5 percent on the RMSE, MAE, MAPE, and Kendall correlation coefficient τ metrics, respectively. On the touching dimension of the Perceptual lexicon, RoWE achieves a relative improvement over Wt-Graph of 26.2, 27.3, 22.8, and 24.4 percent under RMSE, MAE, τ, and ac1σ, respectively. (2) Among the different evaluation metrics, the rankings on RMSE, MAE, and MAPE are similar, but the rankings under the Kendall correlation coefficient differ. For example, the ranking for RMSE from best to worst is RoWE, Wt-Graph, PMI, SENTPROP, Web-GP and qwn-ppv, DENSIFIER. However, the ranking for τ is RoWE, Wt-Graph, qwn-ppv, SENTPROP, PMI, Web-GP, DENSIFIER, and the ranking for ac1σ is RoWE, Wt-Graph, SENTPROP, Web-GP and qwn-ppv, PMI, DENSIFIER. This means that different methods may have their merits under different performance measures. (3) Considering the different dimensions of the VAD lexicons, the performance on valence under τ is better than on arousal and dominance, whereas the opposite holds for ac1σ, RMSE, MAE, and MAPE. This may be because τ focuses on the ranking rather than the difference between the gold and predicted values, whereas the other evaluation metrics focus on exactly that value difference. (4) For the E-ANEW lexicon, which is annotated through crowdsourcing, the mean absolute errors (MAE) of our method are 0.65, 0.58, and 0.56 on valence, arousal, and dominance, respectively. This means that the predicted values are quite close to the manually annotated values. On the ac1σ metric, our method achieves 93.4, 99.1, and 99.0 percent on valence, arousal, and dominance, respectively, which means that almost all the predicted values fall within one standard deviation of the manually annotated mean value. (5) The standard deviations shown in parentheses for the sentiment lexicons indicate that RoWE has smaller relative standard deviations. In other words, RoWE is more robust and less sensitive to the choice of seed words.

In conclusion, our proposed RoWE method achieves the best result on all the lexicons under all the evaluation metrics, which validates our assumption that word embeddings encode semantic information and that the regression model can effectively decode affective meanings from the embeddings by assigning different weights to different dimensions of the embedding. Fig. 4 shows the visualized weight values of ~a on the first ten dimensions of the word embedding space for the three affective dimensions of the ANEW lexicon under the VAD model. Note that the weights for the three affective dimensions can be quite different; for example, for the first embedding dimension, the corresponding affective weights are 1.11, -1.05, and 0.63, respectively.

Fig. 4. The learned weights of different affective meanings for the ANEW lexicon.

Table 4 lists some example words in the ANEW lexicon that are close in embedding space but not close in the valence dimension. In the table, the Word column is the target word, the G val column is the gold valence value, P val is the predicted valence value, and the last column lists the top 5 nearest words in embedding space based on cosine similarity, each with its predicted valence value in parentheses. The words in bold are examples that are close in the embedding space but not close in the valence dimension. For example, the nearest word of cold is warm, while their predicted valence values are 4.16 and 7.09, respectively. This validates that our method can distinguish affective meanings by assigning different weights to the features in the embedding space.

TABLE 4
Example Words Close in Embedding Space, But Not Close in Predicted Affective Space

Word        G val  P val  Top 5 nearest words in embedding space
good        7.47   6.45   decent(5.94), bad(3.34), excellent(7.35), poor(3.32), commendable(7.19)
heaven      7.3    6.80   heavens(6.33), heavenly(6.80), hell(4.74), god(6.54), afterlife(5.63)
clouds      6.18   5.66   cloud(5.00), mist(5.00), droplets(4.85), dust(4.27), overcast(4.54)
cold        4.02   4.16   warm(7.09), winters(5.27), colder(4.94), cool(6.34), freezing(4.24)
displeased  2.79   3.64   angered(3.34), unhappy(3.43), incensed(3.37), pleased(6.40), apprehensive(3.79)

4.2 Method Complexity
The complexity of the different methods is shown in Table 5. In this table, N is the data sample size, d is the embedding dimension, and k is the number of nearest neighbors used in Web-GP and SENTPROP; d and k are set as constants during the experiments. The second column of the table indicates that the asymptotic complexities of PMI, Web-GP, Wt-Graph, and SENTPROP grow quadratically with the data size, whereas the complexities of DENSIFIER and our RoWE grow linearly with the data size. The third column shows the complexity with the constant coefficients d and k. Even though d and k play no role in Big O analysis, as shown in the second column, they do affect the efficiency of the implementations, especially when the data samples are of limited size.

TABLE 5
Complexity of Different Methods

Method     Asymptotic complexity  Complexity with coefficients
PMI        O(N^2)                 O(N^2)
Web-GP     O(N^2)                 O(N^2 kd)
Wt-Graph   O(N^2)                 O(N^2 d)
DENSIFIER  O(N)                   O(N d^3)
SENTPROP   O(N^2)                 O(N^2 kd)
RoWE       O(N)                   O(N d^2)

To further examine their run-time efficiency, we also run an experiment to observe the difference in computing time by varying the data size from 1,000 to 11,000 using the E-ANEW lexicon, with the seed word number set to 300. The remaining collection is used as test data. The hardware platform is a desktop computer with an Intel(R) Xeon(R) CPU E5-1620 and 64 GB of RAM, and while running each method we close all other programs. The result is shown in Fig. 5. Web-GP is not listed because its running time is too high, ranging from about 23,900 to 38,000 (in microseconds). The figure shows that RoWE requires the least running time. When the data size increases from 1,000 to 11,000, the running time of RoWE changes from 11 to
116, which basically translates to a linear increase of 10.5 times. Although running time may be affected by the actual implementations, this experiment can still reveal the computational advantage of RoWE over the other methods. In conclusion, RoWE has a complexity advantage over the other methods.

Fig. 5. The running time of different methods under different data sizes. We break the y axis at 5,000 to 6,000 to make the figure more readable. The numbers in parentheses are the running times.

4.3 The Effect of Seed Words
In this experiment, we explore the effect of seed word size using the ANEW lexicon. We change the size of the seed words from 10 to 800 with 30 as the step size, using the remaining lexicon as the test data. Without loss of generality, we only measure the valence dimension in terms of ac1σ. The result shown in Fig. 6 indicates that the Web-GP, SENTPROP, and Wt-Graph methods achieve almost similar results and are stable, with little room for improvement as more seed words are added. PMI and DENSIFIER, however, are not quite stable. RoWE has much better performance, and it improves further when more seed words are used. Note that even with a small set of seed words (such as 100, which can be obtained easily through manual annotation or crowdsourcing), RoWE still achieves a much better result.

Fig. 6. The effect of seed word size on the ANEW lexicon.

4.4 Effects of Dimension Size of Word Embedding
In this experiment, we explore the effect of the embedding dimension size. We train word embeddings with different dimension sizes on the Wikipedia corpus using the SGNS model and report the RMSE performance on the VADER lexicon and the VAD dimensions of the E-ANEW lexicon. The result is shown in Fig. 7. Note that as the dimension increases from 50 to 300, the performance improves steadily; between 300 and 500, however, the curve is quite flat. Generally speaking, larger dimensions do bring better performance, but they require more resources and computation power. To balance performance and computation cost, we suggest setting the dimension between 300 and 400.

Fig. 7. The effects of embedding dimension.

4.5 Effects of Regression Models and Embedding Methods for RoWE
In the previous experiments, we use the Ridge regression model and the word embedding trained with the SGNS model. In principle, any regression model and any word embedding method can be used in our proposed method. In practice, however, the regression method and the actual embedding method may affect the overall performance. In this section, we explore the effects of the regression models and word embedding methods.

Different regression models can be used, as explained in Section 3, such as linear regression, Ridge regression, Bayesian Ridge regression, ElasticNet regression, and Lasso regression, as well as Support Vector Regression with a linear kernel (SVR-Linear) and Support Vector Regression with a non-linear Gaussian kernel (SVR-RBF). We examine their performance in terms of ac1σ using the one-dimensional VADER lexicon. The size of the seed words changes from 10 to 600 with 30 as the step size, with the remaining lexicon used as the test data. All the models are based on scikit-learn5 with default parameters. The result is shown in Fig. 8. Note that SVR-Linear, Ridge, and Bayesian Ridge achieve similar and much better results than the other regression models. This is because Ridge regression and SVR-Linear use L2-norm regularization on the weights to avoid overfitting. The linear regression model shows a U shape because of overfitting without regularization on the weight coefficients. SVR-Linear performs much better than SVR-RBF, which indicates that linear models are more suitable than non-linear models for inferring affective meanings from word embeddings. Similar results are obtained under the other evaluation metrics and affective lexicons. Thus, we suggest using SVR-Linear, Ridge, or Bayesian Ridge regression models in our framework.

5. scikit-learn.org/ Accessed May 17, 2017

We conduct evaluation on different embedding resources. In addition to the embedding trained from Wikipedia,
denoted as wikiEmb with size 204,981, as explained in Section 4.2, we also use the following publicly available embeddings obtained from different learning methods.

1) Google embedding (GoogleEmb) [52]: trained using the SGNS model as introduced in Section 3.1 from a news corpus of 10 billion tokens.6 The embedding vocabulary size is 3,000,000.
2) Glove 840B (Glove) [55]: based on weighted matrix factorization of the co-occurrence matrix built from a corpus of 840 billion tokens.7 The embedding vocabulary size is 2,196,017.
3) Meta-Embedding (MetaEmb) [65]: this method ensembles different embedding sources to obtain the final meta-embedding.8 The size is 2,746,092.
4) ConceptNet Vector Ensemble (CVNE) [70]: this method combines word2vec and Glove with structured knowledge from ConceptNet [71].9 The size is 426,572.
5) MVLSA (MVEmb) [62]: this method learns word embeddings from multiple sources, including a text corpus, dependency relations, morphology, a monolingual corpus, and a knowledge base from FrameNet, based on generalized canonical correlation analysis.10 The size is 361,082.
6) Paragram Embedding (ParaEmb) [72]: this method learns word embeddings based on the paraphrase constraints from PPDB.11 The size is 1,703,756.

6. https://code.google.com/archive/p/word2vec/ Accessed May 17, 2017
7. http://nlp.stanford.edu/projects/glove/ Accessed May 17, 2017
8. http://cistern.cis.lmu.de/meta-emb/ Accessed May 17, 2017
9. https://github.com/commonsense/conceptnet-numberbatch Accessed May 17, 2017
10. http://cs.jhu.edu/prastog3/mvlsa/ Accessed May 17, 2017
11. http://ttic.uchicago.edu/wieting/ Accessed May 17, 2017

Fig. 8. The performance of different regression models on the VADER lexicon.

We test the embeddings on the common set of 1,079 words shared by all the selected embedding resources and the VADER lexicon. Among the 1,079 words, we randomly select 50 percent as seed words and the other 50 percent as test words. We run each experiment 5 times and report the average performance with the standard deviation in parentheses, as shown in Table 7. Note that the knowledge-based CVNE achieves the best result under all the evaluation metrics, which indicates that distilling a knowledge base into the embedding can improve the quality of the word embedding. GoogleEmb performs slightly better than wikiEmb because GoogleEmb uses a much larger training corpus. Since evaluating embedding quality is not our focus, we refer readers to [56] for a detailed discussion of the quality of embedding methods. Other than MVEmb, which seems to be low in performance, all the other embeddings have comparable performance. Even though CVNE has the best performance in this experiment, it only indicates the usefulness of adding knowledge base information to an unsupervised training method. It does not by any means guarantee that CVNE is the best performer on a downstream task, because the lexicon size is limited by the coverage of the knowledge base.

TABLE 7
Evaluation of Different Embeddings on the VADER Lexicon Using RoWE

Method     RMSE      MAE       τ          ac1σ
wikiEmb    1.2(.02)  .96(.01)  49.9(1.1)  53.6(1.0)
GoogleEmb  1.1(.01)  .86(.01)  55.4(1.0)  57.6(1.5)
Glove      1.0(.02)  .80(.02)  59.4(1.2)  61.7(1.5)
CVNE       .88(.01)  .69(.01)  66.0(.95)  67.3(1.2)
MetaEmb    1.1(.03)  .86(.02)  56.4(1.3)  57.8(1.4)
MVEmb      1.3(.02)  1.0(.02)  42.4(1.0)  50.7(.31)
ParaEmb    1.0(.02)  .80(.02)  59.6(1.4)  60.8(1.4)

Table 6 shows the example words with the top 5 largest and top 5 smallest predicted values in each affective dimension under different affective models, using the Ridge regression trained on the corresponding seed lexicons and the CVNE embedding. Note that all the learned top words are quite reasonable. As sentiment indicators, ANEW-v and EPA-e share the top word giving gift, and several words are listed in different lexicons, such as giving gift and make happy. Note that our method is not limited to predicting the affective meanings of single words only: phrase prediction is not a problem in general, as long as phrase embeddings are given. Interestingly, for Concreteness, the last word, istically, is actually an adverb suffix, which is quite abstract.

TABLE 6
Example Words with Top 5 Largest and Smallest Predicted Affective Values Based on CVNE Embedding

Dimension    | Top 5 largest predicted values | Top 5 smallest predicted values
VADER        | giving gift, making happy, excellentness, life of party, winning baseball game | hell with, unpleasant person, hagridden, abusive language, hagride
ANEW-v       | giving gift, making happy, make happy, reading books, positive attitude | stabbing to death, life threatening condition, poor devil, crybully, abusive language
ANEW-a       | insanity, gun, sex, rampage, tornado | soothing, librarian, dull, calm, grain
ANEW-d       | paradise, win, positive attitude, incredible, self | uncontrollable, earthquake, lobotomy, alzheimers, dementia
EPA-e        | giving gift, heaven, make happy, making happy, positive attitude | hell, murder, rape, unpleasant person, rapist
EPA-p        | god, ceo, christ, herculean strength, pope | coward, weakling, high and dry, slave, powerless
EPA-a        | raver, riot, gunfight, fighter, nightclub | glum, cemetery, funeral, mummy, graveyard
DAL-e        | giving gift, making happy, make happy, showing love, enjoying day | mommick, unpleasant person, plague, plaguer, nidder
DAL-a        | dangerous activity, climbing mountain, playing snooker, winning game, playing cricket | scar, shadows, elementary, supplement, oxgang
DAL-i        | neighbor's house, non powered device, own home, opaque thing, single user device | that degree, risibility, in such way, inhere, in this
Concreteness | non powered device, opaque thing, power shovel excavator, non agentive artifact, single user device | more equal, confessedly, hypostatize, neuter substantive, istically

4.6 Downstream Task for Sentiment Classification
In this section, we evaluate the effectiveness of RoWE through the performance of a downstream sentiment analysis task. In this experiment, we examine the effectiveness of the lexicons obtained from RoWE compared to baseline lexicons obtained from other methods, including both manually and automatically constructed ones. The sentiment corpora used in the experiment are listed in Table 8. The baseline lexicons are all publicly available and are listed in Table 9, sorted according to their size. Note that ANEW, VADER, and E-ANEW are obtained manually or through crowdsourcing, and the others are obtained automatically.

The setup of the experiment is to first use RoWE to extend the VADER sentiment lexicon using the different embeddings introduced in Section 4.5. RoWE is trained using the intersection of the VADER lexicon and the respective embeddings. The size of each extended lexicon is different, depending on the vocabulary of the embeddings. For a fair comparison, we use the same downstream sentiment classification method for all the different lexicons. We use the VADER method for sentiment classification [39] because it is a lexicon-based method using heuristic rules; we did not use any machine learning method, to avoid the effects of factors other than the evaluated lexicons. The VADER method can therefore better reflect the quality of the evaluated sentiment lexicons. In the sentiment analysis task, we use F-score as the evaluation metric.

Table 10 shows the evaluation result; the best results are in bold. Note that all the lexicons obtained by using RoWE are listed in the second part of the table, with the size of each obtained lexicon included in parentheses. In general, the embedding-based lexicons perform better than the baseline lexicons. The ParaEmb lexicon, in particular, achieves the best result on all the sentiment corpora. Among the baseline lexicons, SentiWords performs the best. We want to point out that in both the baseline lexicons and the lexicons obtained from RoWE, lexicon size is not the determiner of the best performance. Among the baseline lexicons, the best performer, SentiWords, has only about 147K sentiment words, whereas NNLexicon and Tang have about 184K and 347K, respectively. The best performer ParaEmb is also not the largest in lexicon size; in fact, CVNE, which is only 0.4M in size, has very good performance. Note that MetaEmb performs much worse than the other embedding-based lexicons. Further analysis indicates that although the size of MetaEmb is large, the overlap of MetaEmb with the sentiment corpus vocabulary is quite low. For example, there are only 512 overlapping seeds out of 6,298 (10 percent) in the mpqa corpus, compared to 6,193 for ParaEmb. Also, most of the words in MetaEmb are informal strings, such as rates.download and now!download. The general conclusion is that (1) a larger overlap is generally good, but again it is not the determining factor; and (2) a high-quality word embedding also helps even if its size is not large (as shown by CVNE).

TABLE 8
Statistics of the Sentiment Corpora

Corpus     num     pos num  vocab   avg words  Description
sem [73]   3,583   2,570    18,965  19.8       SemEval 2013
mR [39]    10,605  5,242    29,864  18.9       movie review
aR [39]    3,708   2,128    8,306   16.5       Amazon review
nyt [39]   5,190   2,204    20,929  17.5       news
cr [74]    3,771   2,405    5,712   20.1       customer review
mpqa [75]  10,603  3,311    6,298   3.1        news
mr [76]    10,662  5,331    21,425  21.0       movie review
SST [77]   1,821   909      7,576   19.2       movie review

TABLE 9
Statistics of the Baseline Sentiment Lexicons

Lexicon             Size     Description
ANEW [17]           1,034    manual annotation
VADER [39]          7,502    crowdsourcing annotation
E-ANEW [18]         13,915   crowdsourcing annotation
SenticNet4 [42]     50,000   propagation on ConceptNet
HashtagSenti [78]   54,129   statistics based on hashtags
senti140 [78]       62,468   statistics based on emoticons
qwn-ppv [47]        81,248   propagation on WordNet
SentiWordNet3 [79]  89,631   automatic, based on WordNet
SentiWords [80]     147,305  ensemble on SentiWordNet
NNlexicon [81]      184,579  neural network prediction
Tang [49]           347,446  representation learning
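The lexicon-extension step used in this downstream experiment — fit RoWE on the overlap between a seed lexicon and an embedding's vocabulary, then score every embedded word or phrase — can be sketched as follows. The toy 2-dimensional vectors and scores below are made up for illustration; the actual runs use the full 300-dimensional embeddings and the VADER intersection.

```python
import numpy as np

def extend_lexicon(embeddings, seed_lexicon, alpha=1.0):
    """Assign an affective score to every embedded word, trained on seeds.

    embeddings: dict of word -> 1-D numpy vector (all the same dimension)
    seed_lexicon: dict of word -> gold score (must overlap the embeddings)
    """
    seeds = [w for w in seed_lexicon if w in embeddings]
    X = np.stack([embeddings[w] for w in seeds])
    y = np.array([seed_lexicon[w] for w in seeds])
    d = X.shape[1]
    # Closed-form ridge regression fit on the seed words only.
    w = np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)
    # Score the whole embedding vocabulary, seeds included.
    return {word: float(vec @ w) for word, vec in embeddings.items()}

# Tiny illustration with made-up 2-d "embeddings".
emb = {
    "good":  np.array([1.0, 0.2]),
    "great": np.array([0.9, 0.1]),
    "bad":   np.array([-1.0, 0.3]),
    "awful": np.array([-0.8, 0.2]),
}
scores = extend_lexicon(emb, {"good": 4.0, "bad": -4.0})
# "great" lands near "good" and "awful" near "bad" in the scored lexicon.
```

In the paper's setup, the resulting word-to-score dictionary is what replaces the original lexicon inside the rule-based VADER classifier.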
TABLE 10
Result on the Downstream Sentiment Analysis Task

Lexicon (size in M)  sem   mR    aR    nyt   cr    mpqa  mr    SST
ANEW                 0.71  0.56  0.55  0.49  0.62  0.27  0.54  0.57
VADER                0.83  0.66  0.71  0.57  0.78  0.63  0.66  0.70
E-ANEW               0.85  0.68  0.74  0.63  0.79  0.58  0.68  0.70
SenticNet4           0.79  0.66  0.69  0.59  0.74  0.57  0.66  0.68
HashtagSenti         0.81  0.62  0.66  0.53  0.71  0.41  0.62  0.66
senti140             0.82  0.68  0.65  0.60  0.68  0.55  0.68  0.70
qwn-ppv              0.76  0.63  0.69  0.57  0.74  0.45  0.63  0.66
SentiWordNet3        0.65  0.56  0.56  0.49  0.62  0.43  0.56  0.60
SentiWords           0.85  0.68  0.74  0.63  0.79  0.60  0.68  0.71
NNlexicon            0.77  0.64  0.68  0.53  0.73  0.55  0.64  0.67
Tang                 0.83  0.63  0.63  0.53  0.66  0.54  0.63  0.68
wikiEmb (0.2 M)      0.84  0.68  0.74  0.62  0.78  0.66  0.68  0.69
GoogleEmb (3 M)      0.85  0.68  0.74  0.63  0.78  0.68  0.69  0.70
Glove (2 M)          0.85  0.69  0.74  0.65  0.79  0.69  0.69  0.71
CVNE (0.4 M)         0.85  0.69  0.74  0.63  0.78  0.68  0.69  0.70
MetaEmb (2.7 M)      0.73  0.47  0.49  0.43  0.48  0.04  0.49  0.47
MVEmb (0.36 M)       0.85  0.68  0.74  0.62  0.78  0.68  0.68  0.69
ParaEmb (1.7 M)      0.85  0.69  0.74  0.65  0.79  0.70  0.69  0.72

5 CONCLUSION
In this paper, we present a regression-based method to automatically infer the affective meanings of words from their word embeddings. We argue that a word embedding carries not only general semantic meaning but also meanings in specific spaces, such as sentiments and affects, which can be extracted through training. The framework first learns the word embedding in an unsupervised way and then treats the word embedding as the feature representation to train a Ridge regression model based on a small set of seed words. Our framework can infer different kinds of affective meanings in multi-dimensional models. A whole set of evaluations shows that our method achieves state-of-the-art performance and outperforms all the baselines in both performance and computation cost. Existing lexicons can be easily extended through our method, and the experiment on a downstream sentiment analysis task shows that the extended lexicons perform better than existing public sentiment lexicons on several sentiment corpora, which again indicates the effectiveness of the proposed method. We make the extended lexicons of different affective models publicly available.12 Future work may include investigating how to obtain higher-quality word embeddings (especially by incorporating text corpora and knowledge bases) and how to apply the obtained multi-dimensional lexicons to affective computing for longer text.

12. https://www.dropbox.com/sh/t6yy9yyrkj354rf/AAA49K2XimtcccawUszXJ7zLa?dl=0 Accessed May 17, 2017.

ACKNOWLEDGMENTS
This work is supported by HK Polytechnic University (PolyU RTVU and GRF PolyU 15211/14E). The work was done when Lin Gui was a research assistant at the Hong Kong Polytechnic University.

REFERENCES
[1] R. W. Picard, "Affective computing," MIT Media Lab, Cambridge, MA, USA, Tech. Rep. 321, 1995.
[2] B. Pang and L. Lee, "Opinion mining and sentiment analysis," Found. Trends Inf. Retrieval, vol. 2, no. 1/2, pp. 1–135, 2008.
[3] E. Cambria, A. Hussain, and C. Eckl, "Taking refuge in your personal sentic corner," in Proc. 5th Int. Joint Conf. Natural Language Process., 2011, pp. 35–43.
[4] B. Desmet and V. Hoste, "Emotion detection in suicide notes," Expert Syst. Appl., vol. 40, no. 16, pp. 6351–6358, 2013.
[5] J. Bollen, H. Mao, and X. Zeng, "Twitter mood predicts the stock market," J. Comput. Sci., vol. 2, no. 1, pp. 1–8, 2011.
[6] S. Poria, E. Cambria, and A. Gelbukh, "Aspect extraction for opinion mining with a deep convolutional neural network," Knowl.-Based Syst., vol. 108, pp. 42–49, 2016.
[7] S. Poria, E. Cambria, D. Hazarika, and P. Vij, "A deeper look into sarcastic Tweets using deep convolutional neural networks," in Proc. 26th Int. Conf. Comput. Linguistics, 2016, pp. 1601–1612.
[8] N. Majumder, S. Poria, A. Gelbukh, and E. Cambria, "Deep learning-based document modeling for personality detection from text," IEEE Intell. Syst., vol. 32, no. 2, pp. 74–79, Mar./Apr. 2017.
[9] J. Hoey, T. Schröder, and A. Alhothali, "Affect control processes: Intelligent affective interaction using a partially observable Markov decision process," Artif. Intell., vol. 230, pp. 134–172, 2016.
[10] P. J. Stone, D. C. Dunphy, M. S. Smith, and D. M. Ogilvie, "The general inquirer: A computer approach to content analysis," J. Regional Sci., vol. 8, no. 1, pp. 113–116, 1968.
[11] S. M. Mohammad and P. D. Turney, "Crowdsourcing a word-emotion association lexicon," Comput. Intell., vol. 29, no. 3, pp. 436–465, 2013.
[12] J. Staiano and M. Guerini, "Depeche mood: A lexicon for emotion analysis from crowd annotated news," in Proc. Annu. Meeting Assoc. Comput. Linguistics, 2014, vol. 2, pp. 427–433.
[13] J. A. Russell, "A circumplex model of affect," J. Personality Social Psychology, vol. 39, no. 6, 1980, Art. no. 1161.
[14] D. R. Heise, "Semantic differential profiles for 1,000 most frequent English words," Psychological Monographs: General Appl., vol. 79, no. 8, 1965, Art. no. 1.
[15] R. A. Calvo and S. Mac Kim, "Emotions in text: Dimensional and categorical models," Comput. Intell., vol. 29, no. 3, pp. 527–543, 2013.
[16] D. R. Heise, "Affect control theory: Concepts and model," J. Math. Sociology, vol. 13, no. 1/2, pp. 1–33, 1987.
[17] M. M. Bradley and P. J. Lang, "Affective norms for English words (ANEW): Instruction manual and affective ratings," Center Res. Psychophysiology, University of Florida, Gainesville, FL, USA, Tech. Rep. C-1, 1999.
[18] A. B. Warriner, V. Kuperman, and M. Brysbaert, "Norms of valence, arousal, and dominance for 13,915 English lemmas," Behavior Res. Methods, vol. 45, no. 4, pp. 1191–1207, 2013.
[19] L.-C. Yu et al., "Building Chinese affective resources in valence-arousal dimensions," in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Language Technol., 2016, pp. 540–545.
[20] L.-C. Yu, J. Wang, K. R. Lai, and X.-J. Zhang, "Predicting valence-arousal ratings of words using a weighted graph method," in Proc. Annu. Meeting Assoc. Comput. Linguistics, 2015, vol. 2, pp. 788–793.
[21] G. N. Leech, Semantics: The Study of Meaning, 2nd ed. London, U.K.: Penguin Books, 1981.
[22] W.-L. Wei, C.-H. Wu, and J.-C. Lin, "A regression approach to affective rating of Chinese words from ANEW," in Affective Computing and Intelligent Interaction. Berlin, Germany: Springer, 2011, pp. 121–131.
[23] N. Malandrakis, A. Potamianos, E. Iosif, and S. S. Narayanan, "Kernel models for affective lexicon creation," in Proc. 12th Annu. Conf. Int. Speech Commun. Assoc., 2011, pp. 2977–2980.
[24] P. Ekman, "Facial expression and emotion," Amer. Psychologist, vol. 48, no. 4, 1993, Art. no. 384.
[25] W. G. Parrott, Emotions in Social Psychology: Essential Readings. Hove, U.K.: Psychology Press, 2001.
[26] N. H. Frijda, The Emotions. Cambridge, U.K.: Cambridge Univ. Press, 1986.
[27] S. S. Tomkins, "Affect theory," Approaches Emotion, vol. 163, 1984, Art. no. 195.
[28] A. Ortony, The Cognitive Structure of Emotions. Cambridge, U.K.:
[44] P. D. Turney and M. L. Littman, "Measuring praise and criticism: Inference of semantic orientation from association," ACM Trans. Inf. Syst., vol. 21, no. 4, pp. 315–346, 2003.
[45] S. M. Mohammad and S. Kiritchenko, "Using hashtags to capture fine emotion categories from Tweets," Comput. Intell., vol. 31, no. 2, pp. 301–326, 2015.
[46] A. Alhothali and J. Hoey, "Good news or bad news: Using affect control theory to analyze readers' reaction towards news articles," in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Language Technol., 2015, pp. 1548–1558.
[47] I. San Vicente, R. Agerri, and G. Rigau, "Simple, robust and (almost) unsupervised generation of polarity lexicons for multiple languages," in Proc. 14th Conf. Eur. Chapter Assoc. Comput. Linguistics, 2014, pp. 88–97.
[48] L. Velikovich, S. Blair-Goldensohn, K. Hannan, and R. McDonald,
Cambridge Univ. Press, 1990. “The viability of Web-derived polarity lexicons,” in Proc. Human
[29] L. Xu, H. Lin, P. Yu, H. Ren, and J. Chen, “Constructing the affective Language Technol.: Annu. Conf. North Amer. Chapter Assoc. Comput.
lexicon ontology,” J. China Soc. Sci. Tech. Inf., vol. 2, 2008, Art. no. 6. Linguistics, 2010, pp. 777–785. [Online]. Available: http://dl.acm.
[30] C. E. Osgood, G. J. Suci, and P. H. Tannenbaum, The Measurement org/citation.cfm?id=1858118
of Meaning. Champaign, IL, USA: Univ. Illinois Press, 1957. [49] D. Tang, F. Wei, B. Qin, M. Zhou, and T. Liu, “Building large-
[31] E. Cambria, A. Livingstone, and A. Hussain, “The hourglass of scale Twitter-specific sentiment lexicon: A representation learning
emotions,” in Cognitive Behavioural Systems. Berlin, Germany: approach,” in Proc. 25th Int. Conf. Comput. Linguistics: Tech. Papers,
Springer, 2012, pp. 144–157. 2014, pp. 172–182. [Online]. Available: http://aclweb.org/
[32] A. Mehrabian, “Pleasure-arousal-dominance: A general frame- anthology/C/C14/C14-1018.pdf
work for describing and measuring individual differences in tem- [50] S. Amir, R. Astudillo, W. Ling, B. Martins, M. J. Silva, and
perament,” Current Psychology, vol. 14, no. 4, pp. 261–292, 1996. I. Trancoso, “INESC-ID: A regression model for large scale Twitter
[33] C. Whissell, “The dictionary of affect in language,” Emotion: The- sentiment lexicon induction,” in Proc. 9th Int. Workshop Semantic
ory Res. Experience, vol. 4, no. 113–131, 1989, Art. no. 94. Evaluation., 2015, pp. 613–618. [Online]. Available: http://aclweb.
[34] J. R. Fontaine, K. R. Scherer, E. B. Roesch, and P. C. Ellsworth, org/anthology/S15-2102
“The world of emotions is not two-dimensional,” Psychological [51] S. Rothe, S. Ebert, and H. Schtze, “Ultradense word embeddings by
Sci., vol. 18, no. 12, pp. 1050–7, Dec. 2007. [Online]. Available: orthogonal transformation,” in Proc. Conf. North Amer. Chapter Assoc.
http://www.ncbi.nlm.nih.gov/pubmed/18031411 Comput. Linguistics: Human Language Technol., 2016, pp. 767–777.
[35] H. Lvheim, “A new three-dimensional model for emotions and [Online]. Available: http://aclanthology.info/papers/ultradense-
monoamine neurotransmitters,” Med. Hypotheses, vol. 78, no. 2, word-embeddings-by-orthogonal-transformation
pp. 341–348, 2012. [Online]. Available: http://www.sciencedirect. [52] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean,
com/science/article/pii/S0306987711005883 “Distributed representations of words and phrases and their
[36] T. Wilson, J. Wiebe, and P. Hoffmann, “Recognizing contextual compositionality,” in Proc. 27th Annu. Conf. Neural Inf. Process.
polarity in phrase-level sentiment analysis,” in Proc. Human Lan- Syst., 2013, pp. 3111–3119.
guage Technol. Conf. Conf. Empirical Methods Natural Language Pro- [53] M. Baroni, G. Dinu, and G. Kruszewski, “Don’t count, predict!
cess., 2005, pp. 347–354. [Online]. Available: http://aclanthology. A systematic comparison of context-counting versus context-
info/papers/recognizing-contextual-polarity-in-phrase-level- predicting semantic vectors,” in Proc. 52nd Annu. Meeting Assoc.
sentiment-analysis Comput. Linguistics, 2014, pp. 238–247. [Online]. Available: http://
[37] S. M. Mohammad, S. Kiritchenko, and X. Zhu, “NRC-Canada: anthology.aclweb.org/P/P14/P14-1023.pdf
Building the state-of-the-art in sentiment analysis of tweets,” in [54] Z. S. Harris, “Distributional structure,” Word, 1954. [Online].
Proc. 7th Int. Workshop Semantic Evaluation., 2013, pp. 321–327. Available: http://psycnet.apa.org/psycinfo/1956-02807-001
[38] S. Rosenthal, P. Nakov, S. Kiritchenko, S. M. Mohammad, [55] J. Pennington, R. Socher, and C. D. Manning, “Glove: Global vec-
A. Ritter, and V. Stoyanov, “SemEval-2015 task 10: Sentiment tors for word representation,” in Proc. Conf. Empirical Methods Nat-
analysis in Twitter,” in Proc. 9th Int. Workshop Semantic Evaluation., ural Language Process., 2014, pp. 1532–1543. [Online]. Available:
2015, pp. 451–463. [Online]. Available: http://www.aclweb.org/ http://nlp.stanford.edu/projects/glove/glove.pdf
website/old_anthology/S/S15/S15-2.pdf#page=493 [56] O. Levy, Y. Goldberg, and I. Dagan, “Improving distributional
[39] C. J. Hutto and E. Gilbert, “VADER: A parsimonious rule-based similarity with lessons learned from word embeddings,” Trans.
model for sentiment analysis of social media text,” in Proc. Eighth Assoc. Comput. Linguistics, vol. 3, pp. 211–225, 2015. [Online].
Int. Conf. Weblogs Social Media, 2014. [Online]. Available: http:// Available: https://tacl2013.cs.columbia.edu/ojs/index.php/tacl/
www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/ article/view/570
viewPaper/8109 [57] J. Zhang, J. Salwen, M. R. Glass, and A. M. Gliozzo, “Word
[40] D. S. Schmidtke, T. Schrder, A. M. Jacobs, and M. Conrad, semantic representations using Bayesian probabilistic tensor
“ANGST: Affective norms for German sentiment terms, derived factorization,” in Proc. Conf. Empirical Methods Natural Language
from the affective norms for English words,” Behavior Res. Meth- Process., 2014, pp. 1522–1531. [Online]. Available: http://www.
ods, vol. 46, no. 4, pp. 1108–1118, Jan. 2014. [Online]. Available: aclweb.org/anthology/D14-1161
http://link.springer.com/article/10.3758/s13428-013-0426-y [58] S. Li, J. Zhu, and C. Miao, “A generative word embedding
[41] B. Liu, “Sentiment analysis and opinion mining,” Synthesis Lec- model and its low rank positive semidefinite solution,” in
tures Human Language Technol., vol. 5, no. 1, pp. 1–167, 2012. Proc. Conf. Empirical Methods Natural Language Process., 2015,
[42] E. Cambria, S. Poria, R. Bajpai, and B. Schuller, “SenticNet 4: pp. 1599–1609. [Online]. Available: http://aclweb.org/
A semantic resource for sentiment analysis based on conceptual anthology/D15-1183
primitives,” in Proc. 26th Int. Conf. Comput. Linguistics, 2016, pp. 2666– [59] M. Faruqui and C. Dyer, “Improving vector space word represen-
2677. [Online]. Available: http://www.sentic.net/senticnet-4.pdf tations using multilingual correlation,” in Proc. 14th Conf. Eur.
[43] W. L. Hamilton, K. Clark, J. Leskovec, and D. Jurafsky, “Inducing Chapter Assoc. Comput. Linguistics, 2014, pp. 462–471.
domain-specific sentiment lexicons from unlabeled corpora,” [60] Z. Wang, J. Zhang, J. Feng, and Z. Chen, “Knowledge graph and
in Proc. Conf. Empirical Methods Natural Language Process., 2016, text jointly embedding,” in Proc. Conf. Empirical Methods Natural
pp. 595–605. [Online]. Available: http://aclanthology.info/ Language Process., 2014, pp. 1591–1601. [Online]. Available:
papers/inducing-domain-specific-sentiment-lexicons-from- http://aclweb.org/anthology/D14-1167
unlabeled-corpora
[61] F. Sun, J. Guo, Y. Lan, J. Xu, and X. Cheng, “Inside out: Two jointly [79] S. Baccianella, A. Esuli, and F. Sebastiani, “SentiWordNet 3.0: An
predictive models for word representations and phrase repre- enhanced lexical resource for sentiment analysis and opinion min-
sentations,” in Proc. 30th AAAI Conf. Artif. Intell., 2016, pp. 2821– ing,” in Proc. Int. Conf. Language Resources Eval., 2010, vol. 10,
2827. [Online]. Available: http://www.aaai.org/Conferences/ pp. 2200–2204. [Online]. Available: http://www.researchgate.net/
AAAI/2016/Papers/15Sun11783.pdf profile/Fabrizio_Sebastiani/publication/220746537_SentiWordNet_
[62] P. Rastogi, B. Van Durme, and R. Arora, “Multiview LSA: Repre- 3.0_An_Enhanced_Lexical_Resource_for_Sentiment_Analysis_and_
sentation learning via generalized CCA,” in Proc. Conf. North Opinion_Mining/links/545fbcc40cf27487b450aa21.pdf
Amer. Chapter Assoc. Comput. Linguistics: Human Language Technol., [80] L. Gatti, M. Guerini, and M. Turchi, “SentiWords: Deriving a high
2015, pp. 556–566. [Online]. Available: http://aclweb.org/ precision and high coverage lexicon for sentiment analysis,” IEEE
anthology/N15-1058 Trans. Affect. Comput., vol. 7, no. 4, pp. 409–421, Oct.–Dec. 2016.
[63] S. L. Hyland, T. Karaletsos, and G. Rtsch, “A generative model [81] D. Tin Vo and Y. Zhang, “Don’t count, predict! An automatic
of words and relationships from multiple sources,” in Proc. 30th approach to learning sentiment lexicons for short text,” presented
AAAI Conf. Artif. Intell., 2016, pp. 2622–2629. [Online]. Available: at the 54th Annu. Meet. Assoc. Comput. Linguistics, Berlin, Germany,
http://www.aaai.org/Conferences/AAAI/2016/Papers/ 2016.
14Hyland12446.pdf
[64] P. D. Turney and P. Pantel, “From frequency to meaning: Vector Minglei Li receives the BE degree in mechanical
space models of semantics,” J. Artif. Intell. Res., vol. 37, pp. 141– engineering, in 2011 and the ME degree in
188, 2010. mechanical and electrical engineering, in 2014
[65] W. Yin and H. Schtze, “Learning word meta-embeddings,” in Proc. from the Huazhong University of Science and
54th Annu. Meeting Assoc. Comput. Linguistics, 2016, pp. 1351–1360. Technology, Wuhan, China. Currently, he is
[Online]. Available: http://aclweb.org/anthology/P16–1128 working toward the PhD degree in the Depart-
[66] T. Pedersen, S. Patwardhan, and J. Michelizzi, “WordNet:: ment of Computing, The Hong Kong Polytechnic
Similarity - measuring the relatedness of concepts,” in Proc. 19th University. His research interests include natural
Nat. Conf. Artif. Intell. 16th Conf. Innovative Appl. Artif. Intell., 2004, language processing, emotion analysis, compu-
pp. 1024–1025. [Online]. Available: http://dl.acm.org/citation. tational linguistic, and applied machine learning.
cfm?id=1614037
[67] D. Lynott and L. Connell, “Modality exclusivity norms for 423
object properties,” Behavior Res. Methods, vol. 41, no. 2, pp. 558– Qin Lu is currently a professor with the Hong
564, 2009. [Online]. Available: http://link.springer.com/article/ Kong Polytechnic University. Her main research
10.3758/BRM.41.2.558 works are in computational linguistics. That is,
[68] D. Lynott and L. Connell, “Modality exclusivity norms for 400 using computational methods to process Chinese
nouns: The relationship between perceptual experience and sur- text, extract useful information, and build Chinese
face word form,” Behavior Res. Methods, vol. 45, no. 2, pp. 516–526, NLP related resources. Her expertise is in lexical
2013. [Online]. Available: http://link.springer.com/article/ semantics, text mining, opinion analysis, and
10.3758/s13428-012-0267-0 knowledge discovery.
[69] M. Brysbaert, A. B. Warriner, and V. Kuperman, “Concreteness
ratings for 40 thousand generally known English word lemmas,”
Behavior Res. Methods, vol. 46, no. 3, pp. 904–911, 2014. [Online].
Available: http://link.springer.com/article/10.3758/s13428-013-
0403-5 Yunfei Long received the double bachelor’s
[70] R. Speer, J. Chin, and C. Havasi, “ConceptNet 5.5: An open multi- degree in both computer science and linguistics
lingual graph of general knowledge,” in Proc. 31st AAAI Conf. from JiLin University, Changchun, China, in 2013
Artif. Intell., 2017, pp. 4444–4451. [Online]. Available: http://aaai. and the Msc degree in cognitive science from the
org/ocs/index.php/AAAI/AAAI17/paper/view/14972 University of Edinburgh, United Kingdom, in 2015.
[71] R. Speer and C. Havasi, “Representing general relational knowl- He is currently working toward the PhD degree in
edge in ConceptNet 5,” in Proc. 8th Int. Conf. Language Resources the Department of Computing, The Hong Kong
Eval., 2012, pp. 3679–3686. [Online]. Available: http://redirect. Polytechnic University. His current research inter-
subscribe.ru/_/-/www.lrec-conf.org/proceedings/lrec2012/ ests include natural language processing, neural
pdf/1072_Paper.pdf network, and social media analysis.
[72] J. Wieting, M. Bansal, K. Gimpel, K. Livescu, and D. Roth, “From
paraphrase database to compositional paraphrase model and
back,” Trans. Assoc. Comput. Linguistics, vol. 3, pp. 345–358, 2015. Lin Gui is a lecturer in the College of Mathemat-
[73] P. Nakov, Z. Kozareva, A. Ritter, S. Rosenthal, V. Stoyanov, and ics and Computer Science, Fuzhou University.
T. Wilson, “SemEval-2013 task 2: Sentiment analysis in Twitter,” He received his BS degree from Nankai Univer-
in Proc. 7th Int. Workshop Semantic Eval., 2013, pp. 312–320.
sity and the MS, PhD degree from Harbin Insti-
[74] M. Hu and B. Liu, “Mining and summarizing customer reviews,”
tute of Technology. His research areas include
in Proc. 10th ACM SIGKDD Int. Conf. Knowl. Discovery Data Min- natural language processing, sentiment analysis,
ing, 2004, pp. 168–177. [Online]. Available: http://doi.acm.org/ emotion computation and machine learning.
10.1145/1014052.1014073
[75] J. Wiebe, T. Wilson, and C. Cardie, “Annotating expressions of
opinions and emotions in language,” Language Resources Eval.,
vol. 39, no. 2/3, pp. 165–210, 2005. [Online]. Available: http://dx.
doi.org/10.1007/s10579-005-7880-9
[76] B. Pang and L. Lee, “Seeing stars: Exploiting class relationships for
sentiment categorization with respect to rating scales,” in Proc.
43rd Annu. Meeting Assoc. Comput. Linguistics, 2005, pp. 115–124.
[Online]. Available: http://acl.ldc.upenn.edu/P/P05/P05-1015.
pdf
[77] R. Socher, et al., “Recursive deep models for semantic composi-
tionality over a sentiment treebank,” in Proc. Conf. Empirical Meth-
ods Natural Language Process., 2013, pp. 1631–1642.
[78] X. Zhu, S. Kiritchenko, and S. M. Mohammad, “NRC-Canada-
2014: Recent improvements in the sentiment analysis of Tweets,”
in Proc. 8th Int. Workshop Semantic Eval., 2014, pp. 443–447.
[Online]. Available: http://www.aclweb.org/anthology/S/S14/
S14-2.pdf#page=463