Twitter Sentiment Analysis Using Natural Language Toolkit and VADER Sentiment
Hutto and Gilbert [6] developed VADER, a simple rule-based model for general sentiment analysis, and compared its effectiveness to 11 typical state-of-the-practice benchmarks, including Affective Norms for English Words (ANEW), Linguistic Inquiry and Word Count (LIWC), the General Inquirer, SentiWordNet, and machine learning-oriented techniques that rely on the Naive Bayes, Maximum Entropy, and Support Vector Machine (SVM) algorithms. The study described the development, validation, and evaluation of VADER. The researchers used a combination of qualitative and quantitative methods to produce and validate a sentiment lexicon for the social media domain. VADER uses a parsimonious rule-based model to assess the sentiment of tweets. The study showed that VADER improved on the benefits of traditional sentiment lexicons, such as LIWC. VADER was differentiated from LIWC because it was more sensitive to sentiment expressions in social media contexts, and it generalized more favorably to other domains.

Mane et al. [8] presented a sentiment analysis using Hadoop, which quickly processes vast amounts of data on a Hadoop cluster in real time. The researchers aimed to determine whether the users expressed a positive or negative opinion. This approach focused on the speed of performing sentiment analysis of real-time Twitter data using Hadoop. The Hadoop platform was designed to solve problems that involve large, unstructured, and complex data. It used a divide-and-conquer method for processing such data. The overall accuracy of the project was assessed by the access time required across the various modules. In the analysis, the code was reported to yield high accuracy. The study used a numeric scale to rate statements into multiple classes, assigning a suitable range to each sentiment. Moreover, the approach could be applied to other sources, such as movie reviews (e.g., IMDB reviews) and personal blogs. Along the same line, Bouazizi and Ohtsuki [9] introduced SENTA, which helps users select, from a wide variety of features, those that best fit the application used to run the classification. The researchers used SENTA to perform multi-class sentiment analysis of texts collected from Twitter. The study was limited to seven different sentiment classes. The results showed that the proposed approach reached an accuracy as high as 60.2% in multi-class classification. This approach was shown to be sufficiently accurate in both binary classification and ternary classification.

III. METHODOLOGY

A. Proposed Method
The current study consists of three phases. Phase one concerns the acquisition of Twitter data. Phase two focuses on the initial preprocessing work carried out to clean and remove irrelevant information from the tweets. Phase three deals with the use of the NLTK's VADER analyzer as well as the scoring method applied to the VADER results to assess its ability to classify tweets on a five-point scale.

As mentioned above, the purpose of the data acquisition phase was to obtain Twitter data. The methods used to extract Twitter data allowed real-time access to publicly available raw tweets. To gather the data, we used Network Overview Discovery and Exploration for Excel (NodeXL) [10]. We collected a total of 2,430 political tweets concerning the 2016 US presidential election, which were published on Twitter's public message board and posted from 22 to 24 November 2016. Also, NodeXL set a limit of at most 2,000 tweets, from which we obtained a reduced data set. In order to collect the most relevant tweets, we used hashtags containing the candidates' names, Hillary and Trump. These names and "Election" were used as keywords to retrieve tweets, such as #Election Day results, #US Election 2016, #Election 2016, #Hillary Clinton, #Donald Trump.

A tweet is a microblog message posted on Twitter. It is limited to 140 characters. Most tweets contain text and embed URLs, pictures, usernames, and emoticons. They also contain misspellings. Hence, a series of preprocessing steps was carried out to remove irrelevant information from the tweets. The reason is that the cleaner the data, the more suitable they are for mining and feature extraction, which leads to improved accuracy of the results. The tweets were also preprocessed to eliminate duplicate tweets and retweets from the dataset, which led to a final sample of 1,415 tweets. Each tweet was processed to extract its main message. To preprocess these data, we used Python's Natural Language Toolkit (NLTK). First, a regular expression (Regex) in Python was run to detect and discard special elements of the tweets, such as URLs ("http://url"), retweet markers (RT), user mentions (@), and unwanted punctuation. Because hashtags (#) often explain the subject of the tweet and contain useful information related to the topic of the tweet, they were kept as part of the tweet, but the "#" symbol was removed.

Next, various functions of NLTK were used to convert the tweets to lowercase, remove stop words (i.e., words that do not express any meaning, such as is, a, the, he, them, etc.), tokenize the tweets into individual words or tokens, and stem the tweets using the Porter stemmer. When the preprocessing steps were complete, the dataset was ready for sentiment classification.

In phase three, the sentiments expressed in the tweets were classified. The VADER Sentiment Analyzer was applied to the dataset. VADER is a lexicon and rule-based sentiment analysis tool attuned to the sentiments expressed in social media [6]. First, we created a sentiment intensity analyzer to categorize our dataset. Then the polarity scores method was used to determine the sentiment. For each preprocessed tweet, the VADER Sentiment Analyzer returned positive, negative, neutral, and compound values. The compound value is a useful metric for measuring the sentiment in a given tweet. In the proposed method, threshold values on the compound value were used to categorize tweets as positive, negative, or neutral. The threshold values used in this study are given in (1):

    Positive sentiment: compound value > 0.001, assign score = 1
    Neutral sentiment: (compound value > -0.001) and (compound value < 0.001), assign score = 0
    Negative sentiment: compound value < -0.001, assign score = -1        (1)

In the current study, a tweet with a compound value greater than the positive threshold (0.001) was considered a positive tweet, and a tweet with a compound value less than the negative threshold (-0.001) was considered a negative tweet. In the remaining cases, the tweet was considered neutral. Next, we defined a scoring rule to determine whether the overall sentiment polarity in each tweet was in one of five classes: highly positive, positive, neutral, negative, and highly negative; refer to (2). In the proposed method, the scoring rule is used to classify tweets into five sentiment classes as follows:

    Test the overall sentiment of the tweet.
    If (score value) = 1:
        Calculate the overall tweet polarity as:
        If (positive value > 0.5) assign tweet polarity = +2
        Else: (positive value < 0.5) assign tweet polarity = +1
    If (score value) = -1:
        Calculate the overall tweet polarity as:
        If (negative value > 0.5) assign tweet polarity = -2
        Else: (negative value < 0.5) assign tweet polarity = -1
    If (score value = 0) assign tweet polarity = 0        (2)

The polarity value gives the overall sentiment polarity of the tweet. The polarity value ranges from -2 (highly negative) to +2 (highly positive). Positive tweets are classified as highly positive or positive depending on the positive value; negative tweets are classified as highly negative or negative depending on the negative value; in other cases, tweets are classified as neutral.

NLTK provides text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning [13].

3) Valence Aware Dictionary and sEntiment Reasoner (VADER)
VADER is a lexicon and rule-based sentiment analysis tool that is specifically attuned to the sentiments expressed in social media. It is an entirely free open-source tool. VADER also takes into consideration word order and degree modifiers [6].

IV. RESULTS AND DISCUSSION

The results of a Twitter sentiment analysis using NLTK and VADER sentiment analysis tools are discussed in this section. Fig. 1 shows the positive, negative, neutral, and compound values of each tweet as obtained by the VADER Sentiment Analyzer.

    [{'compound': -0.1531,
      'neg': 0.164,
      'neu': 0.714,
      'pos': 0.121,
      'tweet': 'presid anti trump tshi 24hr ship small red amazon 2016elect election result im with'},
     {'compound': -0.5859,
      'neg': 0.352,
      'neu': 0.648,
      'pos': 0.0,
      'tweet': 'wtf americain 5 word wtf trump protest trump presid trump train election2016 election result 'elect'},
     {'compound': 0.0,
      'neg': 0.0,
      'neu': 1.0,
      'pos': 0.0,
      'tweet': 'ask unifi behind un repent bigot election result'}]
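
The records above pair a preprocessed tweet with the four VADER values. The following is a minimal sketch, not the authors' code, of how the regex cleaning step and rules (1) and (2) might be written in plain Python. The names `clean_tweet`, `score`, `polarity`, and `STOP_WORDS` are hypothetical, the stop-word list is a tiny stand-in for NLTK's, stemming is omitted, and a positive or negative value of exactly 0.5 (which rule (2) leaves unspecified) is assumed to fall in the weaker class. In practice the input values would come from NLTK's `SentimentIntensityAnalyzer().polarity_scores(text)`; here they are copied from the listing.

```python
import re

# Hypothetical, deliberately tiny stop-word list (the study uses NLTK's list).
STOP_WORDS = {"is", "a", "the", "he", "them"}
THRESHOLD = 0.001  # compound-value threshold from rule (1)


def clean_tweet(text):
    """Regex cleaning, lowercasing, tokenizing, stop-word removal (no stemming)."""
    text = re.sub(r"https?://\S+", " ", text)  # discard URLs
    text = re.sub(r"\bRT\b", " ", text)        # discard the retweet marker
    text = re.sub(r"@\w+", " ", text)          # discard user mentions
    text = text.replace("#", "")               # keep hashtag text, drop "#"
    text = re.sub(r"[^\w\s]", " ", text)       # discard remaining punctuation
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def score(compound):
    """Rule (1): map the compound value to +1, 0, or -1."""
    if compound > THRESHOLD:
        return 1
    if compound < -THRESHOLD:
        return -1
    return 0  # remaining cases are treated as neutral


def polarity(values):
    """Rule (2): refine the score into a five-point polarity from -2 to +2."""
    s = score(values["compound"])
    if s == 1:
        return 2 if values["pos"] > 0.5 else 1    # exactly 0.5 -> +1 (assumption)
    if s == -1:
        return -2 if values["neg"] > 0.5 else -1  # exactly 0.5 -> -1 (assumption)
    return 0


# VADER values copied from the records above (tweet text omitted).
records = [
    {"compound": -0.1531, "neg": 0.164, "neu": 0.714, "pos": 0.121},
    {"compound": -0.5859, "neg": 0.352, "neu": 0.648, "pos": 0.0},
    {"compound": 0.0, "neg": 0.0, "neu": 1.0, "pos": 0.0},
]
print([polarity(r) for r in records])  # [-1, -1, 0]
print(clean_tweet("RT @user: The #Election2016 result is in! http://url"))
```

Keeping rule (1) and rule (2) as separate functions mirrors the two-step description in the text: the compound value alone decides the sign, and the positive or negative proportion only decides the strength.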
TABLE I
THE TWEETS CLASSIFICATION

         tweet                                                      score
    1    wtf americain 5 word wtf trump protest trump presid...     -1
    2    ask unifi behind unrepent bigot election result…            0
    3    trump say support practic big otri need stop via…           1

TABLE II
OVERALL SENTIMENT POLARITY FOR EVERY TWEET

    Tweets                                                  label   Polarity
    first ignor laugh fight win gandhi everi true           +2      Highly Positive
    When ev sad sorrow come watch conanobrien elect...      -2      Highly Negative
    Unit state hay election 2016 election result ima...      0      Neutral
    tom obama pardon Clinton way president trump el...      +1      Positive
    play dead soon cat election result                      -1      Negative

Fig. 3. Sentiment Polarity Percentage for Tweets in each Category (horizontal axis: tweets category; vertical axis: percentage of tweets)

Fig. 5 shows the frequency distributions of positive, negative, and neutral words, respectively. Here we examine the pattern of words; the plots show that most words occur only rarely.

[Figure: vertical axis "Numbers of Tweets"]

TABLE III
POLARITY COUNT FOR EACH CLASS

    Polarity          label   Count   Percentage (%)
    Highly positive   +2       34      2.402827
    Positive          +1      375     26.501767
    Neutral            0      661     46.713781
    Negative          -1      324     22.897527
    Highly negative   -2       21      1.484099
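
The percentages in Table III can be cross-checked against the counts and the final sample of 1,415 tweets. The short sketch below is illustrative only; the dictionary `counts` simply restates the table, keyed by polarity label.

```python
# Counts per polarity label, restated from Table III.
counts = {2: 34, 1: 375, 0: 661, -1: 324, -2: 21}

total = sum(counts.values())  # should equal the final sample size
percentages = {label: 100 * n / total for label, n in counts.items()}

print(total)
for label, pct in percentages.items():
    print(f"{label:+d}: {pct:.6f}")
```

The counts sum to 1,415, which matches the sample size reported after duplicate removal, and each percentage is the class count divided by that total.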
REFERENCES
[1] B. J. Jansen, M. Zhang, K. Sobel, and A. Chowdury, “Twitter power:
Tweets as electronic word of mouth,” J. Am. Soc. Inf. Sci. Technol.,
vol. 60, no. 11, pp. 2169–2188, 2009.
[2] V. Kharde and P. Sonawane, “Sentiment analysis of twitter data: a
survey of techniques,” arXiv preprint arXiv:1601.06971, 2016.
[3] P. Selvaperumal and D. A. Suruliandi, “A Short Message
Classification Algorithm for Tweet Classification,” Int. Conf. Recent
Trends Inf. Technol., pp. 1–3, 2014.
[4] T. Singh and M. Kumari, “Role of Text Pre-processing in Twitter
Sentiment Analysis,” Procedia Comput. Sci., vol. 89, pp. 549–554,
2016.
[5] Y. R. Tausczik and J. W. Pennebaker, “The psychological meaning of
words: LIWC and computerized text analysis methods,” J. Lang. Soc.
Psychol., vol. 29, no. 1, pp. 24–54, 2010.
[6] C. J. Hutto and E. Gilbert, “VADER: A parsimonious rule-based model for
sentiment analysis of social media text,” in Eighth International
Conference on Weblogs and Social Media (ICWSM-14), 2014. Available at
http://comp.social.gatech.edu/papers/icwsm14.vader.hutto.pdf
(20/04/16).
[7] B. Wagh, J. V Shinde, and P. A. Kale, “A Twitter Sentiment Analysis
Using NLTK and Machine Learning Techniques,” Int. J. Emerg. Res.
Manag. Technol., vol. 6, no. 12, pp. 37–44, 2018.
[8] S. B. Mane, Y. Sawant, S. Kazi, and V. Shinde, “Real Time
Sentiment Analysis of Twitter Data Using Hadoop,” Int. J. Comput.
Sci. Inf. Technol., vol. 5, no. 3, pp. 3098–3100, 2014.
[9] M. Bouazizi and T. Ohtsuki, “A Pattern-Based Approach for Multi-
Class Sentiment Analysis in Twitter,” IEEE Access, vol. 3536, no. c,
pp. 1–21, 2017.
[10] M. A. Smith et al., “Analyzing (social media) networks with
NodeXL,” in Proceedings of the fourth international conference on
Communities and technologies, 2009, pp. 255–264.
[11] D. L. Hansen, B. Shneiderman, and M. A. Smith, Analyzing social
media networks with NodeXL: Insights from a connected world.
Morgan Kaufmann, 2010.
[12] S. Bird, E. Klein, and E. Loper, Natural language processing with
Python: analyzing text with the natural language toolkit. O’Reilly
Media, Inc., 2009.
[13] Natural Language Toolkit http://www.nltk.org/ (Date Last Accessed,
November 20, 2018).
V. CONCLUSION
In this study, the NLTK and the VADER analyzer were
applied to conduct a sentiment analysis of Twitter data and
to categorize tweets according to a multi-classification
system. The case study was the 2016 US presidential
election. The results indicated that the VADER Sentiment
Analyzer was an effective choice for sentiment analysis
classification using Twitter data. VADER easily and quickly
classified huge amounts of data. However, the present study
has the following limitations. First, a small volume of data
was used. Second, a general lexicon was used to categorize
domain-specific data. Third, no model was trained on the data.
In future work, we will improve our system by using larger
volumes of data, a domain-specific lexicon, and a training
corpus to obtain better results.