DIGITAL ASSIGNMENT-1

LITERATURE REVIEW ON TWITTER SENTIMENT ANALYSIS

Name : G.TIRUMALA

Reg No : 16BCE0202

1)
TITLE OF THE PAPER : Twitter Sentiment Analysis

AUTHORS : Aliza Sarlan, Chayanit Nadam, Shuib Basri

TITLE : Twitter Sentiment Analysis

JOURNAL : 2014 International Conference on Information Technology and Multimedia (ICIMU),
November 18 – 20, 2014, Putrajaya, Malaysia

ABSTRACT :

Social media have received more attention nowadays. Public and private opinions about a wide variety
of subjects are expressed and spread continually via numerous social media. Twitter is one of the
social media that is gaining popularity. Twitter offers organizations a fast and effective way to
analyze customers’ perspectives, which are critical to success in the marketplace. Developing a
program for sentiment analysis is one approach to computationally measuring customers’
perceptions. This paper reports on the design of a sentiment analysis program that extracts a vast
number of tweets. Prototyping is used in this development. The results classify customers’
perspectives, as expressed in tweets, into positive and negative, and are presented in a pie chart and
an HTML page. The program was planned as a web application, but because of a limitation of Django,
which works on a Linux server or LAMP, this is left for further work.

Keywords :

Twitter, sentiment, opinion mining, social media, natural language processing

INTRODUCTION

Millions of people use social network sites to express their emotions and opinions and to share details
of their daily lives. People write about anything, such as social activities or comments on
products. Online communities provide an interactive forum where consumers inform and
influence others. Moreover, social media give businesses a platform to
connect with their customers, for example to advertise or to speak directly to customers and
learn their perspective on products and services.

Problem Statement :

Despite the availability of software to extract data regarding a person’s sentiment on a specific
product or service, organizations and other data workers still face issues regarding the data extraction.

METHODOLOGY

This project has been divided into 2 phases. First, a literature study is conducted, followed by system
development. The literature study involves studying the various sentiment analysis techniques
and methods currently in use. In phase 2, application requirements and functionalities are defined
prior to development. The architecture and interface design of the program, and how it will
interact, are also identified. In developing the Twitter Sentiment Analysis application, several tools are
utilized, such as Python Shell 2.7.2 and Notepad.

LITERATURE REVIEW

A. Opinion Mining
Opinion mining refers to the broad area of natural language processing, text mining and
computational linguistics that involves the computational study of sentiments, opinions
and emotions expressed in text. A view or attitude based on emotion instead of
reason is often colloquially referred to as a sentiment, hence the use of sentiment analysis
as an equivalent term for opinion mining.

B. Twitter
Twitter is a popular real-time microblogging service that allows users to share short
messages known as tweets, which are limited to 140 characters. Users write tweets to
express their opinion about various topics relating to their daily lives. Twitter is an ideal
platform for the extraction of general public opinion on specific issues.

C. Twitter Sentiment Analysis

Sentiment can be found in comments or tweets and provides useful indicators for many
different purposes. A sentiment can be categorized into two groups:
negative and positive words. Sentiment analysis is a natural language processing technique
used to quantify an expressed opinion or sentiment within a selection of tweets.

REFERENCES

1. M. Rambocas and J. Gama, “Marketing Research: The Role of Sentiment Analysis”. The 5th SNA-KDD
Workshop’11. University of Porto, 2013.

2. A. K. Jose, N. Bhatia, and S. Krishna, “Twitter Sentiment Analysis”. National Institute of
Technology Calicut, 2010.

3. P. Lai, “Extracting Strong Sentiment Trend from Twitter”. Stanford University, 2012.

4. Y. Zhou and Y. Fan, “A Sociolinguistic Study of American Slang,” Theory and Practice in Language
Studies, 3(12), 2209–2213, 2013. doi:10.4304/tpls.3.12.2209-2213

2)
TITLE OF THE PAPER : Sentiment Analysis of Twitter Data Using Text Mining and Hybrid
Classification Approach

AUTHORS : Shubham Goyal

TITLE : Sentiment Analysis of Twitter Data Using Text Mining and Hybrid Classification Approach

JOURNAL : International Journal of Engineering Development and Research (www.ijedr.org)

ABSTRACT :

In sentiment analysis, natural language processing and information extraction are used to identify the
sentiment in a writer’s comments or reviews. In this paper we use text mining and a hybrid approach
combining the KNN algorithm and the Naïve Bayes algorithm to find the sentiments of Indian people on Twitter.

Keywords:

Sentiment Analysis, Text Mining

INTRODUCTION

Human life is filled with emotions and opinions. People love to share their emotions and opinions
everywhere, and social media is one of the most common and easiest ways to share our feelings. Today
people not only comment on existing information, bookmark pages and provide ratings, but they
also share their ideas, news and knowledge with the community at large. In this way, the entire
community becomes a writer, in addition to being a reader.

LITERATURE SURVEY

Ortigosa et al. proposed a novel method for sentiment analysis on the social networking giant Facebook
that, starting from the messages written by its users, supports: (i) extracting useful information about
the Facebook users’ sentiment polarity (whether it is positive, neutral or negative), as reflected
in the messages written by users; and (ii) modeling the users’ usual sentiment polarity and
detecting significant emotional changes in users.

Pak et al. used a corpus to build a sentiment classifier that is
capable of determining positive, neutral and negative sentiments for a whole document.
Experimental results show that the proposed techniques are more efficient and perform better
than previously proposed techniques.

METHODOLOGY

KNN Algorithm

KNN is a type of instance-based learning, or lazy learning. In this type of learning the function is only
approximated locally and all computation is deferred until classification. It is among the simplest of all
machine learning algorithms. In KNN classification, the output is a class membership. An object is
classified by a majority vote of its neighbors, with the object being assigned to the class most common
among its k nearest neighbors (k is a small positive integer). The nearest neighbors are determined using
a similarity measure; usually distance functions are used.
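
The majority-vote rule described above can be illustrated with a minimal sketch. It is not the code used in the reviewed paper: the Euclidean distance, the two-dimensional feature vectors and the names `knn_classify` and `train` are assumptions made only for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train, query, k=3):
    """Return the label most common among the k training items nearest to query.

    train is a list of (vector, label) pairs. No model is built in advance:
    all work happens at classification time, which is why KNN is called lazy.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (count of positive words, count of negative words).
train = [
    ((3, 0), "positive"),
    ((2, 1), "positive"),
    ((0, 3), "negative"),
    ((1, 2), "negative"),
    ((0, 2), "negative"),
]

print(knn_classify(train, (3, 1), k=3))  # prints: positive
```

With k=3 the query (3, 1) has two positive neighbours and one negative neighbour, so the vote is positive. A larger k smooths the decision but makes it depend on more distant items.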
Naïve Bayes Algorithm

The algorithm is named after the statistician Thomas Bayes, who proposed Bayes' theorem.
The classifier assumes that all attributes are conditionally independent of each other given the class.
In this algorithm, the conditional probability of each attribute with respect to a given class label is calculated.
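
As a rough illustration of the per-class conditional probabilities described above, the following sketch trains a word-count Naïve Bayes classifier. The add-one (Laplace) smoothing, the log-probabilities and the names `train_nb`, `predict_nb` and `docs` are assumptions for this example and are not taken from the reviewed paper.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Count documents per class and word occurrences per class.

    docs is a list of (tokens, label) pairs.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log P(class)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            # log P(word | class) with add-one smoothing (an assumption here);
            # treating words independently is the "naive" assumption.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set.
docs = [
    ("good great phone".split(), "positive"),
    ("love this great battery".split(), "positive"),
    ("bad poor screen".split(), "negative"),
    ("hate this poor battery".split(), "negative"),
]
model = train_nb(docs)
print(predict_nb(model, "great phone".split()))  # prints: positive
```

Summing logarithms instead of multiplying probabilities avoids numeric underflow on longer texts and does not change which class wins.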

REFERENCES

1. Scholar, P. G. "Big-SoSA: Social Sentiment Analysis and Data Visualization on Big Data."

2. Ortigosa, Alvaro, José M. Martín, and Rosa M. Carro. "Sentiment analysis in Facebook and its
application to elearning." Computers in Human Behavior 31 (2014): 527-541.

3)
TITLE OF THE PAPER : SENTIMENT ANALYSIS ON TWITTER DATA

AUTHORS : Monika Malhotra, Onam Bharti

TITLE : SENTIMENT ANALYSIS ON TWITTER DATA

JOURNAL : Onam Bharti et al, International Journal of Computer Science and Mobile Computing,
Vol.5 Issue.6, June- 2016.

ABSTRACT:

Sentiment analysis is a type of natural language processing for tracking the mood of the public about a
particular product or topic. Sentiment analysis, which is also called opinion mining, involves
building a system to collect and examine opinions about the product made in blog posts, comments,
reviews or tweets. Sentiment analysis can be useful in several ways. In fact, it has spread from
computer science to management sciences and social sciences due to its importance to business and
society as a whole. In recent years, industrial activities surrounding sentiment analysis have also
thrived. Numerous startups have emerged. Many large corporations have built their own in-house
capabilities. Sentiment analysis systems have found their applications in almost every business and
social domain. The goal of this report is to give an introduction to this fascinating problem and to
present a framework which will perform sentiment analysis on online mobile phone reviews by
combining a modified K-means algorithm with Naïve Bayes classification and KNN.

Major tasks in NLP

The following is a list of some of the most commonly researched tasks in NLP. Note that some of
these tasks have direct real-world applications, while others more commonly serve as subtasks that are
used to aid in solving larger tasks. What distinguishes these tasks from other potential and actual NLP
tasks is not only the volume of research devoted to them but the fact that for each one there is
typically a well-defined problem setting, a standard metric for evaluating the task, standard corpora on
which the task can be evaluated, and competitions devoted to the specific task.

Modified approach K-mean algorithm:


The K-means algorithm is a popular clustering algorithm and has applications in data mining, image
segmentation, bioinformatics and many other fields. This algorithm works well with small datasets. In
this paper we proposed an algorithm that works well with large datasets. The modified K-means algorithm
to some degree avoids getting trapped in a locally optimal solution, and reduces the adoption of the
cluster-error criterion.

Algorithm:

Modified approach (S, k), S={x1,x2,…,xn }

Input: The number of clusters k1( k1> k ) and a dataset containing n objects(Xij+).

Output: A set of k clusters (Cij) that minimize the Cluster - error criterion.

Algorithm

1. Compute the distance between each data point and all other data points in the set D.

2. Find the closest pair of data points in the set D and form a data-point set Ap (1 <= p <= k+1)
which contains these two data points. Delete these two data points from the set D.

3. Find the data point in D that is closest to the data-point set Ap. Add it to Ap and delete it from D.

4. Repeat step 3 until the number of data points in Ap reaches (n/k).

5. If p < k+1, then p = p+1. Find another pair of data points in D between which the distance is the
shortest, form another data-point set Ap and delete them from D. Go to step 4.
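
The grouping steps above can be sketched as follows. This is one reading of the listed steps, not the authors' implementation. Three points are assumptions: "closest to the data-point set" is taken as the smallest distance to any member of the set, at most k groups are formed, and each group is filled to n/k points (integer division). The names `initial_groups` and `points` are hypothetical.

```python
import math
from itertools import combinations


def initial_groups(points, k):
    """Build up to k initial groups by repeatedly growing the closest pair.

    Returns (groups, leftover), where leftover holds any unassigned points.
    """
    remaining = list(points)
    target = max(2, len(points) // k)  # group size n/k from step 4
    groups = []
    while len(groups) < k and len(remaining) >= 2:
        # Steps 2 and 5: take the closest pair still in D as a new set Ap.
        a, b = min(combinations(remaining, 2), key=lambda pair: math.dist(*pair))
        group = [a, b]
        remaining.remove(a)
        remaining.remove(b)
        # Steps 3 and 4: keep adding the point nearest to the set.
        while len(group) < target and remaining:
            nearest = min(
                remaining,
                key=lambda p: min(math.dist(p, g) for g in group),
            )
            group.append(nearest)
            remaining.remove(nearest)
        groups.append(group)
    return groups, remaining


# Hypothetical two-dimensional data with two obvious clumps.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
groups, leftover = initial_groups(points, k=2)
print(groups)
print(leftover)
```

The means of these groups could then serve as starting centres for ordinary K-means, which is one common way such an initialization is used.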

REFERENCES

[1] G.Vinodhini and RM.Chandrasekaran, “Sentiment Analysis and Opinion Mining: A Survey”,
Volume 2, Issue 6, June 2012 ISSN: 2277 128X International Journal of Advanced Research in
Computer Science and Software Engineering

[2] Zhongwu Zhai, Bing Liu, Hua Xu and Hua Xu, “Clustering Product Features for Opinion
Mining”, WSDM’11, February 9–12, 2011, Hong Kong, China. Copyright 2011 ACM 978-1-4503-
0493- 1/11/02...$10.00

4)
TITLE OF THE PAPER : A Study on Sentiment Analysis Techniques of Twitter Data

AUTHORS : Abdullah Alsaeedi, Mohammad Zubair Khan

TITLE : A Study on Sentiment Analysis Techniques of Twitter Data.

JOURNAL : (IJACSA) International Journal of Advanced Computer Science and Applications, Vol.
10, No. 2, 2019

ABSTRACT

The entire world is transforming quickly under the present innovations. The Internet has become a
basic requirement for everybody with the Web being utilized in every field. With the rapid increase in
social network applications, people are using these platforms to voice their opinions with regard
to daily issues. Gathering and analyzing peoples’ reactions toward buying a product, public services,
and so on are vital. Sentiment analysis (or opinion mining) is a common dialogue preparing task that
aims to discover the sentiments behind opinions in texts on varying subjects. In recent years,
researchers in the field of sentiment analysis have been concerned with analyzing opinions on
different topics such as movies, commercial products, and daily societal issues. Twitter is an
enormously popular microblog on which clients may voice their opinions. Opinion investigation of
Twitter data is a field that has been given much attention over the last decade and involves dissecting
“tweets” (comments) and the content of these expressions. As such, this paper explores the various
sentiment analysis techniques applied to Twitter data and their outcomes.

Keywords

Twitter; sentiment; Web data; text mining; SVM; Bayesian algorithm; hybrid; ensembles

IMPORTANCE AND BACKGROUND

Opinions are fundamental to every single human action since they are key influencers of our
practices. At whatever point we have to settle on a choice, we need to know others' thoughts. In
reality, organizations and associations dependably need to discover users’ popular sentiments about
their items and services. Clients use different types of online platforms for social engagement
including web-based social networking sites; for example, Facebook and Twitter. Through these
web-based social networks, buyer engagement happens modern order strategies.

A. Naive Bayes

The Naive Bayes classifier is widely used in the task of classifying texts into multiple classes and was
recently utilized for sentiment analysis classification.

B. Maximum Entropy

The Maximum Entropy (MaxEnt) classifier estimates the conditional distribution of a class label a
given a record b, using an exponential-family form with one weight for every constraint. The
model with maximum entropy is the one in this parametric family that maximizes the likelihood.
Numerical methods such as iterative scaling and quasi-Newton optimization are usually employed to
solve the optimization problem.
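
For reference, the exponential form usually meant by this description can be written as below. This is the standard textbook formulation rather than a formula quoted from the reviewed paper. The symbols are assumptions introduced here: f_i(a, b) are feature functions (the constraints) and \lambda_i are their weights. The denominator normalizes over all candidate classes a'.

```latex
P_{\lambda}(a \mid b) =
  \frac{\exp\left(\sum_{i} \lambda_i f_i(a, b)\right)}
       {\sum_{a'} \exp\left(\sum_{i} \lambda_i f_i(a', b)\right)}
```

Training then amounts to choosing the weights \lambda_i that maximize the likelihood of the labeled records.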

C. Support Vector Machine

The support vector machine (SVM) is known to perform well in sentiment analysis. SVM
analyzes the data, defines decision boundaries and uses kernels for the computation,
which is performed in the input space. The input data are presented as two sets of
vectors, each of size m, and each datum (expressed as a vector) is assigned to a class. Next,
the machine identifies the boundary between the two classes that is as far as possible from any point
in the training samples. This distance defines the classifier margin; maximizing the margin reduces
ambiguous decisions. The SVM has been shown to perform more effectively than the Naïve
Bayes classifier in various text classification problems.

CLASSIFICATION TECHNIQUES

In the machine learning field, classification methods have been developed which use different
strategies to classify unlabeled data. Classifiers may require training data. Examples of
machine learning classifiers are Naive Bayes, Maximum Entropy and Support Vector Machine.
These are categorized as supervised machine learning methods because they require training data. It is
important to mention that training a classifier effectively will make future predictions easier.

REFERENCES

[1] R. Xia, C. Zong, and S. Li, "Ensemble of feature sets and classification algorithms for sentiment
classification," Information Sciences, vol. 181, no. 6, pp. 1138-1152, 2011.

[2] R. Sharma, S. Nigam, and R. Jain, "Opinion mining of movie reviews at document level," arXiv
preprint arXiv:1408.3829, 2014.

[3] R. Sharma, S. Nigam, and R. Jain, "Polarity detection at sentence level," International Journal of
Computer Applications, vol. 86, no. 11, 2014.

5)
TITLE OF THE PAPER : Study of Twitter Sentiment Analysis using Machine Learning Algorithms
on Python

AUTHORS : Bhumika Gupta , Monika Negi, Kanika Vishwakarma, Goldi Rawat, Priyanka Badhani

TITLE : Study of Twitter Sentiment Analysis using Machine Learning Algorithms on Python

JOURNAL : International Journal of Computer Applications (0975 – 8887) Volume 165 – No.9,
May 2017

ABSTRACT

Twitter is a platform widely used by people to express their opinions and display sentiments on
different occasions. Sentiment analysis is an approach to analyze data and retrieve sentiment that it
embodies. Twitter sentiment analysis is an application of sentiment analysis on data from Twitter
(tweets), in order to extract sentiments conveyed by the user. In the past decades, the research in this
field has consistently grown. The reason behind this is the challenging format of the tweets which
makes the processing difficult. The tweet format is very small which generates a whole new
dimension of problems like use of slang, abbreviations etc. In this paper, we aim to review some
papers regarding research in sentiment analysis on Twitter, describing the methodologies adopted and
models applied, along with describing a generalized Python based approach.

Keywords

Sentiment analysis, Machine Learning, Natural Language Processing, Python.


1. INTRODUCTION

Twitter has emerged as a major micro-blogging website, having over 100 million users generating
over 500 million tweets every day. With such large audience, Twitter has consistently attracted users
to convey their opinions and perspective about any issue, brand, company or any other topic of
interest. Due to this reason, Twitter is used as an informative source by many organizations,
institutions and companies.

METHODOLOGY

In order to perform sentiment analysis, we are required to collect data from the desired source (here
Twitter). This data undergoes various steps of pre-processing which make it more machine-readable
than its previous form.

Tweet Collection

Tweet collection involves gathering relevant tweets about the particular area of interest. The tweets
are collected using Twitter’s streaming API or any other mining tool (for example, WEKA), for the
desired time period of analysis. The format of the retrieved text is converted

Pre-processing of tweets

The preprocessing of the data is a very important step as it determines the efficiency of the steps
further down the line. It involves syntactical correction of the tweets as desired. The steps involved
should aim to make the data more machine-readable in order to reduce ambiguity in feature extraction.
Below are a few steps used for pre-processing of tweets:

• Removal of re-tweets.

• Converting upper case to lower case: If the analysis is case sensitive, two occurrences of the
same word may be treated as different words because of their case. For an effective
analysis it is important not to feed such spurious differences to the model.

• Stop word removal: Stop words that don’t affect the meaning of the tweet are removed (for example
and, or, still etc.). One reviewed approach uses the WEKA machine learning package for this purpose,
which checks each word from the text against a dictionary.

• Twitter feature removal: User names and URLs are not important from the perspective of future
processing, hence their presence is futile. All usernames and URLs are converted to generic tags or
removed.
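
The pre-processing steps listed above can be sketched in a few lines of Python. This is an illustrative sketch, not the pipeline of any reviewed paper. Several details are assumptions: re-tweets are detected by a leading "RT @", the stop-word set is a tiny hand-made sample, the generic tags are `URL` and `USER`, and the names `preprocess` and `tweets` are hypothetical.

```python
import re

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"and", "or", "still", "the", "a", "is"}


def preprocess(tweets):
    """Apply the listed steps and return one token list per kept tweet."""
    cleaned = []
    for text in tweets:
        # Removal of re-tweets (assumed to start with "RT @").
        if text.startswith("RT @"):
            continue
        # Converting upper case to lower case.
        text = text.lower()
        # Twitter feature removal: replace URLs and user names with generic tags.
        text = re.sub(r"https?://\S+", "URL", text)
        text = re.sub(r"@\w+", "USER", text)
        # Split into word tokens, then drop stop words.
        tokens = re.findall(r"[A-Za-z0-9']+", text)
        cleaned.append([t for t in tokens if t not in STOP_WORDS])
    return cleaned


tweets = [
    "RT @bob: Great phone!",
    "@alice The battery is GREAT and cheap http://t.co/xyz",
    "Still bad or worse",
]
cleaned = preprocess(tweets)
print(cleaned)
```

The tags are inserted after lower-casing so that they stay in upper case and cannot be confused with ordinary words.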

SENTIMENT CLASSIFIERS

• Bayesian logistic regression:

It selects features and provides optimization for performing text categorization. It uses a Laplace
prior to avoid overfitting and produces sparse predictive models for text data.

• Naïve Bayes:
It is a probabilistic classifier with a strong conditional independence assumption that is reported to be
optimal for classifying classes with highly dependent features. Membership in the sentiment classes is
calculated using Bayes' theorem.

• Support Vector Machine Algorithm:

Support vector machines are supervised models with associated learning algorithms that analyze
data used for classification and regression analysis [6], [9]. They make use of the concept of decision
planes that define decision boundaries.

Feature Extraction

A feature is a piece of information that can be used as a characteristic which can assist in solving
a problem (like prediction [11]). The quality and quantity of features are very important because they
determine the results generated by the selected model. Feature extraction is the selection of useful
words from tweets.

• Unigram features – one word is considered at a time and it is decided whether it is capable of being
a feature.

• N-gram features – more than one word is considered at a time.

• External lexicon – use of a list of words with predefined positive or negative sentiment.

Various methodologies for extracting features are available in the present day. Term Frequency-
Inverse Document Frequency (TF-IDF) is an efficient approach. TF-IDF is a numerical statistic that
reflects how important a word is to a document (here, a tweet) within the whole collection.
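
The feature types listed above and the TF-IDF weighting can be illustrated with a short sketch. It makes several assumptions that are not from the reviewed paper: term frequency is the count divided by the tweet length, inverse document frequency is log(N / df) without smoothing, the lexicon is a tiny hand-made sample, and the names `ngrams`, `lexicon_score` and `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens (n=1 gives unigrams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical mini-lexicon with predefined sentiment.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def lexicon_score(tokens):
    """Number of positive lexicon words minus number of negative ones."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    df = Counter()  # number of documents containing each term
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


tweets = [["good", "phone"], ["bad", "phone"]]
weights = tf_idf(tweets)
print(ngrams(["not", "very", "good"], 2))
print(lexicon_score(["great", "phone", "poor", "battery", "love"]))
print(weights[0])
```

In this toy collection "phone" occurs in every tweet, so its weight is zero, while "good" and "bad" each occur in one tweet and receive a positive weight. This is the sense in which TF-IDF favours words that distinguish one tweet from the others.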

REFERENCES

[1] David Zimbra, M. Ghiassi and Sean Lee, “Brand-Related Twitter Sentiment Analysis using
Feature Engineering and the Dynamic Architecture for Artificial Neural Networks”, IEEE 1530-
1605, 2016.

[2] Varsha Sahayak, Vijaya Shete and Apashabi Pathan, “Sentiment Analysis on Twitter Data”,
(IJIRAE) ISSN: 2349-2163, January 2015.

[3] Peiman Barnaghi, John G. Breslin and Parsa Ghaffari, “Opinion Mining and Sentiment
Polarity on Twitter and Correlation between Events and Sentiment”, 2016 IEEE Second
International Conference on Big Data Computing Service and Applications.

[4] Mondher Bouazizi and Tomoaki Ohtsuki, “Sentiment Analysis: from Binary to Multi-Class
Classification”, IEEE ICC 2016 SAC Social Networking, ISBN 978-1- 4799-6664-6.
