MP 1
LEARNING
FOR EMOTION DETECTION
AND OPINION MINING
By:
Navuru Sahithya - 160122737153
Sai Tejaswi Edara - 160122737159
Snikitha Govindu - 160122737161
Project Guide: Mr. B. Harish Goud
Agenda
1. Abstract
2. Introduction
3. Literature Survey
4. Problem Statement
5. Methodology
6. Results
7. Conclusion
8. References
Abstract
• Sentiment analysis is a branch of natural language processing that
focuses on identifying and analyzing emotions and opinions in text. Its
main goal is to determine whether expressions are positive, negative, or
neutral.
• This field is vital for understanding subjective opinions from sources like
social media, blogs, and reviews, acting as a gauge for public sentiment.
By using machine learning algorithms, sentiment analysis processes text
autonomously, eliminating the need for manual evaluation.
• However, it faces challenges such as language nuances, cultural
differences, and evolving sentiment expressions, requiring continuous
improvement to remain effective.
Introduction
• Sentiment Analysis (SA) in NLP is vital for gauging attitudes and emotions in
text. It uses machine learning to automate the analysis and extract insights
about sentiment towards particular entities or products. By training models
on annotated examples of emotional text, SA algorithms learn to discern
sentiment while accounting, as far as possible, for context, sarcasm, and
nuanced language.
Methodology
• Start: The process begins with the goal of understanding and
categorizing the sentiment expressed in textual data, which forms the
foundation for every later step.
Methodology
• Sentiment Identification and Classification using Learned Classifiers: Our
sentiment analysis method focuses on training machine learning
classifiers, such as Support Vector Machines, Naive Bayes, and neural
networks. These classifiers learn patterns from preprocessed data, which
lets them predict the sentiment of new text samples.
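As a rough illustration of how a classifier learns sentiment patterns from labelled text, the sketch below implements a small multinomial Naive Bayes model using only the Python standard library. It is not the project's code: the function names, the whitespace tokenizer, and the toy training sentences are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        tokens = text.lower().split()  # assumption: simple whitespace tokens
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        # Log prior plus Laplace-smoothed log likelihood of each known token.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen during training
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, invented for illustration only.
TRAINING = [
    ("great sound and great value", "positive"),
    ("love this guitar", "positive"),
    ("terrible quality broke quickly", "negative"),
    ("bad strings and bad tuning", "negative"),
]

model = train_naive_bayes(TRAINING)
prediction = predict(model, "great guitar")
```

With this toy data, "great guitar" is scored as positive because both words occur only in the positive examples. A real model would be trained on the preprocessed review dataset rather than four sentences.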
Results
- Initial observation: Graphical data visualization showed that positive
sentiments predominated over negative and neutral ones.
- N-Gram analysis: Examined the unigrams, bigrams, and trigrams associated with
each sentiment category to better understand textual nuances and improve on
the initial results.
- Dataset preparation: Converted sentiment labels into numerical values and used
a TF-IDF vectorizer for feature extraction, ahead of balancing the dataset.
- Resampling: The imbalanced data was resampled and then split 75/25 into
training and test sets.
- K-Fold Cross Validation: Applied to the pre-resampled dataset to validate the
results thoroughly and to benefit from its robustness against data imbalance.
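The n-gram, resampling, splitting, and cross-validation steps listed above can be sketched with the standard library alone. This is an illustrative outline under stated assumptions, not the project's pipeline: the helper names, the random oversampling strategy, the toy reviews, and the use of 4 folds (the slides report 10-fold validation on the real data) are all choices made for the example.

```python
import random
from collections import Counter


def ngrams(text, n):
    """Return the list of word n-grams in a text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def oversample(rows, rng):
    """Randomly duplicate minority-class rows until all classes are equal in size."""
    by_label = {}
    for row in rows:
        by_label.setdefault(row[1], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def train_test_split(rows, test_fraction, rng):
    """Shuffle a copy of the rows and hold out test_fraction of them for testing."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) - int(round(len(shuffled) * test_fraction))
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n_rows, k):
    """Yield (train_indices, validation_indices) for k contiguous, unshuffled folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, validation
        start += size


# Hypothetical toy reviews, invented for illustration only.
ROWS = [
    ("great sound quality", "positive"),
    ("love this guitar", "positive"),
    ("great value for money", "positive"),
    ("works great", "positive"),
    ("strings broke quickly", "negative"),
    ("poor build quality", "negative"),
    ("it is a cable", "neutral"),
    ("arrived on monday", "neutral"),
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
positive_bigrams = Counter(
    gram for text, label in ROWS if label == "positive" for gram in ngrams(text, 2)
)
balanced = oversample(ROWS, rng)                       # every class now has 4 rows
train_rows, test_rows = train_test_split(balanced, 0.25, rng)  # 75/25 split
folds = list(k_fold_indices(len(balanced), 4))         # 4 folds for the toy data
```

One design point worth keeping in mind: oversampling before the split can place copies of the same review in both the training and test sets, so resampling only the training portion is a common alternative.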
Results
- Core phase: Developed sentiment analysis models using machine learning
techniques.
- Tested models: K-Nearest Neighbors (KNN), Support Vector Classifier (SVC),
Logistic Regression, Random Forest, and Decision Tree Classifiers.
- Evaluation metrics: Test accuracy, precision, and recall.
- Results: Logistic Regression achieved the highest accuracy under 10-Fold
Cross Validation, slightly ahead of SVC.
- Fine-tuning: Optimized Logistic Regression model using Grid Search for
optimal hyperparameters.
- Training set accuracy: 94.80%.
- Test set accuracy: 95.21%.
- F1 Score: Achieved 95% across all sentiment categories.
- Conclusion: The model proved effective, supported by a comprehensive
approach that ran from data visualization through rigorous testing and tuning.
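To make the evaluation metrics above concrete, the following sketch computes accuracy and per-class precision, recall, and F1 from a list of true labels and a list of predictions. The function name and the example labels are hypothetical and chosen only for illustration; they are not the project's results.

```python
def classification_report(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    report = {"accuracy": correct / len(y_true), "per_class": {}}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        report["per_class"][label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return report


# Hypothetical labels and predictions, invented for illustration only.
y_true = ["positive", "positive", "negative", "neutral", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "neutral", "positive", "positive"]
report = classification_report(y_true, y_pred)
```

Reporting F1 per class, as the slide does, helps show whether a high overall accuracy hides weak performance on a minority sentiment.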
Conclusion
- Project overview: Comprehensive process of managing, processing, and analyzing vast
textual data to categorize sentiments across digital platforms.
- Utility: Illustrates broad applicability in market research, brand monitoring, social
media analysis, and political discourse.
- TF-IDF method: Chosen for nuanced analysis because it weights each term by its
significance within a document relative to the rest of the corpus.
- Decision factors: Size of dataset, required accuracy level, and available resources
guided the choice of TF-IDF.
- Insights: Provide stakeholders with clearer understanding of public opinion and
sentiment trends.
- Implications: Aid strategic decision-making in marketing, product development,
customer service, and policy formulation, facilitating dynamic adaptation to consumer
needs and market changes.
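For readers unfamiliar with how TF-IDF captures term significance, here is a minimal sketch of the basic weighting: term frequency within a document multiplied by the logarithm of the inverse document frequency. The function name and sample documents are assumptions for illustration. Library implementations such as scikit-learn's TfidfVectorizer add smoothing and normalization, so their numbers will differ from this plain version.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical documents, invented for illustration only.
DOCS = [
    "great guitar great sound",
    "guitar strings broke",
    "great value",
]
weights = tf_idf(DOCS)
top_term = max(weights[0], key=weights[0].get)
```

In the first document, "sound" receives the highest weight even though "great" appears twice, because "sound" occurs in no other document while "great" is shared with another one.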
References
• [1] Ali Athar; Sikandar Ali; Muhammad Mohsan Sheeraz; Subrata Bhattacharjee; Hee-Cheol Kim.
“Sentimental Analysis of Movie Reviews using Soft Voting Ensemble-based Machine
Learning”. Publisher: IEEE. Link: IEEE Xplore.
• [2] G Prema Arokia Mary; M S Hema; R Maheshprabhu; M Nageswara Guptha. “Sentimental Analysis of
Twitter Data using Machine Learning Algorithms”. Publisher: IEEE. Link: IEEE Xplore.
• [3] Muskan Agarwal; Richa Goyal; Eshika Verma; Hemlata Goyal; Gulrej Ahmed; Sunita
Singhal. “Predictive Sentimental Analysis of Spam Detection using Machine
Learning”. Publisher: IEEE. Link: IEEE Xplore.
• [4] Nirag T. Bhatt; Asst. Prof. Saket J. Swarndeep. “Sentiment Analysis using Machine Learning
Technique: A Literature Survey”. Publisher: IRJET. Link: www.irjet.net
• [5] P Ancy Grana. “Sentiment analysis of text using machine learning”. Publisher: International
Research Journal of Modernization in Engineering Technology and
Science. Link: https://www.irjmets.com/
• [6] Kaggle website for the CSV file:
https://www.kaggle.com/eswarchandt/amazon-music-reviews?select=Musical_instruments_reviews.csv