Synopsis format

Uploaded by

jatin arya

Synopsis Report

on

Opinion Mining of Pandemic using Machine Learning


Submitted as requirement for the

Final Year Project


Session 2024-25

By:

Ojas Garg

2103213060

Radhika Mehrotra

2103213074

Under the guidance of:


Mr. Shyam Sharma
Assistant Professor

DEPARTMENT OF CSE-AIML
ABES ENGINEERING COLLEGE, GHAZIABAD

AFFILIATED TO
DR. A.P.J. ABDUL KALAM TECHNICAL UNIVERSITY, U.P., LUCKNOW
(Formerly UPTU)
Student’s Declaration
We hereby declare that the work presented in this report, entitled “Opinion
Mining of Pandemic using Machine Learning”, is an authentic record of our own
work carried out under the supervision of Mr. Shyam Sharma, Assistant Professor,
CSE-AIML. The matter embodied in this report has not been submitted by us anywhere
else.

Date:

Signature of student Signature of student

(Name: Ojas Garg) (Name: Radhika Mehrotra)

(Roll No. 2103213060) (Roll No. 2103213074)

Department: CSE-AIML Department: CSE-AIML

This is to certify that the above statement made by the candidate(s) is correct to the best
of my knowledge.

Signature of HOD Signature of Supervisor

…………………… Mr. Shyam Sharma


CSE-AIML Assistant Professor

Date: CSE-AIML

Acknowledgement

We would like to convey our sincere thanks to Mr. Shyam Sharma for his
motivation, knowledge and support throughout the course of the project. His continuous
support helped us complete the project successfully, and the knowledge he shared has
been very useful to us.
We would also like to give special thanks to the Department of CSE-AIML for its
continuous support and for the opportunities it provided for carrying out this project.

Signature of student Signature of student

Ojas Garg Radhika Mehrotra

(Roll No. 2103213060) (Roll No. 2103213074)

Table of Contents

Student’s Declaration
Acknowledgement
List of Figures
List of Tables
Abstract

Chapter 1: Introduction

Chapter 2: Related Work/Methodology
    2.1: Existing Approaches
    2.2: Comparative Analysis of Existing Works

Chapter 3: Project Objective

Chapter 4: Proposed Methodology

Chapter 5: Design and Implementation
    5.1: Work Flow Diagram

Chapter 6: Results and Discussion

Chapter 7: Conclusion and Future Scope

References

List of Tables

Table 1. Count of tweets in each hashtag

List of Figures

Fig.1. Proposed Approach
Fig.2. Work Flow Diagram
Fig.3. Proportion of positive, negative and neutral tweets.

ABSTRACT

COVID-19, popularly known as Coronavirus, is an infectious disease that
originated in Wuhan, China in 2019 and has since spread to all parts of the
world. In India, the first case was found in early 2020. Soon afterwards, a
lockdown was imposed to control the situation. India has since become the second
most affected country. In this project, the sentiments of people on social media
during the pandemic are determined, and we also try to find which machine
learning algorithm fits best for analyzing those sentiments. About 1.5 lakh
(150,000) tweets from Twitter have been analyzed to determine the positivity,
negativity or neutrality of people.

Chapter 1

Introduction

The first case of the novel coronavirus was reported in December 2019 in China. From
there it spread to other countries such as Italy, Spain, the USA and India. The World
Health Organization declared it a health emergency, and soon afterwards countries began
taking measures to stop the spread of the virus. In India, a nationwide lockdown was
imposed on March 25, 2020 as a safety measure. India has since become the second most
affected country after the USA.

This project examines the opinions of people after the lockdown was imposed across
India and people were confined to their homes. Sentiment analysis is an emerging area
of natural language processing (NLP) that categorizes the opinions and sentiments of
people using text mining techniques. It can be helpful in many ways. For example, a
seller can collect customer feedback on a product from online sites and, by analyzing
that feedback, improve the quality of the product.

Social media is a place where everyone can express themselves without hesitation
[6,8]. Twitter is a popular social media platform on which people express themselves
in the form of tweets. These tweets are studied to find out the sentiments or opinions
of people on a given subject.

The main objectives of the project are:

1. To analyze tweets from Twitter and divide the emotions of the people into three
categories (positive, negative or neutral) [3].

2. To study different machine learning algorithms for sentiment analysis and to
find the one that fits best [7].

Chapter 2

Related Work

The related work associated with our project is given below:

2.1. Existing Approaches

• Twitter Sentiment Analysis using Python:
  - Performs sentiment analysis of Twitter data using Python and finds the
    percentage of positive and negative tweets [5].

• Word frequency and sentiment analysis of twitter messages during Coronavirus
  pandemic [9]
  - Finds the frequency of each word and performs sentiment analysis of the
    pandemic dataset [2].

• COVID-19 pandemic: a sentiment analysis
  - Performs sentiment analysis of a COVID-19 dataset [4].

• An "Infodemic": Leveraging High-Volume Twitter Data to Understand Public
  Sentiment for the COVID-19 Outbreak [10]
  - Measures and studies the early changes in content and opinion about
    COVID-19 [1].

2.2. Comparative Analysis of Existing Works

• In the existing projects, the words with positive or negative polarity are
  obtained, whereas in our project we obtain the polarity of the overall dataset.

• The existing projects do not specify which machine learning model is best for
  sentiment analysis, whereas our project will determine that as well.
Chapter 3

Project Objective

• To analyze the emotions of people during the pandemic.

• To implement an algorithm for automatic classification of tweets into
  positive, negative or neutral.

• To analyze different machine learning algorithms and find the one with the
  best accuracy.

Chapter 4

Proposed Methodology

The proposed methodology related to our project is given below:

Step 1: Identify the popular hashtags used on Twitter in India during the pandemic.
Tweets under those hashtags are extracted from the Twitter API using the Tweepy library.

Step 2: Preprocess the dataset. This involves the following steps:

• Removal of hashtags.
• Removal of links, GIFs, emoji, images and special characters.
• Removal of stop words.
• Removal of non-English words.
• Lemmatization.

Step 3: Analyze the polarity of the dataset.

Step 4: Feed the output of Step 3 into different machine learning algorithms and
compare them to find the algorithm with the best accuracy.

Step 5: Represent the results using different charts.
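
The cleaning part of Step 2 can be sketched as follows. This is a minimal illustration and not the project's actual code: the function name clean_tweet, the regular expressions and the small STOP_WORDS set are assumptions made here, and in the project the English stop word list from NLTK would take the place of the hard-coded set. Hashtags are assumed to be dropped as whole tokens. Removal of non-English words and lemmatization are left out so that the sketch needs only the standard library.

```python
import re

# Tiny illustrative stop word set (assumption); NLTK's English stop word
# list would normally be used instead.
STOP_WORDS = {"a", "an", "and", "are", "at", "in", "is", "of", "the", "to", "we"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
HASHTAG_RE = re.compile(r"#\w+")                # whole hashtag tokens
NON_LETTER_RE = re.compile(r"[^a-z\s]")         # emoji, digits, punctuation


def clean_tweet(text):
    """Lower-case a tweet and strip links, hashtags, special characters
    and stop words. Returns the remaining tokens joined by single spaces."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


# Hypothetical tweet; \U0001F637 is the "face with medical mask" emoji.
raw = "Stay at home and stay safe! \U0001F637 #IndiaLockdown https://t.co/abc123"
cleaned = clean_tweet(raw)
print(cleaned)  # stay home stay safe
```

Removing links before stripping special characters matters here: if the order were reversed, the URL would be broken into stray word fragments instead of being dropped as a whole.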


Table 1. Count of tweets in each hashtag

S.No.   Hashtag               No. of Tweets

1.      #coronavirusIndia     10,000
2.      #IndiafightsCorona    10,000
3.      #IndiaLockdown        10,000

• Extraction of the dataset from the Twitter API

• Pre-processing of the data to remove special characters, punctuation, stop words and images

• Processing of the data to analyze the polarity of the dataset

• Use of machine learning algorithms to find which fits best for performing sentiment analysis

• Representation of the results using tables and graphs

Fig.1. Proposed Approach
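
The model comparison stage of the proposed approach can be sketched as below. The report does not name the algorithms being compared, so the two classifiers used here (a majority-class baseline and a small multinomial Naive Bayes with add-one smoothing) are stand-ins chosen only for illustration, and the toy tweets and labels are invented. In practice a library such as scikit-learn would supply the classifiers; the sketch uses only the standard library so that it stays self-contained.

```python
import math
from collections import Counter, defaultdict


class MajorityBaseline:
    """Always predicts the most frequent training label."""

    def fit(self, texts, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, texts):
        return [self.label for _ in texts]


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, texts):
        return [self._predict_one(text) for text in texts]

    def _predict_one(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)      # log prior
            for word in text.split():
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_models(models, train, test):
    """Fit each model on train and return {name: accuracy on test}."""
    train_x, train_y = zip(*train)
    test_x, test_y = zip(*test)
    return {name: accuracy(test_y, model.fit(train_x, train_y).predict(test_x))
            for name, model in models.items()}


# Invented, already-cleaned toy tweets with polarity labels.
train = [
    ("good recovery news", "positive"),
    ("great support from doctors", "positive"),
    ("good work by doctors", "positive"),
    ("bad situation lockdown", "negative"),
    ("terrible news cases rising", "negative"),
    ("lockdown extended today", "neutral"),
    ("cases reported today", "neutral"),
]
test = [
    ("good support", "positive"),
    ("terrible situation", "negative"),
    ("reported today", "neutral"),
]

models = {"majority_baseline": MajorityBaseline(), "naive_bayes": NaiveBayes()}
results = compare_models(models, train, test)
best_model = max(results, key=results.get)
print(results, best_model)
```

Including a trivial baseline is a deliberate choice: an accuracy figure for a real classifier is only meaningful relative to what always guessing the most common class would achieve.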


Chapter 5

Design and Implementation

The design and implementation of our project is as follows:

5.1. Work Flow Diagram

Fig.2. Work Flow Diagram


The dataset was extracted from the Twitter API using the Tweepy library in
Python. The NumPy library is used for numerical computation and pandas is
used for data manipulation. The Natural Language Toolkit (NLTK) is used for
preprocessing the dataset. The TextBlob library is used for spelling checks and
for analyzing the sentiments. Matplotlib is used for the graphical representation
of the results.
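
A minimal sketch of how a polarity score might be turned into one of the three sentiment classes is given below. TextBlob reports polarity as a float between -1.0 and 1.0 through TextBlob(text).sentiment.polarity. The report does not state how scores are mapped to labels, so the sign-based rule, the function name label_from_polarity and the threshold parameter are assumptions, and the scores in the example are made up rather than computed.

```python
def label_from_polarity(polarity, threshold=0.0):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label.

    Scores above +threshold are positive, scores below -threshold are
    negative, and everything in between is neutral (assumed rule).
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# With TextBlob installed, a score would be obtained like this:
#   from textblob import TextBlob
#   polarity = TextBlob("stay home stay safe").sentiment.polarity
# The scores below are hypothetical stand-ins for such values.
example_scores = [0.5, 0.0, -0.25]
example_labels = [label_from_polarity(score) for score in example_scores]
print(example_labels)  # ['positive', 'neutral', 'negative']
```

Raising the threshold above zero widens the neutral band, which would shift the proportions reported for each class.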

Chapter 6

Results and Discussion

The result obtained from analyzing the tweets is shown in Fig.3.

Fig.3. Proportion of positive, negative and neutral tweets.

Fig.3 shows that 46% of the tweets are neutral, about 36.5% are positive
and 17.5% are negative.
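
Proportions of this kind can be computed from a list of per-tweet labels as sketched below. The function name label_proportions is hypothetical, and the 200 labels are an invented toy sample sized so that the shares come out to the percentages quoted above; they are not the project's data.

```python
from collections import Counter


def label_proportions(labels):
    """Return {label: percentage of all labels} for a list of labels."""
    if not labels:
        return {}
    counts = Counter(labels)
    total = len(labels)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Invented toy sample: 92 neutral, 73 positive and 35 negative labels.
toy_labels = ["neutral"] * 92 + ["positive"] * 73 + ["negative"] * 35
shares = label_proportions(toy_labels)
print(shares)
```

The resulting dictionary can be passed to a plotting library such as Matplotlib to draw a pie chart.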

Chapter 7

Conclusion and Future Scope

• The project gives the overall polarity score of the tweets and aims to find the
  best algorithm for performing sentiment analysis.

• From the analysis of the tweets, we observe that the largest share of tweets
  (46%) is neutral, that is, neither positive nor negative.

• In future, we plan to perform the analysis on other social platforms such as
  Instagram and Facebook, and to classify the sentiments further.

References

[1] Medford, R. J., Saleh, S. N., Sumarsono, A., Perl, T. M., & Lehmann, C. U. (2020). An
"Infodemic": Leveraging High-Volume Twitter Data to Understand Public Sentiment for the
COVID-19 Outbreak. medRxiv.

[2] Rajput, N. K., Grover, B. A., & Rathi, V. K. (2020). Word frequency and sentiment
analysis of twitter messages during Coronavirus pandemic. arXiv preprint
arXiv:2004.03925.

[3] Samuel, J., Ali, G. G., Rahman, M., Esawi, E., & Samuel, Y. (2020). Covid-19 public
sentiment insights and machine learning for tweets classification. Information, 11(6), 314.

[4] Kumar, A., Khan, S. U., & Kalra, A. (2020). COVID-19 pandemic: a sentiment
analysis. European Heart Journal.

[5] Ahuja, S., & Dubey, G. (2017, August). Clustering and sentiment analysis on Twitter
data. In 2017 2nd International Conference on Telecommunication and Networks
(TEL-NET) (pp. 1-5). IEEE.

[6] Suman, C., Saha, S., Bhattacharyya, P., & Chaudhari, R. S. (2020). Emoji Helps! A
Multi-modal Siamese Architecture for Tweet User Verification. Cognitive Computation, 1-16.

[7] Neethu, M. S., & Rajasree, R. (2013, July). Sentiment analysis in twitter using machine
learning techniques. In 2013 Fourth International Conference on Computing,
Communications and Networking Technologies (ICCCNT) (pp. 1-5). IEEE.

[8] Gupta, S., Singh, A., & Ranjan, J. (2020). Sentiment Analysis: Usage of Text and
Emoji for Expressing Sentiments. In Advances in Data and Information Sciences (pp.
477-486). Springer, Singapore.

[9] Rajput, N. K., Grover, B. A., & Rathi, V. K. (2020). Word frequency and sentiment
analysis of twitter messages during coronavirus pandemic. arXiv preprint
arXiv:2004.03925.

[10] Medford, R. J., Saleh, S. N., Sumarsono, A., Perl, T. M., & Lehmann, C. U. (2020). An
"Infodemic": Leveraging High-Volume Twitter Data to Understand Early Public
Sentiment for the COVID-19 Outbreak. In Open Forum Infectious Diseases.
