
CUSTOMER SATISFACTION ANALYSIS: A SENTIMENT ANALYSIS-BASED APPROACH
INTRODUCTION
Sentiment analysis has been a popular research topic in machine learning and natural language
processing for many years. It aims to understand and interpret human sentiment through different
modalities (e.g., text, voice tone, or facial expressions). Recently, automatic and accurate sentiment
analysis has been shown to play a critical role in natural human-computer interaction, group
decision-making systems, and opinion mining. With the popularity of online video platforms (e.g.,
YouTube, Twitter, and Weibo), an increasing number of users express their emotions and opinions by
posting videos. To effectively recognize the affective orientation of these videos, multimodal
sentiment analysis (MSA) has been proposed and has attracted increasing attention. For example,
given a monologue video, the goal of MSA is to detect the sentiment it conveys by leveraging
multiple input modalities: textual, auditory, and visual.
In this project, we propose a new model that performs sentiment analysis over all three modalities
using modality translation. Previous models show that, among the three modalities, text carries the
greatest importance, which makes it helpful for tackling the problem of missing data. In addition,
applying transfer learning to the video modality is far more useful than relying on traditional
transformer-based models alone.
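To make the modality-translation idea concrete, below is a minimal sketch in PyTorch. It is an illustration only, not the final architecture: the dimensions (768-d text embeddings, as from BERT; 512-d video features), the hidden size, and the MSE reconstruction objective are all assumptions.

```python
import torch
import torch.nn as nn

class ModalityTranslator(nn.Module):
    """Maps a text embedding to an estimate of a missing modality's features."""
    def __init__(self, text_dim=768, video_dim=512, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(text_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, video_dim),
        )

    def forward(self, text_emb):
        return self.net(text_emb)

translator = ModalityTranslator()
text_emb = torch.randn(8, 768)        # e.g., BERT [CLS] embeddings (batch of 8)
video_hat = translator(text_emb)      # stand-in for the missing video features
# When both modalities are available, the translator can be trained with a
# reconstruction loss (real_video is a placeholder here):
real_video = torch.randn(8, 512)
loss = nn.functional.mse_loss(video_hat, real_video)
```

At inference time, the translated features substitute for a modality that is absent, which is how translating from text (the most informative modality) helps with missing data.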
RECENT STUDIES
Study | Adopted technique | Key findings | Future scope | Data
TMTN | Encoder and decoder | Uses an encoder-based approach that accounts for all three modalities. | Can be used alongside LSTM and CNN. | CMU-MOSI
MFNET | Encoder and decoder | Utilizes the available modalities to generate a 3D feature-enhanced image representing the missing modality. | Only considers video features, so other modalities can be added. | CMU-MOSI
AOBERT | Multimodal masked language modeling | Takes text, video, and audio and pairs the text with each of the other two modalities; achieves very good results on standard datasets and avoids the problem found with RNNs. | Limited interpretability and high model complexity, so it can be further simplified. | CMU-MOSI
TETFN | Convolutional NN and LSTM | Text analysis performs better than the audio and visual cases. | Polysemy is not addressed. | CMU-MOSI
LFIR | Convolutional NN and LSTM | Mainly focuses on the intermodality of text and social media images; a support vector machine in the final layer works very well. | Work can be done on conflict understanding, such as polysemy. | Social media posts
RESEARCH ISSUES
Most state-of-the-art models assume that all modalities are present during analysis, but this rarely
holds in reality: the models essentially assume an ideal condition. In addition, these models do not
integrate late fusion of a transformer-based model with classical LSTM or CNN models. Further,
prediction from the video modality can be treated as a computer vision task, where ResNet or YOLO
combined with transfer learning can perform best.
We are therefore tackling the following issues (a late-fusion sketch follows this list):
i) Fusion of modalities, with each individual task being a unimodal analysis
ii) Dealing with missing modalities
iii) Dealing with the intermodality relations between modalities
iv) Modality translation
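As a sketch of how issues (i) and (ii) could interact, the hypothetical late-fusion module below (PyTorch; the names, dimensions, and two-class output are illustrative, not the project's final design) fuses per-modality sentiment logits, e.g., from a transformer text branch and an LSTM audio branch, with learned weights, and simply skips any modality passed as missing.

```python
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Fuses per-modality sentiment logits with learned weights,
    skipping any modality that is missing (passed as None)."""
    def __init__(self, num_modalities=3):
        super().__init__()
        # One learnable fusion weight per modality (text, audio, video).
        self.weights = nn.Parameter(torch.ones(num_modalities))

    def forward(self, logits_list):
        # logits_list: [text, audio, video] logits; None marks a missing
        # modality (at least one modality is assumed present).
        present = [(w, l) for w, l in zip(self.weights, logits_list)
                   if l is not None]
        ws = torch.softmax(torch.stack([w for w, _ in present]), dim=0)
        return sum(w * l for w, (_, l) in zip(ws, present))

fusion = LateFusion()
text_logits = torch.randn(4, 2)    # e.g., from a transformer branch
audio_logits = torch.randn(4, 2)   # e.g., from an LSTM branch
fused = fusion([text_logits, audio_logits, None])  # video modality missing
```

Because fusion happens at the logit level, each unimodal branch can be trained, replaced, or dropped independently, which is what makes this style of fusion tolerant of missing modalities.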
RESEARCH CONTRIBUTION
i) The project will address multiple issues in currently available MSA models, as most of them either
treat each modality individually or do not consider modality translation.
ii) Using transformer models together with transfer learning can make predictions considerably more
accurate than models built on an encoder-decoder or LSTM alone (see the video transfer-learning
sketch after this list).
iii) Dealing with missing modalities is a very new direction: as discussed earlier, the available
models assume that all modalities are completely present, with none missing.
iv) Because it handles missing modalities, the proposed model can be used even in real-life
applications, where state-of-the-art models do not perform as well.
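For contribution (ii), here is a minimal sketch of what transfer learning for the video modality could look like (assuming PyTorch with a recent torchvision; ResNet-18, the frozen backbone, mean pooling over frames, and the 224x224 input size are illustrative choices, not the finalized pipeline):

```python
import torch
import torchvision.models as models

# Load an ImageNet-pretrained ResNet-18 and drop its classifier head,
# keeping the 512-d penultimate features as a frozen frame encoder.
resnet = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
resnet.fc = torch.nn.Identity()
resnet.eval()

@torch.no_grad()
def clip_features(frames):
    # frames: (num_frames, 3, 224, 224) tensor of preprocessed video frames
    feats = resnet(frames)        # (num_frames, 512) per-frame features
    return feats.mean(dim=0)      # mean-pool over time -> one 512-d clip vector

video_vec = clip_features(torch.randn(16, 3, 224, 224))  # dummy 16-frame clip
```

Freezing the backbone keeps the video branch cheap to train; fine-tuning the last ResNet block would be a natural next step if the frozen features underperform.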
DATASETS
BENCHMARK DATASETS:
i) Yelp Review Polarity
ii) AG’s News
iii) Twitter Sentiment Analysis (entity-level Twitter sentiment analysis dataset)
iv) TripAdvisor
State-of-the-art models:
TETFN
VDCNN
TEDT
LSTM
BERT (see the loading sketch below)
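For reference, a minimal sketch of how the BERT baseline could be loaded with the Hugging Face transformers library. This is an assumption, since the report does not specify the exact implementation: the checkpoint name and the fresh two-label head are illustrative, and the head still needs fine-tuning on each benchmark before evaluation.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Pretrained BERT encoder with a newly initialized two-class sentiment head.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

inputs = tok("The food was great but the service was slow.",
             return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits   # shape (1, 2): sentiment logits
```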
Performance metrics (computed as in the sketch below):
Accuracy
F1-score
Precision
AUC
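All four metrics are available in scikit-learn; the toy sketch below shows how they could be computed (the labels and scores are placeholders, not project results, and binary sentiment labels are assumed):

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, roc_auc_score)

y_true  = [1, 0, 1, 1, 0]             # gold sentiment labels (toy example)
y_pred  = [1, 0, 1, 0, 0]             # model's hard predictions
y_score = [0.9, 0.2, 0.8, 0.4, 0.1]   # positive-class probabilities (for AUC)

print("Accuracy :", accuracy_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("AUC      :", roc_auc_score(y_true, y_score))
```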
PROPOSED METHODOLOGY
Results obtained for Yelp Review Polarity

Method | Accuracy | F1-score | Precision
VDCNN | 95.72% | - | -
LSTM | 94.74% | - | -
Our model | 95% | 0.95 | 0.95

[Figure: Our model vs. other models]
Results obtained for AG’s News

Method | Accuracy | F1-score | Precision
VDCNN | 91.33% | - | -
LSTM | 86.06% | - | -
Our model | 90.06% | 0.91 | 0.91

[Figure: Our model vs. other models]
Results obtained for Twitter Sentiment Analysis

Method | Accuracy | F1-score | Precision
BERT | 97% | - | -
LR | 92% | - | -
Our model | 96% | 0.96 | 0.96

[Figure: Our model vs. other models]
Results obtained for TripAdvisor

Method | Accuracy | F1-score | Precision
NB | 57% | - | -
LR | 58% | - | -
Our model | 58% | 0.96 | 0.96

[Figure: Our model vs. other models]
Discussion:
Our model performs very well on textual features and already outperforms LSTM. It could outperform
VDCNN as well if trained for enough epochs: we have trained for only 2 epochs so far, and it is
already doing significantly well.
References:
"Attention Is All You Need" (https://arxiv.org/abs/1706.03762)
"Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting" (https://arxiv.org/abs/1912.09363)
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations" (https://arxiv.org/abs/1909.11942)
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (https://arxiv.org/abs/1810.04805)
MFNet (https://ieeexplore.ieee.org/abstract/document/8206396)
Timeline:

Sl no. | Task | Start date | End date | Remarks
1 | Literature review | 1st May 2024 | 26th May 2024 | -
2 | Applying BERT | 27th May 2024 | 30th May 2024 | Not performing well
3 | Coding Transformer | 1st June 2024 | Present | -
4 | Feature extraction of videos | 1st June 2024 | 3rd June 2024 | -
5 | Paper writing | 10th July 2024 | 25th July 2024 | -