Industrial Training Report
Depression Detection Using Tweets Through Machine Learning
Submitted By: Abhishek Pandey (2000321540003)
Submitted To: Dr. Jagriti Singh
I hereby declare that the Industrial Training Report entitled "Depression Detection Using Tweets Through Machine Learning" is an authentic record of my own work, completed as a requirement of Industrial Training during the period from 02 July 2023 to 12 August 2023 for the award of the degree of B.Tech. (Computer Science & Engineering), ABES Engineering College, Ghaziabad, under the guidance of Dr. Jagriti Singh.
(Signature of student)
ABHISHEK PANDEY
2000321540003
Date: ____________________
ACKNOWLEDGEMENT
I would like to express my special thanks and gratitude to Dr. Jagriti Singh for providing motivation, knowledge and support throughout the course of the project. This continuous support helped in the successful completion of the project, and the knowledge provided was very insightful for me.
I would also like to extend my sincere gratitude to Dr. Jagriti Singh for providing this golden opportunity, which led me to do a great deal of research that diversified my knowledge to a huge extent, for which I am thankful.
Also, I would like to thank my parents and friends, who supported me a lot in finalizing this project within the limited time frame.
Abhishek Pandey
ABOUT THE INSTITUTE
Centre for Advanced Studies is an in-campus, research-driven institute established by Dr. A.P.J. Abdul Kalam Technical University, Lucknow, to impart state-of-the-art education to postgraduate students and to facilitate quality research work in emerging areas of Engineering and Technology. The institute offers M.Tech. and Ph.D. programs in the disciplines of Computer Science and Engineering, Mechatronics, Nanotechnology, Manufacturing Technology and Automation, and Energy Science and Technology. It was established by the Uttar Pradesh State Government in 2017 with the objective of providing a stimulating platform for research scholars and academicians to create and disseminate research-based knowledge and technologies for the development of the State and the Country. The Institute is climbing consistently on the path to visibility across the globe. In a short span of time, significant progress has been made in quality education, impactful research, publications and patents, funded projects, and training and placement. The University is also making constant efforts to create a healthy environment for meaningful research outcomes, to mentor affiliated institutions by establishing world-class laboratories and facilities in the Institute, and to enhance the knowledge of faculty and students with the latest technologies and developments through training and education.
TABLE OF CONTENTS
Introduction
Tools & Technology Used
Objective of the Project
System Design
Methodology for Implementation
Implementation Details
Results
Conclusion
References
INTRODUCTION
Across the globe, millions of people experience depression, and it is one of the leading reasons why individuals die by suicide. People now use social networking sites like
Twitter and Facebook to share their ideas and feelings, which has prompted
experts to look into how this data may be used to track mental health
disorders. The real-time nature of social media posts enables researchers to
examine emotional well-being and observe changes over time, which
traditional surveys are unable to capture. Expressions of loneliness or
melancholy, negative language, self-deprecating remarks, or a lack of
participation in social events are just a few examples of language patterns on
social media that can be a sign of depression. Finding these patterns enables
the identification of those who are vulnerable and may benefit from care.
Monitoring social media data can also reveal how particular scenarios affect
the emotional well-being of individuals. Researchers analyze the
consequences of widely reported occurrences, such as celebrity deaths, and
how depressive symptoms are affected by campaigns to raise awareness of
mental health issues.
Even though it may be beneficial, relying solely on social media for information about depressed individuals has limitations. These drawbacks include representation bias, problems with precision (since not everyone has access to, or uses, a particular social media platform), and privacy concerns, as collected data must be protected so that it does not violate individuals' rights to confidentiality.
In summary, social media data can be used to track depression, but concerns about privacy and the need for appropriate interpretation of posts must be tackled. Moreover, multiple sources should be merged for an accurate assessment.
TOOLS & TECHNOLOGY USED
Hardware Requirements:
• Core i5/i7 processor
• At least 8 GB RAM
• At least 60 GB of Usable Hard Disk Space
Software Requirements:
• Python 3.x
• Anaconda Distribution
• NLTK Toolkit
• Jupyter Notebook / Google Colaboratory
OBJECTIVE OF THE PROJECT
Scraping tweets featuring news and reviews from various Twitter handles on Twitter.com, as sketched below.
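The report does not name a specific scraping tool, so the sketch below assumes the Tweepy library and a Twitter API bearer token purely for illustration; the query string and result limit are likewise assumptions.

```python
# A minimal sketch of scraping recent tweets with Tweepy (assumed tool).
import tweepy

client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")  # hypothetical token

# Search recent English-language tweets, excluding retweets.
response = client.search_recent_tweets(
    query="depression lang:en -is:retweet", max_results=100
)

for tweet in response.data or []:
    print(tweet.id, tweet.text[:80])
```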
[Figure: a representation of the dataset]
Data Format:
The dataset we will use is in .csv format. [Figure: a sample of the dataset]
METHODOLOGY FOR IMPLEMENTATION
Resources:
To facilitate the preprocessing of the data, we introduce the following resources:
• A stop-word dictionary, corresponding to words that are filtered out before or after processing of natural-language data because they are not useful in our case (a sketch of such filtering follows).
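The exact stop-word dictionary used in the project is not specified; the sketch below assumes NLTK's standard English stop-word list.

```python
# A minimal sketch of stop-word filtering with NLTK (assumed dictionary).
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
STOP_WORDS = set(stopwords.words("english"))

def remove_stopwords(tweet):
    """Lowercase, split on whitespace, and drop stop words."""
    tokens = tweet.lower().split()
    return [t for t in tokens if t not in STOP_WORDS]

print(remove_stopwords("I am feeling so alone and empty today"))
# ['feeling', 'alone', 'empty', 'today']
```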
Pre-Processing:
We can pre-process the tweets now that we have the corpus of tweets and all the
resources that might be helpful. It is crucial because all the changes we make during this
process will have an immediate effect on how well the classifier functions.
Preprocessing will produce uniform and consistent data that can be used to optimize the
performance of the classifier.
Since we need to extract features from our data set of tweets, we use three different
vectorization methods:
▪ TF-IDF
▪ Count Vectorizer
▪ N-gram vectorizer
In text classification, the count (number of times) each word appears in a document is used as a feature for training the classifier; the sketch below shows all three vectorizers.
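A minimal sketch of the three methods, assuming scikit-learn and a toy list of tweets (all parameters are defaults unless noted):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

tweets = [
    "i feel so empty and alone",
    "what a great day with friends",
    "nothing matters anymore",
]

# 1) TF-IDF: weights word counts by how informative each word is.
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(tweets)

# 2) Count Vectorizer: raw word counts per document.
counts = CountVectorizer()
X_counts = counts.fit_transform(tweets)

# 3) N-gram vectorizer: counts of contiguous word sequences (here unigrams
#    and bigrams), which lets phrases like "not bad" become features.
ngrams = CountVectorizer(ngram_range=(1, 2))
X_ngrams = ngrams.fit_transform(tweets)

print(X_tfidf.shape, X_counts.shape, X_ngrams.shape)
```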
First, we divide the data set into two parts, the training set and the test set. To do this, we shuffle the data set to remove any ordering of the data; the training set contains two-thirds of the data, while the rest forms the test set.
A third set of data, known as the validation set, is actually required after the training set and test set have been produced. It is utilized to test our model against previously unseen data and to tune the learning algorithm's parameters, in order to prevent underfitting and overfitting, among other things.
We need this validation set because the test set should be used only to verify how well the model generalizes. If we tuned on the test set rather than the validation set, our model could look overly optimistic and the results would be distorted.
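A minimal sketch of the shuffled split, assuming scikit-learn; the two-thirds training fraction comes from the text above, while splitting the remaining third evenly between validation and test is an assumption, as is the hypothetical load_tweet_features helper:

```python
from sklearn.model_selection import train_test_split

X, y = load_tweet_features()  # hypothetical helper returning features/labels

# First split off 2/3 for training (shuffle=True is the default).
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=2 / 3, random_state=42
)

# Then split the remainder into validation and test halves.
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=42
)
```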
Classification Algorithms:
• A linear regression will predict values outside the acceptable range (e.g.,
predicting probabilities outside the range 0 to 1)
• Since the dichotomous experiments can only have one of two possible values for
each experiment, the residuals will not be normally distributed about the predicted line.
Contrarily, a logistic regression results in a logistic curve whose values can only lie between 0 and 1. Like a linear regression, a logistic regression builds a curve from the data, but it uses the natural logarithm of the target variable's "odds" rather than the probability. Furthermore, neither the predictors nor the variance in each group needs to be normally distributed.
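A minimal sketch of such a logistic-regression classifier, assuming scikit-learn, TF-IDF features, and hypothetical train/validation lists from the split above:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Vectorize raw tweet text, then fit a logistic-regression classifier.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(train_tweets, train_labels)        # hypothetical training lists
val_accuracy = clf.score(val_tweets, val_labels)
print(f"Validation accuracy: {val_accuracy:.3f}")
```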
For a linearly separable training set with labels y_i ∈ {+1, −1}, the SVM finds a separating hyperplane w·X + b = 0 such that w·X_i + b ≥ +1 if y_i = +1, and w·X_i + b ≤ −1 if y_i = −1.
If the data is linearly inseparable, the SVM uses a nonlinear mapping to transform the data into a higher dimension. It then solves the problem by finding a linear hyperplane. Functions that perform such transformations are called kernel functions. The kernel function selected for our experiment is the Gaussian Radial Basis Function (RBF):

K(X_i, X_j) = exp(−γ ||X_i − X_j||^2)

where X_i are support vectors, X_j are testing tuples, and γ is a free parameter that uses the default value from scikit-learn in our experiment. [Figure: classification example of SVM with the linear kernel and the RBF kernel]
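A minimal sketch of an RBF-kernel SVM, assuming scikit-learn (gamma="scale" is its default free parameter) and the hypothetical data lists from the earlier sketches:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# SVM with the Gaussian RBF kernel; gamma left at scikit-learn's default.
svm_clf = make_pipeline(TfidfVectorizer(), SVC(kernel="rbf", gamma="scale"))
svm_clf.fit(train_tweets, train_labels)    # hypothetical training lists
print("Validation accuracy:", svm_clf.score(val_tweets, val_labels))
```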
Confusion Matrix:
The performance of the model during the classification process may be visualized, and its accuracy assessed, using a confusion matrix. The figures are only approximate, because they can change based on, for instance, how we shuffle our data.
We can hopefully see that more tweets are classified as true positives and true negatives than as false positives and false negatives. Based on this outcome, we experiment with various strategies to try to increase the classifier's accuracy, and we repeat the procedure using k-fold cross-validation to assess its average accuracy, as sketched below.
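A minimal sketch of both steps, assuming scikit-learn, the classifier from the earlier sketches, and hypothetical test/full-dataset variables; k = 10 folds is an assumption, since the report does not state k:

```python
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score

y_pred = clf.predict(test_tweets)          # classifier from earlier sketches
print(confusion_matrix(test_labels, y_pred))
# rows = true labels, columns = predicted labels; for binary labels this is
# [[TN FP]
#  [FN TP]]

scores = cross_val_score(clf, all_tweets, all_labels, cv=10)
print("Mean 10-fold CV accuracy:", scores.mean())
```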
IMPLEMENTATION DETAILS
The training on the dataset consists of the following steps:
Loading of Dataset: A small Python script is written to load the CSV file.
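A minimal sketch of that loading step, assuming pandas; the file name and column names are assumptions, since the report does not specify them:

```python
import pandas as pd

df = pd.read_csv("tweets.csv")             # hypothetical file name
print(df.shape)
print(df.head())

tweets = df["text"].astype(str).tolist()   # hypothetical column names
labels = df["label"].tolist()
```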
Preprocessing Data:
This is a vital part of training on the dataset. Words present in the file are considered both as single words and as pairs of words because, for example, the word "bad" on its own is negative, but "not bad" reads as positive. In such cases, considering only single words would train the classifier on the wrong signal. Words are therefore also checked in pairs, to find modifiers occurring before an adjective which, if present, might give the phrase a different outlook, as the sketch below shows.
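A minimal sketch of the idea, assuming scikit-learn's CountVectorizer with unigrams and bigrams; this illustrates the pairing described above rather than the project's exact code:

```python
from sklearn.feature_extraction.text import CountVectorizer

vec = CountVectorizer(ngram_range=(1, 2))
X = vec.fit_transform(["the movie was not bad"])

print(vec.get_feature_names_out())
# includes the bigram 'not bad', so the classifier can learn that this
# phrase carries a positive outlook even though 'bad' alone is negative
```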
After pre-processing:
As a result of this training dataset of public comments, the machine is now able to determine whether an entered sentence will receive a positive or a negative response.
The proportion of relevant instances among the retrieved instances is known as precision (also called positive predictive value), whereas the proportion of relevant instances that have been retrieved, relative to the total number of relevant instances, is known as recall (also called sensitivity). Therefore, both precision and recall are founded on an understanding and a measurement of relevance; the sketch below makes the two measures concrete.
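A minimal sketch on a toy label set, computing precision and recall from the confusion-matrix counts and cross-checking against scikit-learn's implementations:

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# precision = TP / (TP + FP); recall = TP / (TP + FN)
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # 3
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # 1
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # 1

print(tp / (tp + fp), precision_score(y_true, y_pred))  # 0.75 0.75
print(tp / (tp + fn), recall_score(y_true, y_pred))     # 0.75 0.75
```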
RESULTS
The classifiers evaluated were Naïve Bayes, Logistic Regression, and SVC; the SVC achieved an accuracy of 76.15%.
CONCLUSION
Texts are categorized using sentiment analysis based on the emotions they express. Data preparation, review analysis, and sentiment classification are the three key steps of a conventional sentiment analysis model. These three steps are the main topic of this report, which also discusses typical methods employed in each.
Sentiment analysis is a growing area of text mining and computational linguistics, and it has attracted a great deal of research attention in recent years.
Future research will focus on in-depth methods for extracting opinion and product attributes, as
well as cutting-edge classification models that can take the ordered labels property in rating
inference into account. Applications that utilize the sentiment analysis results are also anticipated
to surface soon.
REFERENCES
• Priya A, Garg S, Tigga NP (2020) Predicting anxiety, depression and stress in modern life using machine learning algorithms. Procedia Computer Science 167:1258–1267.
• Alsagri HS, Ykhlef M (2020) Machine learning-based approach for depression detection in Twitter using content and activity features. IEICE Transactions on Information and Systems E103.D(8):1825–1832. doi:10.1587/transinf.2020EDP7023
• Kumar P, Garg S, Garg A (2020) Assessment of anxiety, depression and stress using machine learning models. Procedia Computer Science 171:1989–1998. doi:10.1016/j.procs.2020.04.213
• Shelton J (2019) Depression definition and DSM-5 diagnostic criteria. Retrieved June 13, 2019, from https://www.psycom.net/depression-definition-dsm-5-diagnostic-criteria/
• Cavazos-Rehg PA, Krauss MJ, Sowles S, Connolly S, Rosas C, Bharadwaj M, Bierut LJ (2016) A content analysis of depression-related tweets. Computers in Human Behavior. doi:10.1016/j.chb.2015.08.023
• Neethu MS, Rajasree R (2014) Sentiment analysis in Twitter using machine learning techniques.