Amna Bagh Ali

The document outlines a Python notebook for sentiment analysis on Amazon reviews using libraries such as pandas, numpy, and sklearn. It includes steps for data loading, preprocessing, sentiment analysis using TextBlob, and training a Naive Bayes classifier. The final step evaluates the model's performance with a classification report and confusion matrix visualization.
[ ] #Step 2: Import Required Libraries

# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report, confusion_matrix
from imblearn.over_sampling import SMOTE
from textblob import TextBlob

[ ] #Step 3: Load and Explore the Data

# Load dataset
df = pd.read_csv('/content/amazon.csv')

# Check the first few rows of the data
df.head()

# Check for missing values
df.isnull().sum()

# Get dataset info (data types and non-null counts)
df.info()

# Summary statistics
df.describe()

Output (abridged): the dataset has 19,996 rows and two columns with no missing values:

 #   Column  Non-Null Count   Dtype
 0   Text    19996 non-null   object
 1   label   19996 non-null   int64
dtypes: int64(1), object(1)
memory usage: 312.6+ KB

df.describe() on the label column:

        label
count   19996.000000
mean    0.761652
std     0.426083
min     0.000000
25%     1.000000
50%     1.000000
75%     1.000000
max     1.000000

[ ] #Step 4: Preprocessing the Data

import string
from nltk.corpus import stopwords

# Download NLTK stopwords (if not already downloaded)
import nltk
nltk.download('stopwords')

# Text preprocessing function
def preprocess_text(text):
    # Handle missing or non-string values
    if not isinstance(text, str):
        return ""  # Return an empty string
    # Convert to lowercase
    text = text.lower()
    # Remove punctuation
    text = text.translate(str.maketrans('', '', string.punctuation))
    # Tokenize (split into words)
    words = text.split()
    # Remove stop words
    stop_words = set(stopwords.words('english'))
    words = [word for word in words if word not in stop_words]
    return ' '.join(words)

# Apply preprocessing to the 'Text' column
df['cleaned_text'] = df['Text'].apply(preprocess_text)

# Check the cleaned text
df.head()

Output: df.head() now shows a cleaned_text column alongside Text and label, e.g. "This is the best apps acording to a bunch of people..." becomes "best apps acording bunch people...".
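As a quick sanity check of the preprocessing step, a one-off call can be run in a scratch cell. The sample sentence below is illustrative, not from the dataset, and the expected output assumes NLTK's standard English stopword list:

# Hypothetical usage example for preprocess_text (sentence is made up)
print(preprocess_text("This is the BEST app, according to a bunch of people!"))
# Expected: 'best app according bunch people'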
[ ] #Step 5: Sentiment Analysis using TextBlob

from textblob import TextBlob

# Sentiment analysis function
def get_sentiment(text):
    analysis = TextBlob(text)
    # Return sentiment from polarity: positive, neutral, or negative
    if analysis.sentiment.polarity > 0:
        return 'positive'
    elif analysis.sentiment.polarity == 0:
        return 'neutral'
    else:
        return 'negative'

# Apply sentiment analysis
df['sentiment'] = df['cleaned_text'].apply(get_sentiment)

# Check the sentiment distribution
print(df['sentiment'].value_counts())

Output:

sentiment
positive    15000
negative     3823
neutral      1173
Name: count, dtype: int64

[ ] #Step 6: Prepare Data for Classification

# Split the dataset into features (X) and target (y)
X = df['cleaned_text']   # Text data
y = df['sentiment']      # Target sentiment labels

# Split the data into training and test sets
# (the split arguments are cut off in the source; test_size and
# random_state below are assumptions)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

[ ] #Step 7: Vectorize the Text Data

# Initialize TF-IDF Vectorizer (the max_features value is cut off
# in the source; 5000 is an assumption)
tfidf = TfidfVectorizer(max_features=5000)

# Fit on the training data, then transform both splits
X_train_tfidf = tfidf.fit_transform(X_train)
X_test_tfidf = tfidf.transform(X_test)

[ ] #Step 8: Train a Sentiment Analysis Model

# Train a Naive Bayes classifier
model = MultinomialNB()
model.fit(X_train_tfidf, y_train)

[ ] #Step 9: Evaluate the Model

# Predict on the test set
y_pred = model.predict(X_test_tfidf)

# Print the classification report
print(classification_report(y_test, y_pred))

# Confusion matrix visualization (arguments after annot=True are cut
# off in the source; cmap is an assumption)
conf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(conf_matrix, annot=True, cmap='Blues')
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.show()

Output: sklearn emits UndefinedMetricWarning three times (precision is ill-defined for a class that receives no predicted samples), followed by the classification report:

              precision    recall  f1-score
negative           0.92      0.26      0.41
neutral            0.00      0.00      0.00
positive           0.79      1.00      0.88
accuracy                               0.80
macro avg          0.57      0.42      0.43
weighted avg       0.77      0.80      0.74

[Figure: confusion matrix heatmap, Predicted vs. True over negative/neutral/positive; nearly all test samples are predicted positive.]
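The UndefinedMetricWarning above reflects the class imbalance seen earlier (15,000 positive vs. 1,173 neutral): the classifier never predicts the neutral class. The notebook imports SMOTE in Step 2 but the visible cells never call it. The sketch below shows one way it could be wired in, reusing the variables defined above; the random_state and the zero_division argument are assumptions, not from the source.

# Hedged sketch: oversample the minority classes in the TF-IDF training
# matrix with SMOTE, then retrain the same kind of Naive Bayes model.
from imblearn.over_sampling import SMOTE
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report

smote = SMOTE(random_state=42)  # random_state is an assumption
X_train_bal, y_train_bal = smote.fit_resample(X_train_tfidf, y_train)

model_bal = MultinomialNB()
model_bal.fit(X_train_bal, y_train_bal)

y_pred_bal = model_bal.predict(X_test_tfidf)
# zero_division=0 silences the UndefinedMetricWarning when a class
# receives no predicted samples
print(classification_report(y_test, y_pred_bal, zero_division=0))

Note that SMOTE interpolates between nearest neighbors in feature space, so applying it to sparse TF-IDF vectors is a pragmatic choice rather than a principled one, and resampling only the training split (as above) keeps synthetic samples out of the test set.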
