Wine

The document outlines a data analysis process for a wine quality dataset, detailing steps from importing libraries and data exploration to data cleaning and visualization. It includes methods for handling missing values, generating descriptive statistics, and creating visual representations of the data. Additionally, it describes preprocessing steps for preparing the data for machine learning, including encoding categorical variables and classifying wine quality.

Uploaded by

gauthamsivathan

gedovzo2h

March 9, 2025

0.1 1. Importing Required Dependencies

[455]: import requests as re
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

0.2 2. Importing and Exploring Data

[456]: # data importing
wine = pd.read_csv("C:\\Users\\Gautham\\Downloads\\winequality-red-mod.csv")
wine.head()

[456]: id fixed acidity volatile acidity citric acid residual sugar \

0 1 7.4 0.70 0.00 1.9
1 2 7.8 0.88 0.00 2.6
2 3 7.8 0.76 0.04 2.3
3 4 11.2 0.28 0.56 1.9
4 5 7.4 0.70 0.00 1.9

flavonoids chlorides free sulfur dioxide total sulfur dioxide density \

0 0.590 0.076 11.0 34.0 0.9978
1 0.754 0.098 25.0 67.0 0.9968
2 0.688 0.092 15.0 54.0 0.9970
3 0.459 0.075 17.0 60.0 0.9980
4 0.590 0.076 11.0 34.0 0.9978

pH sulphates alcohol country quality

0 3.51 0.56 9.4 UK 5
1 3.20 0.68 9.8 Italy 5
2 3.26 0.65 9.8 Italy 5
3 3.16 0.58 9.8 Spain 6
4 3.51 0.56 9.4 UK 5

[457]: # data exploring

print(wine.shape)
print(wine.ndim)
wine.info()

(1506, 15)
2
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1506 entries, 0 to 1505
Data columns (total 15 columns):
id 1506 non-null int64
fixed acidity 1504 non-null float64
volatile acidity 1504 non-null float64
citric acid 1504 non-null float64
residual sugar 1504 non-null float64
flavonoids 1504 non-null float64
chlorides 1504 non-null float64
free sulfur dioxide 1504 non-null float64
total sulfur dioxide 1504 non-null float64
density 1504 non-null float64
pH 1504 non-null float64
sulphates 1504 non-null float64
alcohol 1504 non-null float64
country 1504 non-null object
quality 1506 non-null int64
dtypes: float64(12), int64(2), object(1)
memory usage: 170.6+ KB

0.3 3. Cleaning Data

Two rows are missing every feature value, including the categorical country column (only id and quality are present for them). A categorical column has no mean to impute with, so these rows are dropped.
[458]: # dropping 2 rows with nan since there is no categorical data available
wine = wine.dropna()
wine.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1504 entries, 0 to 1505
Data columns (total 15 columns):
id 1504 non-null int64
fixed acidity 1504 non-null float64
volatile acidity 1504 non-null float64
citric acid 1504 non-null float64
residual sugar 1504 non-null float64
flavonoids 1504 non-null float64
chlorides 1504 non-null float64
free sulfur dioxide 1504 non-null float64
total sulfur dioxide 1504 non-null float64
density 1504 non-null float64
pH 1504 non-null float64
sulphates 1504 non-null float64
alcohol 1504 non-null float64
country 1504 non-null object
quality 1504 non-null int64
dtypes: float64(12), int64(2), object(1)
memory usage: 182.1+ KB
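
The observation that the same two rows account for every missing value can be checked directly before dropping anything. A minimal sketch on a small made-up frame (the `toy` data and its values are illustrative, not taken from the wine file):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the wine frame: two rows lack every feature value.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "alcohol": [9.4, np.nan, 9.8, np.nan],
    "country": ["UK", None, "Italy", None],
    "quality": [5, 6, 5, 7],
})

missing_per_column = toy.isna().sum()      # NaN count for each column
rows_with_gaps = toy.isna().any(axis=1)    # True where a row has any NaN
n_rows_with_gaps = int(rows_with_gaps.sum())

cleaned = toy.dropna()                     # same call as in the cell above
print(missing_per_column)
print(n_rows_with_gaps, len(cleaned))
```

If the number of rows with gaps equals the per-column missing count, the gaps are concentrated in the same rows, which is the situation the `info()` outputs above suggest.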

[459]: wine.describe()

[459]: id fixed acidity volatile acidity citric acid \

count 1504.000000 1504.000000 1504.000000 1504.000000
mean 752.880984 8.406981 0.524890 0.273910
std 434.843358 1.757966 0.183892 0.198186
min 1.000000 -1.000000 -1.000000 -1.000000
25% 376.750000 7.200000 0.390000 0.100000
50% 752.500000 8.000000 0.520000 0.260000
75% 1128.250000 9.300000 0.635000 0.430000
max 1506.000000 15.900000 1.580000 1.000000

residual sugar flavonoids chlorides free sulfur dioxide \

count 1504.000000 1504.000000 1504.000000 1504.000000
mean 2.540060 0.597992 0.087533 15.597074
std 1.406215 0.252626 0.055560 10.456825
min -1.000000 -1.000000 -1.000000 -1.000000
25% 1.900000 0.494000 0.071000 7.000000
50% 2.200000 0.559000 0.080000 13.000000
75% 2.600000 0.633000 0.091000 21.000000
max 15.500000 3.206000 0.611000 72.000000

total sulfur dioxide density pH sulphates \

count 1504.000000 1504.000000 1504.000000 1504.000000
mean 46.764628 0.996155 3.302553 0.657606
std 33.240872 0.057661 0.190515 0.177967
min -1.000000 -1.000000 -1.000000 -1.000000
25% 22.000000 0.995700 3.200000 0.550000
50% 38.000000 0.996800 3.300000 0.620000
75% 63.000000 0.997900 3.400000 0.730000
max 289.000000 1.999400 4.010000 2.000000

alcohol quality
count 1504.000000 1504.000000
mean 10.427238 5.635638
std 1.434245 0.815816
min -1.000000 3.000000
25% 9.500000 5.000000
50% 10.100000 6.000000
75% 11.100000 6.000000
max 45.300000 8.000000
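
The `min` row above is -1 for every measured column, which cannot be a real acidity, density, or alcohol reading. One plausible interpretation, and it is only an assumption here, is that -1 is a placeholder for a missing measurement. A sketch of how such placeholders could be counted and turned into NaN, using made-up values:

```python
import numpy as np
import pandas as pd

# Made-up values; -1 plays the role of the suspected placeholder.
toy = pd.DataFrame({
    "pH": [3.51, -1.0, 3.26, 3.16],
    "alcohol": [9.4, 9.8, -1.0, -1.0],
    "quality": [5, 5, 6, 5],
})

feature_cols = ["pH", "alcohol"]                     # quality is left untouched
sentinel_counts = (toy[feature_cols] == -1).sum()    # placeholders per column
toy[feature_cols] = toy[feature_cols].replace(-1, np.nan)

print(sentinel_counts)
print(toy.describe().loc["min"])
```

The maxima in the table above (45.3 for alcohol, 1.9994 for density) may deserve a similar look before modelling.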

0.4 4. Data Visualisation

[460]: #Correlation Heatmap of Variables
plt.figure(figsize=(10,8))
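# correlate numeric columns only; 'country' still holds strings at this point
sns.heatmap(wine.select_dtypes('number').corr(),cmap='YlGnBu',annot=True,cbar=False)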

[460]: <matplotlib.axes._subplots.AxesSubplot at 0x68febef0>

[461]: # plotting the quality of wine using a countplot
fig = plt.figure(figsize = (8,4))
sns.set_style("whitegrid")
sns.countplot(x='quality', data=wine, palette='plasma')
plt.title("QUALITY OF WINE", size=18)

[461]: Text(0.5,1,'QUALITY OF WINE')

[462]: #plotting the wine from different countries using a countplot

fig = plt.figure(figsize = (8,4))
sns.set_style("whitegrid")
sns.countplot(x='country', data=wine, palette='plasma')
plt.title("WINE FROM COUNTRIES", size=18)

[462]: Text(0.5,1,'WINE FROM COUNTRIES')

[463]: # plotting the quality of wine against citric acid using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='citric acid', data=wine, palette="Spectral")
plt.title('Quality vs Citric Acid', size=20)

[463]: Text(0.5,1,'Quality vs Citric Acid')

[464]: # plotting the quality of wine against residual sugar using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='residual sugar', data=wine, palette="Spectral")
plt.title('Quality vs Residual Sugar', size=20)

[464]: Text(0.5,1,'Quality vs Residual Sugar')

[465]: # plotting the quality of wine against flavonoids using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='flavonoids', data=wine, palette="Spectral")
plt.title('Quality vs Flavonoids', size=20)

[465]: Text(0.5,1,'Quality vs Flavonoids')

[466]: # plotting the quality of wine against chlorides using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='chlorides', data=wine, palette='Spectral')
plt.title('Quality vs Chlorides', size=20)

[466]: Text(0.5,1,'Quality vs Chlorides')

[467]: # plotting the quality of wine against free sulphur dioxide using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='free sulfur dioxide', data=wine, palette='Spectral')
plt.title('Quality vs Free Sulphur Dioxide', size=20)

[467]: Text(0.5,1,'Quality vs Free Sulphur Dioxide')

[468]: # plotting the quality of wine against total sulphur dioxide using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='total sulfur dioxide', data=wine, palette='Spectral')
plt.title('Quality vs Total Sulphur Dioxide', size=20)

[468]: Text(0.5,1,'Quality vs Total Sulphur Dioxide')

[469]: # plotting the quality of wine against the density using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='density', data=wine, palette='Spectral')
plt.title('Quality vs Density', size=20)

[469]: Text(0.5,1,'Quality vs Density')

[470]: # plotting the quality of wine against pH level using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='pH', data=wine, palette='Spectral')
plt.title('Quality vs PH Level', size=20)

[470]: Text(0.5,1,'Quality vs PH Level')

[471]: # plotting the quality of wine against sulphates using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='sulphates', data=wine, palette='Spectral')
plt.title('Quality vs Sulphates', size=20)

[471]: Text(0.5,1,'Quality vs Sulphates')

[472]: # plotting the quality of wine against the alcohol content using a barplot
sns.set_style("whitegrid")
sns.barplot(x='quality', y='alcohol', data=wine, palette='Spectral')
plt.title('Quality vs Alcohol', size=20)

[472]: Text(0.5,1,'Quality vs Alcohol')
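
By default `sns.barplot` draws the mean of the y variable for each x category, with an error bar around it. The numbers behind the bars in this section can therefore be tabulated with a group-by. A small sketch with illustrative values (the `toy` frame is not the wine data):

```python
import pandas as pd

toy = pd.DataFrame({
    "quality": [5, 5, 6, 6, 7],
    "alcohol": [9.4, 9.8, 10.0, 11.0, 12.5],
    "sulphates": [0.56, 0.68, 0.60, 0.70, 0.80],
})

# One row per quality score, one column per feature: these are the bar heights.
bar_heights = toy.groupby("quality")[["alcohol", "sulphates"]].mean()
print(bar_heights)
```

Applied to `wine`, the same call with the full list of feature columns would give all ten charts' values in one table.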

0.5 5. PreProcessing Data for Building Machine Learning Algorithm

[473]: # Creating a binary classification for the prediction variable.
# Classifying wine as good or bad by setting a cut-off on quality:
# scores up to 6 fall in (2, 6.5] -> 'bad'; scores 7 and 8 fall in (6.5, 8] -> 'good'
bins = (2, 6.5, 8)
review = ['bad', 'good']
wine['quality'] = pd.cut(wine['quality'], bins = bins, labels = review)
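
`pd.cut` uses right-closed intervals by default, so the bins `(2, 6.5, 8)` define the ranges (2, 6.5] and (6.5, 8]. A quick check of how each score that appears in the data (3 to 8, per the `describe()` output) is labelled:

```python
import pandas as pd

scores = pd.Series([3, 4, 5, 6, 7, 8])
labels = pd.cut(scores, bins=(2, 6.5, 8), labels=["bad", "good"])

for score, label in zip(scores, labels):
    print(score, label)
```

Scores 3 to 6 come out as `bad` and scores 7 and 8 as `good`.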

[474]: #assigning labels to quality variable

label_quality = LabelEncoder()

#0 is bad review and 1 is good review

wine['quality'] = label_quality.fit_transform(wine['quality'])

#countplot for quality variable

sns.countplot(x='quality', data=wine, palette='colorblind')

[474]: <matplotlib.axes._subplots.AxesSubplot at 0x61547970>

[475]: # replacing the categorical country variable with integer codes
country_to_nums = {'country': {'UK': 1, 'Italy': 2, 'Spain':3}}
wine.replace(country_to_nums, inplace=True)
wine.head()

[475]: id fixed acidity volatile acidity citric acid residual sugar \

0 1 7.4 0.70 0.00 1.9
1 2 7.8 0.88 0.00 2.6
2 3 7.8 0.76 0.04 2.3
3 4 11.2 0.28 0.56 1.9
4 5 7.4 0.70 0.00 1.9

flavonoids chlorides free sulfur dioxide total sulfur dioxide density \

0 0.590 0.076 11.0 34.0 0.9978
1 0.754 0.098 25.0 67.0 0.9968
2 0.688 0.092 15.0 54.0 0.9970
3 0.459 0.075 17.0 60.0 0.9980
4 0.590 0.076 11.0 34.0 0.9978

pH sulphates alcohol country quality

0 3.51 0.56 9.4 1 0
1 3.20 0.68 9.8 2 0
2 3.26 0.65 9.8 2 0
3 3.16 0.58 9.8 3 0
4 3.51 0.56 9.4 1 0
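
Mapping countries to 1, 2, 3 gives the models an ordering (UK < Italy < Spain) that the data does not imply. One common alternative, shown here as a sketch rather than as what this notebook does, is one-hot encoding with `pd.get_dummies`:

```python
import pandas as pd

toy = pd.DataFrame({
    "alcohol": [9.4, 9.8, 9.8],
    "country": ["UK", "Italy", "Spain"],
})

# Each country becomes its own 0/1 column; no order is implied.
encoded = pd.get_dummies(toy, columns=["country"], dtype=int)
print(encoded.columns.tolist())
print(encoded)
```

Tree-based models are less sensitive to this choice than logistic regression, so the integer codes may matter more for some of the models below than for others.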

[476]: # dividing the dataset into the prediction variable and the feature variables

x = wine.drop('quality', axis = 1)
y = wine['quality']

0.6 6. PCA Analysis

[549]: # Scaling the data using StandardScaler
sc = StandardScaler()
x = sc.fit_transform(x)

[550]: # Performing PCA

pca = PCA()
x_pca = pca.fit_transform(x)

#plot the graph to find the principal components

sns.set_style("whitegrid")
plt.figure(figsize=(7,7))
plt.plot(np.cumsum(pca.explained_variance_ratio_), 'ro-')

[550]: [<matplotlib.lines.Line2D at 0x5fc81fb0>]

[551]: # From the plot above, 9 principal components account for about 90% of the variation in the data.

# We can pick the first 9 components for our prediction.

pca_n = PCA(n_components=9)
x_pca_new = pca_n.fit_transform(x)
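
Reading the component count off the plot can be cross-checked numerically: take the cumulative explained-variance ratio and find the first position where it reaches the target. The sketch below uses random synthetic data (so the resulting count is not the 9 found above) and the 90% threshold mentioned in the cell comment:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in: 200 rows, 6 columns; the last three are noisy copies of the first three.
base = rng.normal(size=(200, 3))
data = np.hstack([base, base + 0.1 * rng.normal(size=(200, 3))])

scaled = StandardScaler().fit_transform(data)
pca = PCA().fit(scaled)

cumulative = np.cumsum(pca.explained_variance_ratio_)
# First position where the running total reaches 90%, converted to a count.
n_components = int(np.argmax(cumulative >= 0.90)) + 1
print(n_components, cumulative[n_components - 1])
```

scikit-learn also accepts a fraction directly, for example `PCA(n_components=0.90)`, which keeps enough components to explain that share of the variance.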

[552]: # Splitting the data into train and test

x_train, x_test, y_train, y_test = train_test_split(x_pca_new, y, test_size = 0.2)

[553]: print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

(1203, 9)
(301, 9)
(1203,)
(301,)
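
The split above is random on each run, so the confusion matrices in the following sections will differ between executions. If repeatable numbers are wanted, `train_test_split` accepts `random_state`, and `stratify` keeps the good/bad ratio the same in both parts. A sketch on synthetic labels (the seed 42 and the roughly 10% positive rate are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1504, 9))    # same shape as x_pca_new
y = np.zeros(1504, dtype=int)
y[:150] = 1                       # made-up labels, about 10% "good"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape, y_test.mean())
```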

0.7 7. Logistic Regression

[554]: lr = LogisticRegression()
lr.fit(x_train, y_train)
lr_predict = lr.predict(x_test)

c:\users\gautham\python\python36-32\lib\site-packages\sklearn\linear_model\logistic.py:433: FutureWarning: Default solver
will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
FutureWarning)

[555]: lr_conf_matrix = confusion_matrix(y_test, lr_predict)

lr_acc_score = accuracy_score(y_test, lr_predict)
print(lr_conf_matrix)
print(lr_acc_score*100)

[[261 10]
[ 18 12]]
90.69767441860465
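
In the cells above, the scaler and PCA are fitted on all rows before the split, so the test rows influence the transformation. A common way to avoid that, sketched here on synthetic data from `make_classification` (not the wine file), is to put the steps in a pipeline so that each is fitted on the training folds only:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced two-class data standing in for the wine features.
X, y = make_classification(
    n_samples=500, n_features=14, n_informative=6,
    weights=[0.9, 0.1], random_state=0,
)

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=9),
    LogisticRegression(solver="lbfgs", max_iter=1000),
)

# 5-fold cross-validation; scaler and PCA are refitted inside every fold.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```

Passing `solver` explicitly also addresses the `FutureWarning` printed above, which asks for a solver to be specified.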

0.8 8. Decision Tree Classifier

[556]: dt = DecisionTreeClassifier()
dt.fit(x_train,y_train)
dt_predict = dt.predict(x_test)

[557]: dt_conf_matrix = confusion_matrix(y_test, dt_predict)

dt_acc_score = accuracy_score(y_test, dt_predict)
print(dt_conf_matrix)
print(dt_acc_score*100)

[[245 26]
[ 14 16]]
86.71096345514951

0.9 9. Random Forest Classifier

[558]: rf = RandomForestClassifier()
rf.fit(x_train, y_train)
rf_predict=rf.predict(x_test)

c:\users\gautham\python\python36-32\lib\site-packages\sklearn\ensemble\forest.py:246: FutureWarning: The default value of
n_estimators will change from 10 in version 0.20 to 100 in 0.22.
"10 in version 0.20 to 100 in 0.22.", FutureWarning)

[559]: rf_conf_matrix = confusion_matrix(y_test, rf_predict)

rf_acc_score = accuracy_score(y_test, rf_predict)
print(rf_conf_matrix)
print(rf_acc_score*100)

[[264 7]
[ 17 13]]
92.02657807308971

0.10 10. Accuracy Comparison

[560]: ## Creating a dataframe to compare the accuracy of the three algorithms.

acScore = pd.DataFrame()
acScore['Model'] = ['Logistic Regression', 'Decision Tree', 'Random Forest Classifier']

ac1 = lr_acc_score*100
ac2 = dt_acc_score*100
ac3 = rf_acc_score*100
acScore['Score'] = [ac1,ac2,ac3]
acScore

[560]: Model Score

0 Logistic Regression 90.697674
1 Decision Tree 86.710963
2 Random Forest Classifier 92.026578

[561]: # plotting the accuracy of algorithms using a barplot

sns.set_style("whitegrid")
plot = sns.catplot(x='Model', y='Score', kind='bar', data=acScore, palette='Spectral')

# plot.ax gives the axis object
# plot.ax.patches gives the list of bars, which can be accessed by index starting at 0

for i, bar in enumerate(plot.ax.patches):
    h = bar.get_height()
    plot.ax.text(
        i,                    # bar index (x coordinate of text)
        h+10,                 # y coordinate of text
        '{}'.format(int(h)),  # label: accuracy truncated to an integer
        ha='center',
        va='center',
        fontweight='bold',
        size=20)

0.11 11. Conclusion

The visualisation shows that Italy accounts for the largest number of wines in the dataset, followed by Spain and the UK.
In predicting wine quality on this train/test split, the random forest classifier produced the best result with about 92.0% accuracy, followed by logistic regression with 90.7% and the decision tree classifier with 86.7%.
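
Accuracy alone can flatter a model when one class dominates. The three confusion matrices printed above make this checkable: each test set held 271 'bad' and 30 'good' wines, so labelling everything 'bad' would already score about 90%. The sketch below recomputes accuracy and adds precision and recall for the 'good' class from those printed matrices; the helper `summarize` is a hypothetical name, not part of the notebook.

```python
# Confusion matrices copied from the outputs above.
# Rows are true labels, columns are predictions, both in the order [bad, good].
matrices = {
    "Logistic Regression": [[261, 10], [18, 12]],
    "Decision Tree": [[245, 26], [14, 16]],
    "Random Forest": [[264, 7], [17, 13]],
}

def summarize(cm):
    """Return accuracy, precision/recall for 'good', and the all-'bad' baseline."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "precision_good": tp / (tp + fp),
        "recall_good": tp / (tp + fn),
        "all_bad_baseline": (tn + fp) / total,
    }

results = {name: summarize(cm) for name, cm in matrices.items()}
for name, r in results.items():
    print(name, {k: round(v, 3) for k, v in r.items()})
```

On these numbers, recall for 'good' wines is 0.40 for logistic regression, about 0.53 for the decision tree, and about 0.43 for the random forest, so a sizeable share of good wines is missed despite the high accuracy.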
