Optimizing Wine Quality Prediction: A Machine Learning Approach Using Chemical Properties

Submitted by:

Ballad, Jeremiah Khalil T.

Beig, Zyrin B.
Dela Cruz, John Benedict C.
Garrido, Kyla C.
Manalese, Jan Raya Altaire P.
Origenes, Joshua Paul Andrae A.

DS100-4 / B12
1st Term AY 2024 – 2025
INTRODUCTION
The physicochemical properties of wine play a crucial role in determining its quality, as
perceived by human tasters and analyzed through data-driven methodologies. Research conducted by
Cortez et al. (2009) and Nebot et al. (2018) highlights the application of various machine learning
techniques to analyze and predict wine quality based on these properties, ultimately aiming to support
the wine industry by providing objective and scalable assessments.
In their study, Cortez et al. (2009) employed multiple regression, neural networks, and support
vector machines (SVM) on the Vinho Verde dataset, concluding that certain chemical properties—such
as alcohol content, volatile acidity, and residual sugar—significantly influence wine ratings. This
research demonstrates that by identifying and modeling the relationships between chemical variables
and quality, machine learning models can predict wine quality with promising accuracy.
In contrast, Nebot et al. (2018) used fuzzy logic techniques to capture the intricacies of wine
preferences and found that alcohol content, fixed acidity, free sulfur dioxide, and volatile acidity are
critical indicators of wine quality. Their study corroborates that specific physicochemical
properties consistently affect quality, even across different modeling approaches. The interpretability
of the fuzzy model proved particularly advantageous for industry applications, where understanding
the influence of each variable is essential.
Additionally, Angus (2020) explored the use of neural networks to automate wine scoring,
highlighting how the chemical properties of wine can predict sensory scores without the need for
human tasters. Collectively, these studies suggest that machine learning and data
mining methodologies offer reliable and insightful approaches for assessing wine quality, benefiting
production processes and facilitating objective quality assessments. They also
underscore the significance of specific chemical properties in wine quality, supporting the
development of machine learning models as viable tools for the wine industry.
The quality of wine arises from a delicate balance of chemical elements that collectively shape
its taste, texture, and aging potential. According to Volschenk et al. (2017), one key factor is fixed
acidity, which consists mainly of tartaric and malic acids found naturally in grapes. This type of acidity
remains stable throughout fermentation, lending structure and a crispness to wine. This is especially
valued in white wines, where it enhances freshness and balances sweetness (Payan et al., 2023).
Another form of acidity is volatile acidity, primarily acetic acid, which can add complexity but,
if present in excess, gives the wine an undesirable vinegar-like taste.
In addition to the natural acidity from tartaric and malic acids, winemakers may introduce small
amounts of citric acid to boost acidity, which adds a bright, fresh note that sharpens the wine’s edge.
Another key component shaping a wine’s profile is residual sugar. It is the sugar remaining after
fermentation that defines its level of sweetness. As Gadd (2021) notes, wines low in residual sugar are
considered dry, while those with higher levels are sweeter, catering to diverse palates and preferences.
Additionally, salt, specifically sodium chloride, also influences a wine's taste; as Logothetis and Walker
(2010) point out, it adds a subtle salinity that enhances texture, though an excess can disrupt the
wine’s balance.
Beyond the components that shape a wine's taste, it also contains elements with antimicrobial
properties. Sulfur dioxide, for instance, is commonly used as a preservative to prevent oxidation and
control microbial growth, but, as Grogan (2015) points out, overuse can compromise aroma and taste,
making balance essential. Additionally, the wine’s density reflects its sugar and alcohol content, which
affects the body and mouthfeel. Furthermore, pH, generally between 3 and 4, plays a crucial role by
influencing acidity, which helps maintain freshness and balance between sweetness and alcohol.
Finally, Granuzzo et al. (2023) found that sulphates, acting as antioxidants, stabilize the wine, while
carefully managed alcohol levels contribute body and warmth, qualities especially valued in
fuller-bodied red wines. Together, these elements allow winemakers to craft wines with varied profiles,
catering to diverse tastes and preferences.
The wine quality dataset by Cortez et al. (2009) consists of 13 variables in total. There are 11
numerical variables representing various chemical properties, one categorical variable indicating wine
type (red or white), and one discrete variable for quality rating, with each sample evaluated on a scale
from 0 (worst) to 10 (best).
The objective of this project is to develop a machine learning model capable of predicting wine
quality based on its chemical properties. A prediction system like this could also be helpful for
marketing or oenology student training (Cortez et al., 2009). Furthermore, this study aims to identify
which chemical properties most significantly impact wine quality and compare the influence of these
chemical properties on quality between red and white wines. It is also necessary to evaluate the
predictive performance of the model to ensure its accuracy and reliability. Ultimately, this project
aspires to provide valuable insights into the relationship between chemical composition and perceived
quality in wines, enhancing the understanding of what defines exceptional wines in the competitive
market.

MODEL DESCRIPTION

Figure 1. Machine learning project procedure flowchart

Data Loading and Preparation

The dataset was sourced from the Kaggle repository and includes records of Portuguese Vinho
Verde wine, with 1,599 samples of red wine and 4,898 samples of white wine, totaling 6,497 entries.
Data loading was accomplished using the Pandas library in Python, with the pd.read_csv() function
to read the dataset. Once loaded, the dataset was divided into two distinct DataFrames for red and
white wines to facilitate separate analyses for each type. The initial data structure, types, and
completeness were assessed using the info() function, confirming that there were no missing values.
Additionally, Matplotlib and Seaborn libraries were utilized for visualization to allow for subsequent
visual analysis of wine characteristics by type.
Exploratory Data Analysis (EDA)
Exploratory Data Analysis (EDA) was conducted to gain insights into the distribution of the
dataset’s features and to assess relationships between variables. Histograms were generated to
represent the distribution of physicochemical properties and quality scores for red and white wines
using the pandas DataFrame.hist() function (see Appendix), allowing visual assessment of each
variable's spread and central tendency. Skewness was calculated for all numeric variables using the
SciPy skew() function to identify any significant asymmetries, with an absolute skewness greater
than 1 set as the threshold for logarithmic transformation.
Furthermore, Pearson correlation coefficients were calculated, and heatmaps were generated to
assess the strength and direction of correlations between wine quality and other features in both the
red and white wine datasets.
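For reference, the skewness statistic can be written out explicitly. The form below assumes the default settings of scipy.stats.skew (bias=True), the function imported in the appendix; x_i are the n observed values of one variable and \bar{x} is their mean. A value of 0 indicates a symmetric distribution, and positive values indicate a longer right tail.

```latex
g_1 = \frac{m_3}{m_2^{3/2}},
\qquad
m_k = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^k
```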
Data Preprocessing
Since no missing values were detected in the dataset, imputation was deemed unnecessary.
Data preprocessing focused primarily on transforming skewed variables and standardizing features to
better meet model assumptions. Variables with an absolute skewness greater than 1 were log-transformed to
approximate a normal distribution, after which the selected features were standardized to ensure a
consistent scale. Additionally, feature selection was conducted based on correlation strength. Only features
with a Pearson correlation coefficient of |r| ≥ 0.2 relative to the quality variable were retained, reducing
dimensionality and keeping the predictors with the clearest linear relationship to quality for the
regression model.
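The transformation and standardization steps described above can be sketched on synthetic data. This is a minimal illustration, not the project code: the helper name reduce_skew, the max_rounds safety limit, and the lognormal stand-in column are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def reduce_skew(series: pd.Series, threshold: float = 1.0, max_rounds: int = 5) -> pd.Series:
    """Apply log1p repeatedly until |skewness| <= threshold (or max_rounds is reached)."""
    out = series.astype(float)
    for _ in range(max_rounds):
        if abs(out.skew()) <= threshold:
            break
        out = np.log1p(out)  # log(1 + x), defined for x >= 0
    return out


# Synthetic, non-negative, right-skewed stand-in for a column such as chlorides
rng = np.random.default_rng(0)
raw = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=2000), name="synthetic_feature")

transformed = reduce_skew(raw)

# Standardize to zero mean and unit variance
scaled = StandardScaler().fit_transform(transformed.to_frame()).ravel()

print("skew before:", round(raw.skew(), 3))
print("skew after: ", round(transformed.skew(), 3))
print("mean, std after scaling:", round(scaled.mean(), 3), round(scaled.std(), 3))
```

Capping the number of rounds is a design choice for the sketch only; it guards against an endless loop if a column never drops below the threshold.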
Model Training and Testing
The study employed a multiple linear regression (MLR) model to predict wine quality based
on chemical and sensory attributes. The dataset was split into training (80%) and testing (20%) subsets
to allow model training and subsequent evaluation. The multiple linear regression model aimed to
predict the target variable (wine quality) using the selected features. Model predictions were
evaluated against both actual and synthetic data samples, with the ultimate goal of establishing a
reliable prediction model for wine quality based on physicochemical properties.
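As an illustration of the 80/20 split and model fit, the sketch below uses synthetic data. It differs from the appendix code in two ways that are assumptions of this example: the scaler is fitted on the training subset only (via a scikit-learn pipeline), and random_state is fixed so the split is reproducible.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: three features, one of which drives the target
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 5.8 + 0.4 * X[:, 2] + rng.normal(scale=0.5, size=500)

# 80% training, 20% testing; random_state makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The pipeline fits the scaler on the training data only, then the regression
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)

print("train size:", len(X_train), "test size:", len(X_test))
print("R2 (train):", round(model.score(X_train, y_train), 3))
print("R2 (test): ", round(model.score(X_test, y_test), 3))
```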
Dataset Summary
The dataset consists of chemical and sensory attributes, including fixed acidity, volatile acidity,
residual sugar, chlorides, sulfur dioxide levels (free and total), density, pH, sulphates, and alcohol
content. Quality scores, as determined by expert sensory evaluations, serve as the target variable. For
analysis purposes, the quality scores were grouped into three distinct categories: Low Quality (scores
0-4), Average Quality (scores 5-7), and High Quality (scores 8-10). This classification facilitated a
structured approach to examining the impact of each feature on wine quality across different quality
categories.
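The three-tier grouping can be expressed with pandas' cut function. This is a minimal sketch; the function name quality_tier and the short tier labels are illustrative choices, not taken from the project code.

```python
import pandas as pd


def quality_tier(scores: pd.Series) -> pd.Series:
    """Map integer quality scores (0-10) to Low (0-4), Average (5-7), High (8-10)."""
    return pd.cut(
        scores,
        bins=[-1, 4, 7, 10],  # right-closed intervals: (-1, 4], (4, 7], (7, 10]
        labels=["Low", "Average", "High"],
    )


example_scores = pd.Series([3, 4, 5, 6, 7, 8, 9])
tiers = quality_tier(example_scores)
print(tiers.tolist())
# ['Low', 'Low', 'Average', 'Average', 'Average', 'High', 'High']
```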
This machine learning workflow facilitated an organized investigation of the physicochemical
characteristics of wine samples and their relationship with quality, culminating in a predictive model
designed to assess wine quality based on chemical composition and sensory metrics.

RESULTS AND DISCUSSION

Figure 2. White (left) and red (right) wine variable distributions

Figure 2 shows the histograms of the numerical variables in the white and red wine datasets,
providing valuable insights into their distributions. Many variables exhibit right-skewed distributions,
such as the chlorides in white wine and sulphates in red wine, indicating a concentration of data points
towards lower values. On the other hand, the density and pH variables for both wine types exhibit
approximately normal distributions, indicating a more balanced distribution of values. The quality
variable, a discrete variable representing the wine quality rating, is also approximately normal, having
a central tendency around 5 to 6. This suggests that most wines in the dataset are of moderate quality.
Table 1. Skewness values of white and red wine variables before and after log transformations

Variable             | White (before) | White (after) | Red (before) | Red (after)
fixed acidity        |       0.647553 |      0.647553 |     0.981829 |    0.981829
volatile acidity     |       1.576497 |      0.872987 |     0.670962 |    0.670962
citric acid          |       1.281528 |      0.612170 |     0.318039 |    0.318039
residual sugar       |       1.076764 |      0.004297 |     4.536395 |    0.958917
chlorides            |       5.021792 |      0.984836 |     5.675017 |    0.961571
free sulfur dioxide  |       1.406314 |     -0.828047 |     1.249394 |   -0.097307
total sulfur dioxide |       0.390590 |      0.390590 |     1.514109 |   -0.035712
density              |       0.977474 |      0.977474 |     0.071221 |    0.071221
pH                   |       0.457642 |      0.457642 |     0.193502 |    0.193502
sulphates            |       0.976894 |      0.976894 |     2.426393 |    0.877375
alcohol              |       0.487193 |      0.487193 |     0.860021 |    0.860021
quality              |       0.155749 |      0.155749 |     0.217597 |    0.217597

Table 1 above lists the skewness of each variable, before and after treatment, for
both white and red wine. Consistent with the histograms, the calculated values show that several
variables are strongly positively skewed and therefore far from normally distributed. To
make the data more suitable for regression modeling, variables with an absolute skewness above 1
underwent logarithmic transformations, which reduce skewness and dampen the influence of outliers.
Variables with an absolute skewness below this threshold were treated as approximately normal and
left unchanged. For transformed variables that are also significantly correlated with quality, as
seen later on, the contribution to the model prediction is on a logarithmic scale.

Figure 3. Correlation heat map for white (left) and red (right) wine variables to quality

Figure 3 displays the heat maps for the correlations of white and red wine parameters with
the quality variable. In the color map used, deep red represents a strong positive correlation,
deep blue a strong negative correlation, and neutral gray a very weak or absent correlation.
For each wine type, only the parameters whose correlation coefficient with quality has an absolute
value of 0.2 or higher were considered significant and retained. These are, for white wine,
chlorides, density, and alcohol and, for red wine, volatile acidity, citric acid, sulphates, and
alcohol. Filtering the features in this way yields a much simpler regression model for wine quality
and reduces the noise introduced by weakly related variables.
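The correlation filter described above can be sketched as a small helper applied to toy data. The function name select_by_correlation and the synthetic columns are assumptions made for this example; only the 0.2 threshold comes from the text.

```python
import numpy as np
import pandas as pd


def select_by_correlation(df: pd.DataFrame, target: str, threshold: float = 0.2) -> list:
    """Return feature names whose |Pearson r| with `target` is at least `threshold`."""
    r = df.select_dtypes(include="number").corr()[target].drop(target)
    return r[r.abs() >= threshold].index.tolist()


# Toy data: one feature related to the target, one unrelated
rng = np.random.default_rng(1)
n = 1000
alcohol = rng.normal(size=n)
noise_feature = rng.normal(size=n)
quality = 0.5 * alcohol + rng.normal(size=n)

toy = pd.DataFrame({"alcohol": alcohol, "noise_feature": noise_feature, "quality": quality})
selected = select_by_correlation(toy, "quality")
print(selected)
```

In this toy setup only the related feature should pass the filter, since the unrelated column has a correlation with the target close to zero.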
Figure 4. Predicted vs. true quality for the multiple regression model for white (left) and red (right) wine

Using Scikit-Learn’s LinearRegression estimator, multiple linear regression models predicting the
quality of white and red wine were fitted on the features identified as significant. The regression
equations for the two models are given below (Note: VA = volatile acidity, CA = citric acid).
Features that were originally highly skewed and were log-transformed (as log(1 + x), per the
appendix code) appear inside a log function in the equations. Because all features were
standardized before fitting, each bracketed term denotes the standardized value of that feature, so
each coefficient is the change in predicted quality per one standard deviation of the feature.
white_quality = 5.8721 – 0.0586 × log[chlorides] + 0.0756 × [density] + 0.4133 × [alcohol]
red_quality = 5.6278 – 0.2100 × [VA] – 0.0210 × [CA] + 0.1227 × log[sulphates] + 0.3394 × [alcohol]
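As a worked example, the two equations can be evaluated directly. The sketch assumes that every input is a standardized value (a z-score), in line with the StandardScaler step in the appendix; the function and argument names are illustrative.

```python
def predict_white_quality(z_log_chlorides: float, z_density: float, z_alcohol: float) -> float:
    """White wine equation; inputs are assumed to be standardized feature values."""
    return 5.8721 - 0.0586 * z_log_chlorides + 0.0756 * z_density + 0.4133 * z_alcohol


def predict_red_quality(z_va: float, z_ca: float, z_log_sulphates: float, z_alcohol: float) -> float:
    """Red wine equation; inputs are assumed to be standardized feature values."""
    return 5.6278 - 0.2100 * z_va - 0.0210 * z_ca + 0.1227 * z_log_sulphates + 0.3394 * z_alcohol


# A white wine with average values for every feature (all z-scores 0)
print(round(predict_white_quality(0, 0, 0), 4))   # 5.8721
# The same wine with alcohol one standard deviation above average
print(round(predict_white_quality(0, 0, 1), 4))   # 6.2854
# A red wine with volatile acidity one standard deviation above average
print(round(predict_red_quality(1, 0, 0, 0), 4))  # 5.4178
```

Under this reading, the intercept is the predicted quality of an average wine, and alcohol has the largest per-standard-deviation effect in both equations.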

Figure 4 above displays scatterplots of the predicted versus true quality values for the
entries in the test sets. The solid diagonal line denotes a perfect regression model, i.e.,
predicted value equals true value. The data points do not closely follow the solid line: the white
wine predictions show a flatter trend, while the red wine predictions are more slanted but still
deviate from the line. Notably, only the modal values (5 and 6) are predicted with reasonable
precision, which illustrates the limitation of both models. A different model, such as logistic
regression for a dichotomous response (i.e., bad versus good quality), might be more appropriate
for this dataset; however, because quality is a discrete variable from 0 to 10 rather than a simple
good/bad label, we did not use that model. The plot indicates a weak fit of the regression model to
the actual data, which the evaluation metrics below confirm.
Table 2. Regression model evaluation metrics

Metric     | White wine model | Red wine model
R² (test)  |         0.212749 |       0.341379
R² (train) |         0.192415 |       0.332091
MSE        |         0.609705 |       0.423061
RMSE       |         0.780836 |       0.641312
MAE        |         0.619060 |       0.503290

Table 2 lists the evaluation metrics for the white and red wine regression models. For
both models, the relatively low coefficient of determination (R²) indicates weak
predictive power, consistent with the inference from the visualization above. Also included are the
mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE), which
indicate sizable but tolerable deviations between the predicted and true values. The R² score for
red wine is higher than that for white wine, consistent with the steeper trend seen in the
scatterplot. The difference between the testing and training R² is small for both models,
suggesting that neither model is overfitted to the training data; performance on a new set of data
should therefore be similar, though still limited by the low R².
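The metrics in Table 2 follow standard definitions, sketched below with NumPy on a toy example (the helper name regression_metrics and the toy values are illustrative). The last lines check the reported values for internal consistency: RMSE should equal the square root of MSE. This holds for the white wine model, whereas for the red wine model the square root of the reported MSE is about 0.650 rather than the reported 0.641. One possible explanation, offered here only as an assumption, is that the two figures came from different runs, since the appendix calls train_test_split without a fixed random_state.

```python
import numpy as np


def regression_metrics(y_true, y_pred) -> dict:
    """Compute R2, MSE, RMSE, and MAE from true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = float(np.sum(residuals ** 2))                    # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))     # total sum of squares
    mse = ss_res / len(y_true)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "MAE": float(np.mean(np.abs(residuals))),
    }


toy = regression_metrics([5, 6, 7, 5], [5.5, 6.0, 6.0, 5.5])
print(toy)

# Consistency check on the values reported in Table 2: (MSE, RMSE) per model
reported = {"white": (0.609705, 0.780836), "red": (0.423061, 0.641312)}
for wine, (mse, rmse) in reported.items():
    print(wine, "sqrt(MSE) =", round(mse ** 0.5, 6), "| reported RMSE =", rmse)
```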

CONCLUSION
The development of this machine learning model has highlighted the potential of data-driven
approaches in predicting wine quality based on measurable chemical properties. Through
rigorous data exploration and analysis, several key variables were identified as significant indicators of
quality in both red and white wines. Among these, alcohol content was found to have the strongest positive
association with quality ratings, suggesting that higher alcohol levels may enhance certain desirable
sensory attributes. In contrast, volatile acidity, chlorides, and residual sugar were generally associated
with lower quality scores, implying that excessive levels of these compounds can detract from the
wine’s balance and overall flavor profile.
The process of data preprocessing, which included normalization, log transformation, and the
filtering of highly skewed data, ensured that the model was well-equipped to handle variability within
the dataset. Selecting only the most relevant features further streamlined the model, enhancing its
predictive capacity while minimizing unnecessary complexity. This focused approach not only
improved the model’s accuracy but also facilitated an interpretative framework for understanding the
influence of each variable on wine quality. In addition to identifying the primary factors affecting
quality, the analysis grouped wines into low, average, and high-quality tiers, providing an accessible
way to interpret the results and making it easier to derive actionable insights. This classification can
serve as a valuable tool for winemakers, offering guidance on the ideal chemical compositions that
may yield higher-quality wines. The findings also underscore the potential for machine learning
applications to standardize quality assessments in the wine industry, reducing reliance on subjective
sensory evaluations and supporting consistent quality control.
Future research could expand this work by incorporating additional sensory data and testing
the model on a more diverse range of wine types and varieties. Additionally, integrating more
advanced machine learning algorithms may uncover deeper insights into the complex relationships
between wine composition and perceived quality. The adaptability of this model positions it as a useful
asset for wine producers and researchers, who can leverage these insights to refine production
processes and tailor wine profiles to meet evolving consumer tastes. This project illustrates the power
of machine learning in offering precise, scalable solutions for quality prediction, with the potential to
transform quality assessment practices in oenology.

REFERENCES
Gadd, D. (2021, December 13). Understanding the dryness scale of wines. Wine Wisdoms.
https://winewisdoms.com/article/understanding-dryness-scale-of-wines
Granuzzo, S., Righetto, F., Peggion, C., Bosaro, M., Frizzarin, M., Antoniali, P., Sartori, G., & Lopreiato,
R. (2023). Sulphate uptake plays a major role in the production of sulphur dioxide by yeast cells
during oenological fermentations. Fermentation, 9(3), 280.
https://doi.org/10.3390/fermentation9030280
Grogan, K. A. (2015). The value of added sulfur dioxide in French organic wine. Agricultural and Food
Economics, 3(1). https://doi.org/10.1186/s40100-015-0038-1
Logothetis, S., & Walker, G. (2010). Influence of sodium chloride on wine yeast fermentation
performance. International Journal of Wine Research, 35. https://doi.org/10.2147/ijwr.s10889
Payan, C., Gancel, A., Jourdes, M., Christmann, M., & Teissedre, P. (2023). Wine acidification
methods: a review. OENO One, 57(3), 113–126.
https://doi.org/10.20870/oeno-one.2023.57.3.7476
Volschenk, H., Van Vuuren, H., & Viljoen-Bloom, M. (2017). Malic acid in wine: Origin, function and
metabolism during vinification. South African Journal of Enology and Viticulture, 27(2).
https://doi.org/10.21548/27-2-1613
APPENDIX: Python Code

# Import Python libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import skew
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, root_mean_squared_error, mean_absolute_error

# Load csv file

data = pd.read_csv('wine-quality-white-and-red.csv')
data

# Split red and white wine data

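# .copy() creates independent DataFrames, so the in-place log transforms
# applied later do not trigger pandas' SettingWithCopyWarning
whitewine_data = data[data['type'] == 'white'].copy()
redwine_data = data[data['type'] == 'red'].copy()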

Exploratory Data Analysis

# Show information on datatypes, columns, and number of entries

whitewine_data.info()
redwine_data.info()

# Show statistics on different columns

whitewine_data.describe()
redwine_data.describe()

# Show distributions of other numerical variables

# Note that some of the distributions are right-skewed, thus needing treatment to normalize
whitewine_data.hist(bins=20, figsize=(14,10), color='pink')
plt.suptitle('White Wine Variable Distributions');
redwine_data.hist(bins=20, figsize=(14,10), color='red')
plt.suptitle('Red Wine Variable Distributions');

# Calculate skewness
white_numeric = whitewine_data.select_dtypes(include=[np.number]).columns
white_skewval = whitewine_data[white_numeric].apply(skew)
red_numeric = redwine_data.select_dtypes(include=[np.number]).columns
red_skewval = redwine_data[red_numeric].apply(skew)
white_skewval, red_skewval

# Analyze correlation between X variables vs. quality

# White wine: correlation of each numeric feature with quality
white_num = whitewine_data.select_dtypes(include=[np.number])
white_correl = white_num.corr()
white_quality_correl = white_correl['quality'].drop('quality')
plt.figure(figsize=(8,6))
sns.heatmap(pd.DataFrame(white_quality_correl), annot=True, cmap='coolwarm', center=0, fmt='.2f',
            linewidths=0.5, cbar_kws={'shrink': .8})
plt.title('Correlation Heat Map for White Wine Parameters to Quality');

# Red wine: correlation of each numeric feature with quality
red_num = redwine_data.select_dtypes(include=[np.number])
red_correl = red_num.corr()
red_quality_correl = red_correl['quality'].drop('quality')
plt.figure(figsize=(8,6))
sns.heatmap(pd.DataFrame(red_quality_correl), annot=True, cmap='coolwarm', center=0, fmt='.2f',
            linewidths=0.5, cbar_kws={'shrink': .8})
plt.title('Correlation Heat Map for Red Wine Parameters to Quality');

Preprocessing

# Identify highly skewed variables, then normalize using log transformation

# Iteration is necessary for variables with really high skewness:
# log1p is reapplied until the absolute skewness is at most 1
white_highskew = white_skewval[abs(white_skewval) > 1].index
for col in white_highskew:
    while abs(whitewine_data[col].skew()) > 1:
        whitewine_data[col] = np.log1p(whitewine_data[col])
whitewine_data[white_numeric].apply(skew)

red_highskew = red_skewval[abs(red_skewval) > 1].index
for col in red_highskew:
    while abs(redwine_data[col].skew()) > 1:
        redwine_data[col] = np.log1p(redwine_data[col])
redwine_data[red_numeric].apply(skew)

# Filter variables with significant correlation

white_highcorrel = white_correl['quality'][abs(white_correl['quality']) >= 0.2].index
white_filtered = whitewine_data[white_highcorrel]
red_highcorrel = red_correl['quality'][abs(red_correl['quality']) >= 0.2].index
red_filtered = redwine_data[red_highcorrel]
white_highcorrel, red_highcorrel

# Classify variables as X or y
X_white = white_filtered.drop('quality', axis=1)
y_white = white_filtered['quality']
X_red = red_filtered.drop('quality', axis=1)
y_red = red_filtered['quality']

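# Standardize variables (one scaler per wine type, so each fitted scaler stays reusable)
white_scaler = StandardScaler()
red_scaler = StandardScaler()
X_white_scaled = white_scaler.fit_transform(X_white)
X_red_scaled = red_scaler.fit_transform(X_red)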

# Split data into training and testing sets

Xw_train, Xw_test, yw_train, yw_test = train_test_split(X_white_scaled, y_white, test_size=0.2)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(X_red_scaled, y_red, test_size=0.2)

Multiple Linear Regression Model

# Creating linear regression model

# White wine model
linreg_white = LinearRegression()
white_model = linreg_white.fit(Xw_train,yw_train)
white_target = linreg_white.predict(Xw_test)
plt.figure()
plt.scatter(yw_test, white_target, color='pink')
plt.plot(range(10), range(10), color='black')
plt.xlabel('True Values')
plt.ylabel('Predictions');

# White wine model evaluation metrics

print('R2 (test):', white_model.score(Xw_test, yw_test))
print('Coefficients:', linreg_white.coef_)
print('R2 (train):', white_model.score(Xw_train, yw_train))
print('Intercept:', linreg_white.intercept_)
print('MSE:', mean_squared_error(yw_test, white_target))
print('RMSE:', root_mean_squared_error(yw_test, white_target))
print('MAE:', mean_absolute_error(yw_test, white_target))

# Red wine model

linreg_red = LinearRegression()
red_model = linreg_red.fit(Xr_train,yr_train)
red_target = linreg_red.predict(Xr_test)
plt.figure()
plt.scatter(yr_test, red_target, color='red')
plt.plot(range(10), range(10), color='black')
plt.xlabel('True Values')
plt.ylabel('Predictions');

# Red wine model evaluation

print('R-squared Score (test):', red_model.score(Xr_test, yr_test))
print('Coefficients:', linreg_red.coef_)
print('R-squared Score (train):', red_model.score(Xr_train, yr_train))
print('Intercept:', linreg_red.intercept_)
print('Mean Squared Error:', mean_squared_error(yr_test, red_target))
print('RMSE:', root_mean_squared_error(yr_test, red_target))
print('MAE:', mean_absolute_error(yr_test, red_target))

No ratings yet
Chapter 5 Final
29 pages
FYP Meeting 2
No ratings yet
FYP Meeting 2
19 pages
Lec 9 Linear Correlation and Linear Regression
No ratings yet
Lec 9 Linear Correlation and Linear Regression
71 pages
The Influence of Experiential Marketing and Location On Customer Loyalty (
No ratings yet
The Influence of Experiential Marketing and Location On Customer Loyalty (
12 pages
Introduction To Matlab and Numerical Method
No ratings yet
Introduction To Matlab and Numerical Method
4 pages
Workplace Harassment - 2
No ratings yet
Workplace Harassment - 2
12 pages
EC395 Lab 6
No ratings yet
EC395 Lab 6
4 pages
Mat102 - Statistics For Business - s1-2024-2025
No ratings yet
Mat102 - Statistics For Business - s1-2024-2025
14 pages
ML Lecture#02
No ratings yet
ML Lecture#02
20 pages
Assignment 2 Mechanical Vibration
No ratings yet
Assignment 2 Mechanical Vibration
44 pages
Eloho Dennis Project-1
No ratings yet
Eloho Dennis Project-1
26 pages
Data Analytics - TYBCS
No ratings yet
Data Analytics - TYBCS
6 pages
Genetically Modified and non-Genetically Modified Food Supply Chains: Co-Existence and Traceability
From Everand
Genetically Modified and non-Genetically Modified Food Supply Chains: Co-Existence and Traceability
Yves Bertheau
No ratings yet
Linear Regression with Multiple Covariates
From Everand
Linear Regression with Multiple Covariates
Brett Kottmann
No ratings yet



Figure 1. Machine learning project procedure flowchart

Data Loading and Preparation

RESULTS AND DISCUSSION

Figure 2. White (left) and red (right) wine variable distributions

# Import Python libraries

# Load csv file

# Split red and white wine data
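The loading and splitting steps above can be sketched as follows. This is a minimal illustration, not the report's actual code: the in-memory CSV, the `type` column used to tell red from white, and the other column names are all assumptions, since the original file and its layout are not shown here.

```python
import io

import pandas as pd

# Hypothetical stand-in for the combined wine CSV. The real file name and
# column names are not shown in the report, so these are assumptions.
csv_text = """type,alcohol,volatile acidity,quality
white,10.5,0.27,6
white,9.8,0.30,5
red,9.4,0.70,5
red,11.2,0.28,6
white,12.1,0.22,7
"""

# Load the csv file (read from a string here so the sketch is self-contained)
wine = pd.read_csv(io.StringIO(csv_text))

# Split red and white wine data on the assumed `type` column
white = wine[wine["type"] == "white"].reset_index(drop=True)
red = wine[wine["type"] == "red"].reset_index(drop=True)

print(f"white rows: {len(white)}, red rows: {len(red)}")
```

With a real file, `pd.read_csv` would take the file path instead of the `io.StringIO` wrapper.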

Exploratory Data Analysis

# Show information on datatypes, columns, and number of entries

# Show statistics on different columns

# Show distributions of other numerical variables

# Analyze correlation between X variables vs. quality
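A rough sketch of these exploratory steps is given below. It runs on synthetic data generated on the spot, so the column names, sample size, and the relationship between `alcohol` and `quality` are assumptions made only for illustration. The distribution plots are omitted to keep the example free of plotting dependencies.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the values and column names are assumptions.
rng = np.random.default_rng(7)
n = 200
alcohol = rng.normal(10.5, 1.0, n)
raw_quality = 0.8 * alcohol - 2.5 + rng.normal(0.0, 0.5, n)
df = pd.DataFrame({
    "alcohol": alcohol,
    "volatile acidity": rng.normal(0.3, 0.1, n),
    "quality": np.clip(np.round(raw_quality), 0, 10).astype(int),
})

# Show information on datatypes, columns, and number of entries
df.info()

# Show statistics on different columns
summary = df.describe()
print(summary)

# Analyze correlation between X variables vs. quality,
# ordered from strongest to weakest absolute correlation
corr_with_quality = (
    df.corr()["quality"]
    .drop("quality")
    .sort_values(key=abs, ascending=False)
)
print(corr_with_quality)
```

Sorting by absolute value keeps strongly negative predictors near the top, which plain descending order would push to the bottom.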

# Identify highly skewed variables, then normalize using log transformation

# Filter variables with significant correlation

# Split data into training and testing sets
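The three preparation steps above might look like the sketch below. The skewness cutoff of 1.0, the correlation cutoff of 0.3, the 80/20 split, and the use of `np.log1p` are all assumed values chosen for illustration; the report's actual thresholds and transformation details are not shown. The data is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; column names and relationships are assumptions.
rng = np.random.default_rng(0)
n = 500
alcohol = rng.normal(10.5, 1.0, n)            # roughly symmetric
residual_sugar = rng.lognormal(1.0, 1.0, n)   # strongly right-skewed
density = rng.normal(0.995, 0.003, n)         # unrelated to quality here
quality = 0.8 * alcohol + np.log1p(residual_sugar) + rng.normal(0.0, 0.5, n)
df = pd.DataFrame({
    "alcohol": alcohol,
    "residual sugar": residual_sugar,
    "density": density,
    "quality": quality,
})
features = ["alcohol", "residual sugar", "density"]

# Identify highly skewed variables, then normalize using log transformation
SKEW_THRESHOLD = 1.0  # assumed cutoff
skewness = df[features].skew()
skewed_cols = skewness[skewness.abs() > SKEW_THRESHOLD].index.tolist()
df[skewed_cols] = np.log1p(df[skewed_cols])  # log(1 + x) is safe at zero

# Filter variables with significant correlation to quality
CORR_THRESHOLD = 0.3  # assumed cutoff
corr = df[features + ["quality"]].corr()["quality"].drop("quality")
selected = corr[corr.abs() >= CORR_THRESHOLD].index.tolist()

# Split data into training and testing sets (80/20, fixed seed)
X_train, X_test, y_train, y_test = train_test_split(
    df[selected], df["quality"], test_size=0.2, random_state=42
)
print(skewed_cols, selected, X_train.shape, X_test.shape)
```

Applying the log transformation before measuring correlation matters here: Pearson correlation captures linear association, and the transformed variable is closer to linear in its relationship with the target.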

Multiple Linear Regression Model

# Creating linear regression model

# White wine model evaluation metrics
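As a hedged sketch of the model-fitting and evaluation steps, the example below fits scikit-learn's `LinearRegression` on synthetic data and reports MSE, RMSE, MAE, and R². Which metrics the report actually uses is not shown, so this selection is an assumption, as are the generated features and coefficients.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared white wine features and quality scores.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = 5.0 + 0.6 * X[:, 0] - 0.4 * X[:, 1] + rng.normal(scale=0.3, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# Creating linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Model evaluation metrics, computed on the held-out test set
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = float(np.sqrt(mse))
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
print("coefficients:", model.coef_, "intercept:", model.intercept_)
```

The same fit-and-evaluate sequence would apply to the red wine subset by swapping in its training and testing sets.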

# Red wine model

# Red wine model evaluation
