ML Project Report

Uploaded by

graphichub2304

Project Report

On
"Predicting Wine Quality Using
Wine Quality Dataset"
Submitted to Punjab Technical University, Jalandhar

In partial fulfillment of the requirements
for the degree of

Bachelor of Computer Applications


(Session 2021-2024)

Submitted To: Ms. Navkiran Gill

Submitted By: Saurav


PUNJAB COLLEGE OF TECHNICAL EDUCATION
BADDOWAL (LUDHIANA)

Declaration

I hereby declare that the work presented in the dissertation titled “Predicting
Wine Quality Using Wine Quality Dataset”, submitted in partial fulfillment of the
requirements for the degree of Bachelor of Computer Applications (BCA) to
Punjab College of Technical Education (PCTE), Baddowal (Ludhiana), affiliated to
PTU, Jalandhar, is an authentic record of my own work carried out in BCA under
the supervision of Ms. Navkiran Gill, PCTE, Ludhiana.

Saurav
Acknowledgement

At the very outset, I would like to thank almighty God for showering his
blessings & providing me with the courage, motivation & strength to complete
my project.

Every seminar work demands a lot of hard work, time, patience, and
concentration. While working on this seminar, apart from these aspects, I
have developed necessary skills and an attitude, which are always required
in a professional field. I am thankful to all those who helped me in completing
this project.

I express my deep sense of gratitude & indebtedness towards my respected
Project In-charge (Ms. Navkiran Gill) and the other faculty members of PCTE, from
whom I have learnt “Machine Learning”. Without their guidance I would have
found it difficult to undertake the project work. I would like to thank them
for the ever-available, unconditional help & guidance that they provided
throughout the project work.

I would also like to acknowledge the encouraging attitude of my friends &
other staff members of the P.C.T.E family, who helped me to complete the project
work.

Saurav
Table of Contents

Serial No    Content

1    Introduction
2    Objective and Dataset Overview
3    Approach
4    Libraries and Models/Classifiers Used
5    Main Code and Results
Introduction
In the realm of viticulture and oenology, the quest for producing exceptional wines is an art
intertwined with science. Winemakers continuously strive to understand and optimize the factors
influencing wine quality, encompassing a delicate balance of chemical composition and sensory
attributes. Leveraging the power of machine learning, we embark on a journey to decode this
complexity and develop a predictive model capable of assessing and forecasting wine quality
based on intrinsic characteristics.

The "Wine Quality Dataset" serves as our canvas, offering a comprehensive collection of data
points derived from the analysis of Portuguese "Vinho Verde" wines. This dataset encapsulates
key chemical properties and sensory features that contribute to the perceived quality of both red
and white wine variants. Each wine entry is accompanied by a quality rating ranging from 3 to 9,
representing an expert evaluation of its overall quality.

Objective and Dataset Overview
The primary goal of our machine learning project is twofold: to unravel the intricate relationships
between wine attributes and quality ratings, and to construct a robust predictive model capable of
generalizing these relationships to new observations. By accomplishing this, we aim to empower
winemakers and enthusiasts with actionable insights that can enhance decision-making processes
in viticulture and wine production.

Our dataset unveils a tapestry of chemical and physical characteristics inherent to each wine
sample. These attributes include levels of acidity, residual sugar, pH, alcohol content, and
more—factors known to influence the taste, aroma, and overall quality of wine. The challenge
lies in distilling this multifaceted dataset into meaningful patterns that underpin wine quality
assessments.
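
As a rough illustration of how such attributes might be inspected with pandas, the sketch below builds a small synthetic table and runs a few basic checks. The column names follow the attributes named above, but the values are randomly generated stand-ins, not real measurements, and the variable names are purely illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the wine table (assumption: the real data would be
# loaded from a file instead). Values are random, NOT real measurements.
rng = np.random.default_rng(0)
n = 200
wine = pd.DataFrame({
    "fixed acidity": rng.normal(7.2, 1.3, n),
    "residual sugar": rng.gamma(2.0, 2.5, n),
    "pH": rng.normal(3.2, 0.16, n),
    "alcohol": rng.normal(10.5, 1.2, n),
    "quality": rng.integers(3, 10, n),  # integer ratings from 3 to 9
})
# Blank out a few entries so the missing-value check has something to find.
wine.loc[[5, 17, 42], "pH"] = np.nan

shape = wine.shape                        # (rows, columns)
missing_per_column = wine.isnull().sum()  # count of missing values per column
summary = wine.describe()                 # mean, std, quartiles per column
quality_corr = wine.corr()["quality"].drop("quality")  # correlation with quality

print(shape)
print(missing_per_column)
print(summary.round(2))
print(quality_corr.round(3))
```

On real data, the same four checks give a first impression of table size, data gaps, feature ranges, and which attributes move together with the quality rating.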

Approach
1. Data Exploration and Preprocessing:
• We begin by examining the dataset's structure and dimensions using data
manipulation libraries such as pandas. This step involves gaining insight into
feature distributions, identifying missing values, and assessing the need for
preprocessing.
• Data preprocessing encompasses tasks such as scaling numerical features to a
uniform range, encoding categorical variables, and, where needed, addressing
outliers or skewed distributions.
2. Feature Engineering and Selection:
• The next phase is feature engineering, where we extract useful information
from existing attributes or derive new features that capture further aspects of
wine quality.
• Feature selection techniques may be employed to identify the most influential
predictors, reducing model complexity while preserving predictive power.
3. Model Development and Evaluation:
• With a curated dataset in hand, we construct predictive models using a range
of machine learning algorithms. Potential candidates include regression-based
approaches such as linear regression, decision trees, ensemble methods (e.g.,
random forests), and more advanced techniques such as support vector machines
(SVM) and gradient boosting (e.g., XGBoost).
• The performance of these models is evaluated using appropriate metrics such
as mean squared error (MSE) and R-squared for regression, or classification
accuracy for discretized quality ratings.
4. Model Fine-Tuning and Validation:
• To optimize model performance and mitigate overfitting, we perform
hyperparameter tuning using techniques such as grid search or randomized search.
This iterative process selects the model configurations that generalize best to
unseen data.
• Validation procedures such as cross-validation ensure the reliability of the
resulting performance estimates.
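
The steps above can be sketched end to end with scikit-learn. This is a minimal illustration under stated assumptions, not the report's actual code: the data is generated with make_classification as a stand-in for a binarized quality label, and the choice of SVC, the parameter grid, and the 80/20 split are all illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 6 numeric features, binary label.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=1, random_state=0
)

# Hold out a test set that is never seen during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is re-fitted on each CV training fold.
pipeline = Pipeline([("scale", StandardScaler()), ("model", SVC())])

# Illustrative hyperparameter grid; the values are arbitrary examples.
param_grid = {"model__C": [0.1, 1.0, 10.0], "model__kernel": ["linear", "rbf"]}

# Grid search with 5-fold cross-validation on the training set only.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
cv_accuracy = search.best_score_               # mean CV accuracy of best config
test_accuracy = search.score(X_test, y_test)   # accuracy on held-out data

print(best_params)
print(round(cv_accuracy, 3), round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline is a deliberate choice here: it avoids leaking statistics from the validation folds into the preprocessing step during cross-validation.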

Libraries and Models/Classifiers Used

1. Libraries Used:
• numpy (imported as np): A library for numerical operations in Python.
• pandas (imported as pd): A library for data manipulation and analysis.
• matplotlib.pyplot (imported as plt): A library for creating visualizations
such as plots and charts.
• seaborn (imported as sb): A library built on top of matplotlib for creating
attractive statistical graphics.
Total libraries: 4

2. Models/Classifiers:
• sklearn.svm.SVC: Support Vector Classifier (SVC) from scikit-learn, used for
support vector machine classification.
• xgboost.XGBClassifier: XGBoost Classifier from the XGBoost library, a popular
gradient boosting algorithm.
• sklearn.linear_model.LogisticRegression: Logistic Regression model from
scikit-learn, used for binary classification tasks.
Total models/classifiers: 3
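
A minimal sketch of how the three classifiers listed above might be trained and compared is given below. It is an illustration only, not the report's own code: the features and the binary target are synthetic stand-ins (for example, for a "good" versus "not good" quality label), the hyperparameters are arbitrary, and XGBClassifier is included only if the xgboost package happens to be installed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 5 numeric features and a binary
# target that depends on the first two features plus noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training data only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "SVC": SVC(kernel="rbf"),
}
try:
    # Optional dependency: skipped silently if xgboost is not installed.
    from xgboost import XGBClassifier
    models["XGBClassifier"] = XGBClassifier(n_estimators=100)
except ImportError:
    pass

# Train each model and record its accuracy on the held-out test split.
scores = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test_s))

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Using one shared train/test split and one fitted scaler for all models keeps the comparison like for like.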

Main Code and Results
