0% found this document useful (0 votes)

10 views7 pages

Lab Report 10 FDS

The lab report details a comprehensive machine learning workflow applied to the Iris Dataset, focusing on data loading, preprocessing, model training, evaluation, and prediction using Python and libraries like Scikit-learn and Pandas. The Random Forest classifier achieved an impressive accuracy of 95%, with high precision and recall across all flower species, demonstrating its effectiveness in classification tasks. Future work includes hyperparameter tuning, model comparison, and cross-validation to enhance model performance.

Uploaded by

mukeshreddy6766

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

10 views7 pages

Lab Report 10 FDS

Uploaded by

mukeshreddy6766

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 7

Student Name J.

Mukesh Reddy
Student Registration Number 231U1R1120 Class &Section: CSE & C
Study Level : UG/PG UG Year &Term: II & III
Subject Name Foundations of Data Science

Name of the Assessment

Lab Report 10
Date of Submission 10/05/2025
Lab Report 10: Case Study on Load, Pre-process, Split, Train, Evaluate, and Predict a
Dataset

Objective:

The objective of this lab report is to apply a comprehensive machine learning workflow that
includes data loading, preprocessing, splitting, model training, evaluation, and prediction.
The dataset chosen for this study is the well-known Iris Dataset, commonly used for
classification tasks. This case study focuses on performing these steps using Python, with the
help of machine learning libraries like Scikit-learn and Pandas.

Dataset Chosen:

The Iris Dataset is a classical dataset in machine learning, containing 150 instances of iris
flowers, with four features describing the physical attributes of the flowers: sepal length,
sepal width, petal length, and petal width. The target variable represents the species of the
flower, with three possible categories: Setosa, Versicolor, and Virginica.

Tools and Libraries:

The following tools and libraries were used in this case study:

 Python: The primary programming language.

 Pandas: For data manipulation and analysis.
 Scikit-learn: For machine learning algorithms and evaluation.
 Matplotlib: For visualizing data (if required).

1. Load the Dataset:

The first step in the machine learning workflow is loading the dataset. In this case, the Iris
Dataset was loaded directly from Scikit-learn's built-in library, which simplifies the process
of data acquisition. The dataset was separated into two parts: features (X) and target labels
(y), with the features containing the physical measurements of the flowers and the target
labels representing the species.

2. Pre-process the Data:

Before feeding the data into the model, preprocessing was necessary to ensure that the data is
clean, consistent, and suitable for training. In this step, we checked for any missing values in
the dataset, which was not an issue in the Iris dataset. The data was then standardized using
normalization techniques. This step ensures that all features have the same scale, which is
important for many machine learning algorithms, including distance-based models.

3. Split the Data:

After preprocessing, the dataset was split into training and test sets. This was done to ensure
that the model could be trained on one portion of the data and tested on another, which helps
evaluate its performance on unseen data. The data was split in an 80-20 ratio, meaning 80%
of the data was used for training, and 20% was used for testing the model’s performance.

4. Train the Model:

For this case study, we selected a Random Forest classifier as our model. Random Forest is
an ensemble learning method that combines multiple decision trees to improve accuracy and
reduce overfitting. The model was trained on the training set, learning to predict the species
of flowers based on the four feature measurements.

5. Evaluate the Model:

Once the model was trained, we evaluated its performance on the test set. The evaluation was
done using multiple metrics:

 Accuracy: This indicates the overall correctness of the model’s predictions.

 Confusion Matrix: A confusion matrix was used to visualize the performance of the
model in predicting the correct class for each instance.
 Classification Report: This provided precision, recall, and F1-score metrics for each
class, giving a detailed view of how well the model performed for each flower
species.

The model achieved an accuracy of 95%, indicating that it correctly predicted the species of
flowers most of the time. The confusion matrix showed that the model made very few
misclassifications, with the most significant errors occurring between the Versicolor and
Virginica species. The classification report confirmed that the model had high precision and
recall across all classes, with the overall performance being satisfactory.

6. Make Predictions:

To demonstrate the model’s ability to make predictions on new, unseen data, a prediction was
made for a hypothetical flower with specific feature measurements (sepal length, sepal width,
petal length, petal width). The model successfully predicted that the flower belonged to the
Versicolor species.

Results:

1. Accuracy: The model achieved an impressive accuracy of 95% on the test dataset.
2. Confusion Matrix: The confusion matrix showed a strong performance, with only a
few misclassifications between species, particularly Versicolor and Virginica.
3. Classification Report: The report indicated high precision and recall values for all
three species (Setosa, Versicolor, and Virginica), suggesting the model performs well
in distinguishing between the species.
4. Prediction: The model successfully predicted the species of a new flower sample,
showing that it can generalize its learning to new instances.

Conclusion:

This lab report successfully demonstrated the key steps in a machine learning workflow:

 Data loading: The Iris dataset was loaded and prepared for modeling.
 Pre-processing: Data was standardized to ensure consistency across features.
 Model training: A Random Forest classifier was trained on the dataset.
 Model evaluation: The model was evaluated using accuracy, confusion matrix, and
classification report, showing a high performance.
 Prediction: The model was able to predict the species of a new, unseen flower based
on its features.

The results indicate that the Random Forest classifier is an effective model for classifying iris
species. The accuracy and evaluation metrics suggest that the model performs well, making it
suitable for similar classification tasks.

Future Work:

 Hyperparameter Tuning: To improve model performance further, hyperparameter

tuning could be performed using techniques like grid search or random search.
 Model Comparison: Other classifiers, such as Support Vector Machines or k-Nearest
Neighbors, could be tested to see if they offer better performance.
 Cross-Validation: Cross-validation could be used to better assess the model’s
performance and ensure it is not overfitting to the training data.

This case study offers a strong foundation for applying machine learning models to
classification problems and can be adapted to other datasets with similar tasks.

Oracle DB Basic Commands
75% (4)
Oracle DB Basic Commands
1 page
Attiq Ahmad Afsar MLAssignment 3 Flask
No ratings yet
Attiq Ahmad Afsar MLAssignment 3 Flask
9 pages
JAYESH BANSAL - FinalProjectReport - Jayesh Bansal
No ratings yet
JAYESH BANSAL - FinalProjectReport - Jayesh Bansal
38 pages
SUMITs MINOR REPORT
No ratings yet
SUMITs MINOR REPORT
16 pages
Iris Flower Classification Final
No ratings yet
Iris Flower Classification Final
15 pages
Lab 6
No ratings yet
Lab 6
4 pages
Iris Classification
No ratings yet
Iris Classification
6 pages
Iris Flower Classification
No ratings yet
Iris Flower Classification
3 pages
ST1 4483 8995 Capstone PPT Template
No ratings yet
ST1 4483 8995 Capstone PPT Template
10 pages
22BCS14374 - Sanya - Singh - Assignment 2
No ratings yet
22BCS14374 - Sanya - Singh - Assignment 2
8 pages
BT-2016 SEM-IV Project Report (Review 1)
No ratings yet
BT-2016 SEM-IV Project Report (Review 1)
42 pages
Iris Flower Classification Project
100% (1)
Iris Flower Classification Project
14 pages
ML Lecture 10 Project
No ratings yet
ML Lecture 10 Project
20 pages
AML Lab3 2021wb15156
No ratings yet
AML Lab3 2021wb15156
13 pages
Practical File DL
No ratings yet
Practical File DL
14 pages
Major Project (Kartik Joshi)
No ratings yet
Major Project (Kartik Joshi)
4 pages
ML 10
No ratings yet
ML 10
3 pages
Nomlab 14 Ai
No ratings yet
Nomlab 14 Ai
3 pages
Untitled Document
No ratings yet
Untitled Document
2 pages
Machine Learning Project
No ratings yet
Machine Learning Project
9 pages
PS1 Ramos 2B
No ratings yet
PS1 Ramos 2B
2 pages
Classification of Iris Flower Species Updated
100% (1)
Classification of Iris Flower Species Updated
5 pages
Data Science Project
No ratings yet
Data Science Project
3 pages
Flower Project DL
No ratings yet
Flower Project DL
7 pages
Machine Learning: Lecture 7: Create Your First Project
No ratings yet
Machine Learning: Lecture 7: Create Your First Project
17 pages
Module 4 - Supervised Learning - First ML Model
No ratings yet
Module 4 - Supervised Learning - First ML Model
23 pages
Iris Dataset Project Report - Compress
No ratings yet
Iris Dataset Project Report - Compress
16 pages
61 JBS1753
No ratings yet
61 JBS1753
13 pages
Assignment 3 1
No ratings yet
Assignment 3 1
3 pages
Task 1 Iris Flower Classification Using Machine Learning
No ratings yet
Task 1 Iris Flower Classification Using Machine Learning
10 pages
An Approach Based Iris Flower Species Recognition Using Machine Learning Classifiers
No ratings yet
An Approach Based Iris Flower Species Recognition Using Machine Learning Classifiers
7 pages
Assignment 4 R Program1
No ratings yet
Assignment 4 R Program1
11 pages
3 Text
No ratings yet
3 Text
2 pages
ML 10
No ratings yet
ML 10
3 pages
ML Lab1 PGM
No ratings yet
ML Lab1 PGM
4 pages
Shelly
No ratings yet
Shelly
15 pages
Wa0001
No ratings yet
Wa0001
39 pages
Understanding-Code-for A-Classifier
No ratings yet
Understanding-Code-for A-Classifier
15 pages
Animal Species Prediction Using Machine Learning
No ratings yet
Animal Species Prediction Using Machine Learning
10 pages
Project Template
No ratings yet
Project Template
15 pages
Types of ML Systems
No ratings yet
Types of ML Systems
5 pages
Iris Flower Classification Project
No ratings yet
Iris Flower Classification Project
9 pages
Lab 3-ML
No ratings yet
Lab 3-ML
8 pages
Report
No ratings yet
Report
2 pages
Pratique Work 3:data Preprocessing
No ratings yet
Pratique Work 3:data Preprocessing
7 pages
PR
No ratings yet
PR
17 pages
Shelly Mehndiratta IrisFlowerClassification
No ratings yet
Shelly Mehndiratta IrisFlowerClassification
15 pages
Iris Flower Classification
No ratings yet
Iris Flower Classification
47 pages
Sameed Ahmed Khan Tools For Artificial Neural Network and Machine Learning
No ratings yet
Sameed Ahmed Khan Tools For Artificial Neural Network and Machine Learning
14 pages
Assignment 1
No ratings yet
Assignment 1
12 pages
Fo DS
No ratings yet
Fo DS
9 pages
DATA MINING and MACHINE LEARNING. PREDICTIVE TECHNIQUES: ENSEMBLE METHODS, BOOSTING, BAGGING, RANDOM FOREST, DECISION TREES and REGRESSION TREES.: Examples with MATLAB
From Everand
DATA MINING and MACHINE LEARNING. PREDICTIVE TECHNIQUES: ENSEMBLE METHODS, BOOSTING, BAGGING, RANDOM FOREST, DECISION TREES and REGRESSION TREES.: Examples with MATLAB
César Pérez López
No ratings yet
FI Proj
No ratings yet
FI Proj
17 pages
Experiments
No ratings yet
Experiments
7 pages
Animal Species Prediction Using Machine Learning
No ratings yet
Animal Species Prediction Using Machine Learning
10 pages
Iris Classification
No ratings yet
Iris Classification
8 pages
Bagging, Random Forest, Gradient Boost, AdaBoost & PCA
No ratings yet
Bagging, Random Forest, Gradient Boost, AdaBoost & PCA
8 pages
Adaptive Linear Neuron
No ratings yet
Adaptive Linear Neuron
11 pages
Assessment 1
No ratings yet
Assessment 1
4 pages
8824 Shivam Darekar Report - 8824 Shivam Darekar
No ratings yet
8824 Shivam Darekar Report - 8824 Shivam Darekar
7 pages
24CSR1R01 DSF Assignment 2
No ratings yet
24CSR1R01 DSF Assignment 2
9 pages
Target Appraisal: Case: Dr. Reddy Laboratories (A) & (B)
No ratings yet
Target Appraisal: Case: Dr. Reddy Laboratories (A) & (B)
45 pages
Lab 1
No ratings yet
Lab 1
12 pages
Introduction To TikTok Shop Affiliate Program
No ratings yet
Introduction To TikTok Shop Affiliate Program
10 pages
Hornady 2017 Product Catalog
No ratings yet
Hornady 2017 Product Catalog
132 pages
3.2. Perspectives On Listening Ho
No ratings yet
3.2. Perspectives On Listening Ho
35 pages
English 11-Grade 2022-2023
No ratings yet
English 11-Grade 2022-2023
1 page
Worksheet KTSP - Kelas 7
No ratings yet
Worksheet KTSP - Kelas 7
31 pages
Mycbseguide: Class 12 - Accountancy Sample Paper 07
No ratings yet
Mycbseguide: Class 12 - Accountancy Sample Paper 07
15 pages
Kafd A1 111 Comn BF1 XXXXX SHP Arc Asb 00023
No ratings yet
Kafd A1 111 Comn BF1 XXXXX SHP Arc Asb 00023
1 page
03 Clerk Post English Grammer
No ratings yet
03 Clerk Post English Grammer
166 pages
CTPAT Job Aid - Personnel Training Checklist Sample - October 2021
No ratings yet
CTPAT Job Aid - Personnel Training Checklist Sample - October 2021
4 pages
Collab Report Merged
No ratings yet
Collab Report Merged
55 pages
RSPile Tutorials - 1 - Axially Loaded Piles
No ratings yet
RSPile Tutorials - 1 - Axially Loaded Piles
14 pages
Job Portal
82% (11)
Job Portal
17 pages
Eggd Parking
No ratings yet
Eggd Parking
1 page
JD Science Physic Teacher
No ratings yet
JD Science Physic Teacher
4 pages
Urooj
No ratings yet
Urooj
1 page
Quiksam PDF
No ratings yet
Quiksam PDF
6 pages
Inputs and Outputs List Page:1/21: Example-9: Sequential Control of Induction Motors
No ratings yet
Inputs and Outputs List Page:1/21: Example-9: Sequential Control of Induction Motors
7 pages
PL01ELBL53 Corporate Finance-I
No ratings yet
PL01ELBL53 Corporate Finance-I
3 pages
BDA Lab Manual R22
0% (1)
BDA Lab Manual R22
70 pages
Kurmanji Complete
100% (2)
Kurmanji Complete
217 pages
CTSD-Lab Mannual Final - 241204 - 102238
No ratings yet
CTSD-Lab Mannual Final - 241204 - 102238
54 pages
SSP Cakram5 6
No ratings yet
SSP Cakram5 6
420 pages
Victoria Code of Practice For Using Concrete Pump
0% (1)
Victoria Code of Practice For Using Concrete Pump
56 pages
Water Ingress Analysis and Splash Protection Evaluation For Vehicle Wading Using Non-Classical CFD Simulation
No ratings yet
Water Ingress Analysis and Splash Protection Evaluation For Vehicle Wading Using Non-Classical CFD Simulation
13 pages
LIBRARY MANAGEMENT SYSTEM - Final
100% (1)
LIBRARY MANAGEMENT SYSTEM - Final
25 pages
DCK-datacenter Strategies PDF
No ratings yet
DCK-datacenter Strategies PDF
26 pages
Chemistry Class 10
No ratings yet
Chemistry Class 10
8 pages

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

Lab Report 10 FDS

Uploaded by

Lab Report 10 FDS

Uploaded by

Student Name J.

Name of the Assessment

Tools and Libraries:

 Python: The primary programming language.

1. Load the Dataset:

2. Pre-process the Data:

3. Split the Data:

4. Train the Model:

5. Evaluate the Model:

 Accuracy: This indicates the overall correctness of the model’s predictions.

 Hyperparameter Tuning: To improve model performance further, hyperparameter

You might also like

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.