ML Fat

The document outlines the FAT exam for the course 'Machine Learning for Data Science (LAB)', detailing the steps taken to preprocess a dataset, divide it into training, validation, and test sets, and apply machine learning models including Random Forest and an Artificial Neural Network (ANN). It includes code snippets for data handling, model training, hyperparameter tuning, and performance evaluation using accuracy scores and confusion matrices. The document emphasizes the importance of model validation and performance metrics in machine learning tasks.

Uploaded by

Shiny Sundarmoorthy


Winter semester 23-24

Course code MDI4001

Course name Machine Learning for Data Science (LAB)

Submitted to:
Jyotismita Chaki
Jyotismita@vit.ac.in

FAT Exam

Submitted by :
Shiny. S (21MID0079)
Shiny.2021@vitstudent.ac.in

Date: 29 April 2024

a) Performing the preprocessing steps in the given dataset

CODE:

import pandas as pd

import numpy as np

data = pd.read_csv("agriculture_dataset.csv")

data.head()

data.info()

data.isnull().sum()

data.describe()

SCREENSHOT :
The dataset contains no null values, so no further preprocessing steps are needed.
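
The null check above is one of several quick data-quality checks. As a minimal sketch, the helper below also counts duplicate rows and lists non-numeric columns. The function name `summarize_quality`, the toy rows, and the feature column names `Soil pH` and `Rainfall` are assumptions made for illustration; only `Plant type` appears in the report.

```python
import pandas as pd


def summarize_quality(df: pd.DataFrame) -> dict:
    """Collect a few basic data-quality indicators for a DataFrame."""
    return {
        "missing_values": int(df.isnull().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
        "non_numeric_columns": df.select_dtypes(exclude="number").columns.tolist(),
    }


# Hypothetical toy rows standing in for agriculture_dataset.csv
toy = pd.DataFrame({
    "Soil pH": [6.5, 7.1, 6.5, 5.9],
    "Rainfall": [120.0, 95.0, 120.0, 140.0],
    "Plant type": ["wheat", "rice", "wheat", "maize"],
})

report = summarize_quality(toy)
print(report)
```

If duplicate rows or non-numeric feature columns show up, steps such as `drop_duplicates()` or encoding may be worth considering before training.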

b. Divide the dataset into train, validation, and test sets.

CODE :

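from sklearn.model_selection import train_test_split

# Features: the first six columns; target: the 'Plant type' column
X = data.iloc[:, 0:6]

y = data['Plant type']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)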

SCREENSHOT:
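
The task asks for train, validation, and test sets, while the split above produces two parts. One common way to get three parts is to call `train_test_split` twice. The sketch below assumes a 60/20/20 split and uses randomly generated stand-in arrays (`X_demo`, `y_demo`) instead of the real dataset; neither the ratio nor the data comes from the report.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 100 samples, 6 features, 3 classes
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(100, 6))
y_demo = rng.integers(0, 3, size=100)

# Step 1: hold out 20% of all rows as the test set
X_temp, X_test, y_temp, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Step 2: take 25% of the remaining 80% as validation (0.25 * 0.8 = 0.2 overall)
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=42
)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```

With this layout the validation set guides model selection and the test set is touched only once, for the final score.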
c) Use a suitable hyperparameter-tuned ML model to train the dataset.

A Random Forest classifier is a suitable model for this dataset. It is first trained with fixed settings and then tuned with a randomized search.

CODE :

from sklearn.ensemble import RandomForestClassifier

rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)

rf_classifier.fit(X_train, y_train)

y_pred = rf_classifier.predict(X_test)

SCREENSHOT:

d. After training, validate it and test the model’s performance

CODE:

from sklearn.metrics import accuracy_score

model = RandomForestClassifier(random_state=1, max_depth=10)

model.fit(X_train, y_train)

pred_train = model.predict(X_train)

train_score = accuracy_score(y_train,pred_train)

print('train_accuracy_score',train_score)

pred_val = model.predict(X_test)

val_score = accuracy_score(y_test,pred_val)

print('val_accuracy_score',val_score)
SCREENSHOT:

Tuning the hyperparameters for better validation accuracy:

CODE:

from sklearn.ensemble import RandomForestClassifier

from sklearn.metrics import (accuracy_score, confusion_matrix, precision_score,
                             recall_score, ConfusionMatrixDisplay)

from sklearn.model_selection import RandomizedSearchCV

from scipy.stats import randint

# Sample n_estimators from [50, 500) and max_depth from [1, 20)
param_dist = {'n_estimators': randint(50, 500), 'max_depth': randint(1, 20)}

rf = RandomForestClassifier()

# Try 5 random parameter combinations, each scored with 5-fold cross-validation
rand_search = RandomizedSearchCV(rf, param_distributions=param_dist, n_iter=5, cv=5)

rand_search.fit(X_train, y_train)

# Create a variable for the best model

best_rf = rand_search.best_estimator_

# Print the best hyperparameters

print('Best hyperparameters:', rand_search.best_params_)

# Generate predictions with the best model

pred_train = best_rf.predict(X_train)

train_score = accuracy_score(y_train,pred_train)

print('train_accuracy_score',train_score)

pred_val = best_rf.predict(X_test)

val_score = accuracy_score(y_test,pred_val)
print('val_accuracy_score',val_score)

SCREENSHOT:
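
The search above is not seeded, so repeated runs may pick different hyperparameters. The following sketch shows one way to make the search repeatable by passing `random_state`, and to read the cross-validated score from `best_score_`. The synthetic data from `make_classification` and the smaller parameter ranges are assumptions chosen to keep the example fast; they are not the settings used in the report.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Hypothetical stand-in data: 200 samples, 6 features, 2 classes
X_demo, y_demo = make_classification(
    n_samples=200, n_features=6, n_informative=4, random_state=0
)

# Smaller ranges than in the report, only to keep the sketch quick to run
param_dist = {"n_estimators": randint(10, 50), "max_depth": randint(1, 10)}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=3,        # number of sampled parameter combinations
    cv=3,            # folds used to score each combination
    random_state=0,  # seeds the parameter sampling so reruns match
)
search.fit(X_demo, y_demo)

print("Best hyperparameters:", search.best_params_)
print("Mean cross-validated accuracy:", search.best_score_)
```

Because `best_score_` is averaged over the cross-validation folds of the training data, it can serve as the validation figure, leaving the held-out split for a final test score.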

Creating the confusion matrix:

CODE:

cm = confusion_matrix(y_test,pred_val)

ConfusionMatrixDisplay(confusion_matrix=cm).plot()

SCREENSHOT:
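
`precision_score` and `recall_score` are imported in the tuning code but never called. As a small sketch of how they relate to the confusion matrix, the example below computes macro-averaged precision and recall on hand-written labels. The label values and the choice of macro averaging are assumptions for illustration, not results from the report.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical true and predicted labels for six samples
y_true = ["rice", "rice", "wheat", "wheat", "maize", "maize"]
y_hat = ["rice", "wheat", "wheat", "wheat", "maize", "rice"]

labels = ["maize", "rice", "wheat"]

# Rows are true classes, columns are predicted classes, in the order of `labels`
cm = confusion_matrix(y_true, y_hat, labels=labels)

# Macro averaging: compute the metric per class, then take the unweighted mean
precision = precision_score(y_true, y_hat, labels=labels, average="macro")
recall = recall_score(y_true, y_hat, labels=labels, average="macro")

print(cm)
print(f"macro precision: {precision:.3f}, macro recall: {recall:.3f}")
```

Each diagonal entry of the matrix divided by its column sum gives that class's precision, and divided by its row sum gives its recall.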
a) Perform the pre-processing steps if needed. If the pre-processing steps are not needed

CODE:

import numpy as np

import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

from keras.models import Sequential

from keras.layers import Dense

data = pd.read_csv('University_dataset.csv')

data.head()

data.isnull().sum()

SCREENSHOT:
b) Divide the dataset into train, validation, and test sets.

CODE:

X = data.iloc[:, 1:6].values

y = data.iloc[:, 6].values

SCREENSHOT:

c) Can we use an ANN to train the dataset? If yes, then create an ANN and train and validate the
model by using the dataset and write a discussion on the performance of the model on the answer
booklet given. If no, then write your justification on the answer booklet given.

CODE:

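# Split first so that the scaler is fitted on the training data only

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()

X_train = scaler.fit_transform(X_train)

X_test = scaler.transform(X_test)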

# Build the ANN

model = Sequential()

model.add(Dense(128, input_dim=5, activation='relu'))

model.add(Dense(64, activation='relu'))

model.add(Dense(1, activation='linear'))

# Compile the model

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mean_absolute_error'])

# Train the model

model.fit(X_train, y_train, epochs=100, batch_size=32, validation_data=(X_test, y_test))

# Evaluate the model

loss, mae = model.evaluate(X_test, y_test)

print(f'Loss: {loss}, Mean Absolute Error: {mae}')

# Make predictions

predictions = model.predict(X_test)
SCREENSHOT:
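
Since the network has a linear output and is trained with mean squared error, it is a regression model, so its predictions can be summarized with regression metrics. The sketch below computes the mean absolute error and the R² score with scikit-learn. The two arrays are made-up values standing in for `y_test` and `predictions`; they are not outputs of the model in the report.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

# Hypothetical targets and predictions; a Keras model with one output unit
# returns predictions with shape (n_samples, 1)
y_true_demo = np.array([0.70, 0.80, 0.90, 0.60])
pred_demo = np.array([[0.65], [0.85], [0.80], [0.60]])

# Flatten the (n_samples, 1) predictions to match the 1-D target array
pred_flat = pred_demo.ravel()

mae_demo = mean_absolute_error(y_true_demo, pred_flat)
r2_demo = r2_score(y_true_demo, pred_flat)

print(f"MAE: {mae_demo:.3f}, R^2: {r2_demo:.3f}")
```

An R² close to 1 means the predictions explain most of the variance in the target, while the MAE reports the typical error in the target's own units.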
