9 A.validation Methods - Jupyter Notebook

The notebook loads the Pima Indians diabetes dataset, splits it into training and test sets, fits a logistic regression model on the training set, and evaluates it on the test set. It then applies three cross-validation schemes (K-fold, leave-one-out, and repeated K-fold) and reports the mean and standard deviation of the accuracy for each.

Uploaded by

venkatesh m


In [1]: import numpy as np

import matplotlib.pyplot as plt
import pandas as pd

In [2]: data = pd.read_csv("D:\\Course\\Python\\Datasets\\pima-indians-diabetes.csv")

In [3]: data

...

In [5]: # Divide Data into X and Y

array = data.values
X = array[:,0:8]
Y = array[:,8]

Hold Out Validation

In [6]: from sklearn.model_selection import train_test_split

from sklearn.linear_model import LogisticRegression

In [7]: # Split the data

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33, random_

In [8]: model = LogisticRegression()

In [9]: model.fit(X_train, Y_train)

result = model.score(X_test, Y_test)

C:\Users\rgandyala\Anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py:763: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
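The warning above names two remedies: raise `max_iter` or scale the features. A minimal sketch of both follows, assuming synthetic stand-in data from `make_classification` (the notebook's CSV is not available here) and arbitrary choices of `max_iter=1000` and `random_state=0`. The names `X_demo`, `y_demo`, and `scaled_model` are illustrative and do not appear in the notebook.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 8 features, binary target.
X_demo, y_demo = make_classification(n_samples=500, n_features=8, random_state=0)
# Put the columns on very different scales, the situation that tends to slow lbfgs.
X_demo = X_demo * [1, 10, 100, 1000, 1, 10, 100, 1000]

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=0
)

# Standardize inside a pipeline so the scaler is fitted on training data only,
# and give the solver a larger iteration budget.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    scaled_model.fit(X_tr, y_tr)

convergence_warnings = [
    w for w in caught if issubclass(w.category, ConvergenceWarning)
]
test_accuracy = scaled_model.score(X_te, y_te)
print("convergence warnings:", len(convergence_warnings))
print("test accuracy:", test_accuracy)
```

Because the scaler lives inside the pipeline, the same object can be passed to `cross_val_score`, in which case the scaling is refitted on each training fold rather than on the full dataset.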

In [10]: # Predicting the Test set results

y_pred = model.predict(X_test)

In [11]: # Checking accuracy score

from sklearn.metrics import accuracy_score
accuracy_score(Y_test, y_pred)

Out[11]: 0.7716535433070866
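To make the hold-out procedure concrete, here is a minimal standard-library sketch of what a shuffled split and an accuracy score compute. It is an illustration under stated assumptions, not scikit-learn's implementation: `hold_out_split` and `accuracy` are hypothetical helpers, the sample count of 100 and the seed are arbitrary, and rounding the test share up is a choice made for this sketch.

```python
import math
import random


def hold_out_split(n_samples, test_size, seed):
    """Return (train_indices, test_indices) for a shuffled hold-out split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(test_size * n_samples)  # sketch choice: round the test share up
    return indices[n_test:], indices[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


train_idx, test_idx = hold_out_split(n_samples=100, test_size=0.33, seed=0)
print(len(train_idx), len(test_idx))

# Three of the four toy predictions match, so the accuracy is 0.75.
toy_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(toy_accuracy)
```

A single hold-out score depends on which rows land in the test set, which is the motivation for the cross-validation schemes that follow.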

K Fold Cross Validation

In [12]: from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

In [14]: # Initialize parameters

num_folds = 10
kfold = KFold(n_splits=num_folds)
model1 = LogisticRegression()

In [15]: # Fitting the model and Extracting the results

results1 = cross_val_score(model1, X, Y, cv=kfold)

...

In [16]: results1

...

In [17]: print(results1.mean()*100.0, results1.std()*100.0)

...
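The partitioning behind K-fold can be sketched in plain Python. The helper below is hypothetical (`k_fold_indices` is not a scikit-learn function); it assumes contiguous, unshuffled folds, which is how `KFold` behaves when `shuffle` is left at its default, with the first `n_samples % n_splits` folds receiving one extra sample.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds take one additional sample each.
        size = base + 1 if fold < extra else base
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, n_splits=3))
for train, test in folds:
    print("test:", test)
# test: [0, 1, 2, 3]
# test: [4, 5, 6]
# test: [7, 8, 9]
```

Every sample is used for testing exactly once, so the mean of the per-fold scores summarizes performance over the whole dataset, and their standard deviation indicates how much the score moves between folds.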

Leave One Out Cross Validation

In [18]: from sklearn.model_selection import LeaveOneOut
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

In [19]: # Initialize parameters

loocv = LeaveOneOut()
model2 = LogisticRegression()

In [20]: # Fitting the model and Extracting the results

results2 = cross_val_score(model2, X, Y, cv=loocv)

...

In [21]: print(results2.mean()*100.0, results2.std()*100.0)

77.05345501955672 42.04890690023727
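The large standard deviation here is a property of leave-one-out, not a sign of an unstable model: each fold tests a single sample, so every score is either 0 or 1, and the spread of such scores is sqrt(p(1 - p)) for mean accuracy p. The sketch below checks this with the standard library. The counts 591 correct out of 767 are an assumption inferred because they reproduce the printed mean; they are not read from the notebook.

```python
import math
import statistics

# Assumption: 591 correct predictions out of 767 held-out samples, chosen
# because these counts reproduce the mean printed above.
n_correct, n_total = 591, 767
scores = [1.0] * n_correct + [0.0] * (n_total - n_correct)

mean_acc = statistics.fmean(scores)
std_acc = statistics.pstdev(scores)  # population std, like NumPy's default std()
bernoulli_std = math.sqrt(mean_acc * (1 - mean_acc))

print(mean_acc * 100.0, std_acc * 100.0, bernoulli_std * 100.0)
```

Both standard deviations come out at about 42.05 percent, matching the printed value, so for leave-one-out the mean is the informative number and the standard deviation adds nothing beyond it.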

Repeated K Fold Cross Validation

In [22]: from sklearn.model_selection import RepeatedKFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

In [23]: # Initialize parameters

n_splits = 10
kfold3 = RepeatedKFold(n_splits=n_splits, n_repeats=2)
model3 = LogisticRegression()

In [24]: # Fitting the model and Extracting the results

results3 = cross_val_score(model3, X, Y, cv=kfold3)

...

In [26]: # Check the Accuracy

print("Accuracy: ", results3*100.0)

Accuracy:  [75.32467532 72.72727273 72.72727273 76.62337662 80.51948052 79.22077922
 75.32467532 86.84210526 67.10526316 80.26315789 81.81818182 77.92207792
 80.51948052 76.62337662 76.62337662 80.51948052 75.32467532 67.10526316
 85.52631579 77.63157895]

In [27]: print(results3.mean()*100.0, results3.std()*100.0)

77.31459330143541 4.929298671034974
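Repeated K-fold reshuffles the data and reruns K-fold several times, which is why 10 splits with 2 repeats produce the 20 scores printed above. The following is a plain-Python sketch of that idea, not scikit-learn's implementation: `repeated_k_fold_indices` is a hypothetical helper, and the sample count of 25 and the seed are arbitrary.

```python
import random


def repeated_k_fold_indices(n_samples, n_splits, n_repeats, seed):
    """Yield (train, test) index lists; each repetition reshuffles before splitting."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        base, extra = divmod(n_samples, n_splits)
        start = 0
        for fold in range(n_splits):
            size = base + 1 if fold < extra else base
            test = indices[start:start + size]
            train = indices[:start] + indices[start + size:]
            yield train, test
            start += size


splits = list(repeated_k_fold_indices(n_samples=25, n_splits=10, n_repeats=2, seed=0))
print(len(splits))  # 20 = n_splits * n_repeats
```

Since the folds depend on the shuffle, `RepeatedKFold` without a fixed `random_state` gives slightly different scores on each run; passing a seed makes the figures reproducible.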

In [ ]:
