Diabetes Prediction Project (Colab notebook)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
# Load the dataset and preview the first rows
dataset=pd.read_csv('diabetes.csv')
dataset.head()
   Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction  Age  Outcome
1            1       85             66             29        0  26.6                     0.351   31        0
3            1       89             66             23       94  28.1                     0.167   21        0
dataset.columns
dataset.describe()
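describe() shows a minimum of 0 for Glucose, BloodPressure, SkinThickness, Insulin and BMI, which is physiologically impossible and really means "missing". The cell that converts those zeros to NaN is not included in this export; a minimal sketch of the likely step (the column list is an assumption inferred from the null counts printed below):
import numpy as np
# Treat zeros in these clinical columns as missing values
cols_with_zero_as_missing = ['Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI']
dataset[cols_with_zero_as_missing] = dataset[cols_with_zero_as_missing].replace(0, np.nan)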
dataset.isnull().sum()
Pregnancies 0
Glucose 5
BloodPressure 35
SkinThickness 227
Insulin 374
BMI 11
DiabetesPedigreeFunction 0
Age 0
Outcome 0
dtype: int64
print(dataset.head(10))
We need to replace the NaN values with the mean or median of each column: the mean for the roughly symmetric Glucose and BloodPressure distributions, and the median for the skewed SkinThickness, Insulin and BMI columns.
dataset['Glucose'] = dataset['Glucose'].fillna(dataset['Glucose'].mean())
dataset['BloodPressure'] = dataset['BloodPressure'].fillna(dataset['BloodPressure'].mean())
dataset['SkinThickness'] = dataset['SkinThickness'].fillna(dataset['SkinThickness'].median())
dataset['Insulin'] = dataset['Insulin'].fillna(dataset['Insulin'].median())
dataset['BMI'] = dataset['BMI'].fillna(dataset['BMI'].median())
print(dataset)
dataset.hist(figsize=(10,10))
plt.subplot(121)
sns.distplot(dataset['Insulin'])
plt.subplot(122)
dataset['Insulin'].plot.box(figsize=(16, 5))
plt.show()
plt.subplot(121)
sns.distplot(dataset['Glucose'])
plt.subplot(122)
dataset['Glucose'].plot.box(figsize=(16, 5))
plt.show()
plt.subplot(121)
sns.distplot(dataset['BMI'])
plt.subplot(122)
dataset['BMI'].plot.box(figsize=(16, 5))
plt.show()
plt.subplot(121)
sns.distplot(dataset['BloodPressure'])
plt.subplot(122)
dataset['BloodPressure'].plot.box(figsize=(16, 5))
plt.show()
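Note that sns.distplot is deprecated and has been removed from recent seaborn releases. On a newer seaborn the same distribution-plus-boxplot panels can be drawn as below (a sketch assuming a seaborn version that provides histplot, 0.11 or later; not the notebook's original code):
fig, axes = plt.subplots(1, 2, figsize=(16, 5))
sns.histplot(dataset['Insulin'], kde=True, ax=axes[0])  # histogram with KDE overlay, replaces distplot
sns.boxplot(x=dataset['Insulin'], ax=axes[1])           # horizontal box plot
plt.show()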
dataset.corr()
sns.heatmap(dataset.corr(), annot=True)
Feature Scaling
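The cells that split the data and scale the features are not included in this export. A minimal sketch of the assumed steps (the 576/192 train/test row counts in the confusion matrices below imply test_size=0.25; random_state=0 is an assumption):
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Separate features and target
X = dataset.drop('Outcome', axis=1).values
y = dataset['Outcome'].values

# Hold out 25% of the rows for testing (576 train / 192 test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Standardize the features; sc is reused later to transform new samples
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)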
X_train
X_test
y_train
array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0,
0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0,
1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1,
0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1,
0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0,
0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1,
0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0,
0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1,
1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0,
0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1,
0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0,
1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0,
1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0,
1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0,
0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1,
1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1,
1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1,
0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1,
1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1,
0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0,
0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1,
1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0,
1, 0, 0, 0])
y_test
array([1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1,
1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1,
1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1,
0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0,
1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0])
Logistic Regression
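The cell that imports and fits the logistic regression model (and the metric functions used throughout) is not shown; a sketch of the assumed code:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score

lr = LogisticRegression(random_state=0)
lr.fit(X_train, y_train)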
LogisticRegression(random_state=0)
lr_train = lr.predict(X_train)
print(confusion_matrix(y_train,lr_train))
print(accuracy_score(y_train,lr_train))
[[322 48]
[ 89 117]]
0.7621527777777778
lr_test= lr.predict(X_test)
print(confusion_matrix(y_test,lr_test))
print(accuracy_score(y_test,lr_test))
[[117 13]
[ 28 34]]
0.7864583333333334
K-Nearest Neighbors
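The KNN fitting cell is not shown; the repr below confirms n_neighbors=15, so the assumed code is:
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=15)
knn.fit(X_train, y_train)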
KNeighborsClassifier(n_neighbors=15)
knn_train = knn.predict(X_train)
print(confusion_matrix(y_train,knn_train))
print(accuracy_score(y_train,knn_train))
[[332 38]
[ 84 122]]
0.7881944444444444
knn_test= knn.predict(X_test)
print(confusion_matrix(y_test,knn_test))
print(accuracy_score(y_test,knn_test))
[[118 12]
[ 25 37]]
0.8072916666666666
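Decision Tree
The cell that builds the decision tree is not included in this export; a sketch of the assumed code (the hyperparameters are an assumption):
from sklearn.tree import DecisionTreeClassifier

dt = DecisionTreeClassifier(random_state=0)
dt.fit(X_train, y_train)
The perfect 1.0 training accuracy below is typical of an unpruned tree overfitting the training set; the test accuracy is the meaningful figure.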
dt_train=dt.predict(X_train)
confusion_matrix(y_train,dt_train)
accuracy_score(y_train,dt_train)
1.0
dt_pred=dt.predict(X_test)
confusion_matrix(y_test,dt_pred)
accuracy_score(y_test,dt_pred)
0.7135416666666666
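Random Forest
The import for RandomForestClassifier is not shown above; it comes from sklearn.ensemble:
from sklearn.ensemble import RandomForestClassifier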
rfc = RandomForestClassifier(n_estimators=200)
rfc.fit(X_train, y_train)
RandomForestClassifier(n_estimators=200)
rfc_train=rfc.predict(X_train)
print(confusion_matrix(y_train,rfc_train))
print(accuracy_score(y_train,rfc_train))
[[370 0]
[ 0 206]]
1.0
rfc_pred=rfc.predict(X_test)
print(confusion_matrix(y_test,rfc_pred))
print(accuracy_score(y_test,rfc_pred))
[[113 17]
[ 28 34]]
0.765625
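Support Vector Machine
The cell that fits the support vector classifier is not shown; the repr below confirms C=0.25 and random_state=0, so the assumed code is:
from sklearn.svm import SVC

svc = SVC(C=0.25, random_state=0)
svc.fit(X_train, y_train)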
SVC(C=0.25, random_state=0)
svc_train=svc.predict(X_train)
print(confusion_matrix(y_train,svc_train))
print(accuracy_score(y_train,svc_train))
[[343 27]
[ 89 117]]
0.7986111111111112
svc_pred=svc.predict(X_test)
print(confusion_matrix(y_test,svc_pred))
print(accuracy_score(y_test,svc_pred))
[[122 8]
[ 32 30]]
0.7916666666666666
Naive Bayes
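The cell that fits the Gaussian Naive Bayes model is not shown; a sketch of the assumed code:
from sklearn.naive_bayes import GaussianNB

nb = GaussianNB()
nb.fit(X_train, y_train)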
GaussianNB()
nb_train=nb.predict(X_train)
print(confusion_matrix(y_train,nb_train))
print(accuracy_score(y_train,nb_train))
[[305 65]
[ 81 125]]
0.7465277777777778
nb_test=nb.predict(X_test)
print(confusion_matrix(y_test,nb_test))
print(accuracy_score(y_test,nb_test))
[[113 17]
[ 27 35]]
0.7708333333333334
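The cell that printed the following side-by-side listing of predicted vs. actual test labels is not in the export. Which model's predictions were listed is unclear; knn_test is assumed here because the accuracy printed after the listing uses it. A sketch of the usual pattern:
# Column 0: predicted label, column 1: actual label
print(np.concatenate((knn_test.reshape(len(knn_test), 1),
                      y_test.reshape(len(y_test), 1)), axis=1))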
[[1 1]
[0 0]
[0 0]
[1 1]
[0 0]
[0 0]
[1 1]
[1 1]
[0 0]
[0 0]
[1 1]
[1 1]
[0 0]
[0 0]
[0 0]
[0 0]
[1 1]
[0 0]
[0 0]
[0 0]
[0 1]
[0 1]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
[1 0]
[0 0]
[0 0]
[1 0]
[0 0]
[0 0]
[1 0]
[0 0]
[1 1]
[0 1]
[0 0]
[1 0]
[1 0]
[0 0]
[0 0]
[0 0]
[1 1]
[1 1]
[0 0]
[0 0]
[0 1]
[1 1]
[0 1]
[0 0]
[0 0]
[1 1]
[0 0]
[0 0]
[0 0]
[0 0]
[0 1]
print(confusion_matrix(y_test,knn_test))
acc=accuracy_score(y_test,knn_test)
print("Accuracy: {:.2f} %".format(acc*100))
[[118 12]
[ 25 37]]
Accuracy: 80.73 %
print("y_pred")
print((knn.predict([X_test[3]])))
print("y_true")
print(y_test[3])
y_pred
[1]
y_true
1
p=knn.predict(sc.transform([[0,137,40,35,168,43.1,2.228,33]]))
print(p)
if p == 0:
    print("Not Diabetic")
else:
    print("Diabetic")
[1]
Diabetic
import tensorflow as tf
tf.__version__
'2.15.0'
ann= tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=16,activation='relu'))
ann.add(tf.keras.layers.Dense(units=16,activation='relu'))
ann.add(tf.keras.layers.Dense(units=8,activation='relu'))
ann.add(tf.keras.layers.Dense(units=1,activation='sigmoid'))
ann.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
ann.fit(X_train,y_train,batch_size=32,epochs=100)
Epoch 1/100
18/18 [==============================] - 2s 4ms/step - loss: 0.7237 - accuracy: 0.4132
Epoch 2/100
18/18 [==============================] - 0s 4ms/step - loss: 0.6966 - accuracy: 0.5608
Epoch 3/100
18/18 [==============================] - 0s 8ms/step - loss: 0.6826 - accuracy: 0.6372
Epoch 4/100
18/18 [==============================] - 0s 5ms/step - loss: 0.6740 - accuracy: 0.6667
Epoch 5/100
18/18 [==============================] - 0s 5ms/step - loss: 0.6647 - accuracy: 0.6823
Epoch 6/100
18/18 [==============================] - 0s 11ms/step - loss: 0.6552 - accuracy: 0.6944
Epoch 7/100
18/18 [==============================] - 0s 7ms/step - loss: 0.6431 - accuracy: 0.6997
Epoch 8/100
18/18 [==============================] - 0s 7ms/step - loss: 0.6287 - accuracy: 0.7066
Epoch 9/100
18/18 [==============================] - 0s 8ms/step - loss: 0.6100 - accuracy: 0.7101
Epoch 10/100
18/18 [==============================] - 0s 5ms/step - loss: 0.5886 - accuracy: 0.7153
Epoch 11/100
18/18 [==============================] - 0s 6ms/step - loss: 0.5659 - accuracy: 0.7240
Epoch 12/100
18/18 [==============================] - 0s 9ms/step - loss: 0.5456 - accuracy: 0.7361
Epoch 13/100
18/18 [==============================] - 0s 12ms/step - loss: 0.5276 - accuracy: 0.7292
Epoch 14/100
18/18 [==============================] - 0s 8ms/step - loss: 0.5155 - accuracy: 0.7378
Epoch 15/100
18/18 [==============================] - 0s 4ms/step - loss: 0.5053 - accuracy: 0.7517
Epoch 16/100
18/18 [==============================] - 0s 6ms/step - loss: 0.4980 - accuracy: 0.7552
Epoch 17/100
18/18 [==============================] - 0s 5ms/step - loss: 0.4902 - accuracy: 0.7656
Epoch 18/100
18/18 [==============================] - 0s 3ms/step - loss: 0.4851 - accuracy: 0.7622
Epoch 19/100
18/18 [==============================] - 0s 7ms/step - loss: 0.4810 - accuracy: 0.7656
Epoch 20/100
18/18 [==============================] - 0s 16ms/step - loss: 0.4762 - accuracy: 0.7760
Epoch 21/100
18/18 [==============================] - 0s 8ms/step - loss: 0.4714 - accuracy: 0.7743
Epoch 22/100
18/18 [==============================] - 0s 8ms/step - loss: 0.4679 - accuracy: 0.7778
Epoch 23/100
18/18 [==============================] - 0s 7ms/step - loss: 0.4643 - accuracy: 0.7795
Epoch 24/100
18/18 [==============================] - 0s 6ms/step - loss: 0.4628 - accuracy: 0.7778
Epoch 25/100
18/18 [==============================] - 0s 7ms/step - loss: 0.4587 - accuracy: 0.7830
Epoch 26/100
18/18 [==============================] - 0s 8ms/step - loss: 0.4553 - accuracy: 0.7812
Epoch 27/100
18/18 [==============================] - 0s 11ms/step - loss: 0.4521 - accuracy: 0.7882
Epoch 28/100
18/18 [==============================] - 0s 6ms/step - loss: 0.4494 - accuracy: 0.7865
Epoch 29/100
18/18 [==============================] - 0s 8ms/step - loss: 0.4471 - accuracy: 0.7899
a_pred=ann.predict(X_test)
a_pred=(a_pred>0.5)
print(np.concatenate((a_pred.reshape(len(a_pred),1), y_test.reshape(len(y_test),1)),1))
a_train=ann.predict(X_train)
a_train=(a_train>0.5)
print(np.concatenate((a_train.reshape(len(a_train),1), y_train.reshape(len(y_train),1)),1))
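The cell that computed the training-set confusion matrix and accuracy shown below is not in the export; a sketch of the assumed code:
print(confusion_matrix(y_train, a_train))
acc = accuracy_score(y_train, a_train)
print("Accuracy: {:.2f} %".format(acc * 100))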
[[333 37]
[ 55 151]]
Accuracy: 84.03 %
cm=confusion_matrix(y_test,a_pred)
print(cm)
acc=accuracy_score(y_test,a_pred)
print("Accuracy: {:.2f} %".format(acc*100))
[[115 15]
[ 23 39]]
Accuracy: 80.21 %
b=(ann.predict(sc.transform([[0,137,40,35,168,43.1,2.228,33]]))>0.5)
if b == True:
    print("Diabetic")
else:
    print("Not Diabetic")