L6 Tutorial - KNN - Jupyter Notebook

The document outlines a KNN (K Nearest Neighbors) exercise using the Iris dataset, detailing steps to load the data, create a DataFrame, and visualize the features. It includes training and testing a KNN classifier, achieving a high accuracy score of approximately 96.67%. Additionally, it presents a confusion matrix and classification report for performance evaluation.

Uploaded by

Kelvin Loo

KNN Exercise

In [1]: import pandas as pd
        from sklearn.datasets import load_iris
        iris = load_iris()

In [2]: iris.feature_names

Out[2]: ['sepal length (cm)',
         'sepal width (cm)',
         'petal length (cm)',
         'petal width (cm)']

In [3]: iris.target_names

Out[3]: array(['setosa', 'versicolor', 'virginica'], dtype='<U10')

In [4]: df = pd.DataFrame(iris.data, columns=iris.feature_names)
        df.head()

Out[4]:
     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                  5.1               3.5                1.4               0.2
1                  4.9               3.0                1.4               0.2
2                  4.7               3.2                1.3               0.2
3                  4.6               3.1                1.5               0.2
4                  5.0               3.6                1.4               0.2

In [5]: df['target'] = iris.target
        df.head()

Out[5]:
     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
0                  5.1               3.5                1.4               0.2       0
1                  4.9               3.0                1.4               0.2       0
2                  4.7               3.2                1.3               0.2       0
3                  4.6               3.1                1.5               0.2       0
4                  5.0               3.6                1.4               0.2       0


In [6]: df[df.target==1].head()

Out[6]:
     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
50                 7.0               3.2                4.7               1.4       1
51                 6.4               3.2                4.5               1.5       1
52                 6.9               3.1                4.9               1.5       1
53                 5.5               2.3                4.0               1.3       1
54                 6.5               2.8                4.6               1.5       1

In [7]: df[df.target==2].head()

Out[7]:
     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
100                6.3               3.3                6.0               2.5       2
101                5.8               2.7                5.1               1.9       2
102                7.1               3.0                5.9               2.1       2
103                6.3               2.9                5.6               1.8       2
104                6.5               3.0                5.8               2.2       2

In [8]: df['flower_name'] = df.target.apply(lambda x: iris.target_names[x])
        df.head()

Out[8]:
     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target  flower_name
0                  5.1               3.5                1.4               0.2       0       setosa
1                  4.9               3.0                1.4               0.2       0       setosa
2                  4.7               3.2                1.3               0.2       0       setosa
3                  4.6               3.1                1.5               0.2       0       setosa
4                  5.0               3.6                1.4               0.2       0       setosa


In [9]: df[45:55]

Out[9]:
     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target  flower_name
45                 4.8               3.0                1.4               0.3       0       setosa
46                 5.1               3.8                1.6               0.2       0       setosa
47                 4.6               3.2                1.4               0.2       0       setosa
48                 5.3               3.7                1.5               0.2       0       setosa
49                 5.0               3.3                1.4               0.2       0       setosa
50                 7.0               3.2                4.7               1.4       1   versicolor
51                 6.4               3.2                4.5               1.5       1   versicolor
52                 6.9               3.1                4.9               1.5       1   versicolor
53                 5.5               2.3                4.0               1.3       1   versicolor
54                 6.5               2.8                4.6               1.5       1   versicolor

In [10]: # The rows are ordered by class, 50 per species
         df0 = df[:50]      # setosa
         df1 = df[50:100]   # versicolor
         df2 = df[100:]     # virginica

In [11]: import matplotlib.pyplot as plt
         %matplotlib inline

Sepal length vs Sepal Width (Setosa vs Versicolor)


In [12]: plt.xlabel('Sepal Length')
         plt.ylabel('Sepal Width')
         # setosa as green '+', versicolor as blue '.'
         plt.scatter(df0['sepal length (cm)'], df0['sepal width (cm)'], color="green", marker='+')
         plt.scatter(df1['sepal length (cm)'], df1['sepal width (cm)'], color="blue", marker='.')

Out[12]: <matplotlib.collections.PathCollection at 0x20bfb412940>

Petal length vs Petal Width (Setosa vs Versicolor)

In [13]: plt.xlabel('Petal Length')
         plt.ylabel('Petal Width')
         # same markers as the sepal plot: setosa green '+', versicolor blue '.'
         plt.scatter(df0['petal length (cm)'], df0['petal width (cm)'], color="green", marker='+')
         plt.scatter(df1['petal length (cm)'], df1['petal width (cm)'], color="blue", marker='.')

Out[13]: <matplotlib.collections.PathCollection at 0x20bfb521430>

Train test split


In [14]: from sklearn.model_selection import train_test_split

In [15]: # Features: the four measurements; labels: the integer class codes
         X = df.drop(['target','flower_name'], axis='columns')
         y = df.target

In [16]: # Hold out 20% of the rows for testing; random_state fixes the shuffle
         X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

In [17]: len(X_train)

Out[17]: 120

In [18]: len(X_test)

Out[18]: 30

Create KNN (K Nearest Neighbour Classifier)

In [19]: from sklearn.neighbors import KNeighborsClassifier
         # Each prediction is a vote among the 10 nearest training samples
         knn = KNeighborsClassifier(n_neighbors=10)
         knn.fit(X_train, y_train)

Out[19]: KNeighborsClassifier(n_neighbors=10)

In [20]: knn.score(X_test, y_test)

Out[20]: 0.9666666666666667

In [21]: knn.predict([[4.8,3.0,1.5,0.3]])

Out[21]: array([0])
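
What `knn.predict` does for that sample can be sketched in plain Python: measure the Euclidean distance from the query to every training row, keep the k closest rows, and take a majority vote over their labels. This is an illustration only, not scikit-learn's implementation. The helper `knn_predict`, the 15-row training set (the rows printed earlier in this notebook) and `k=3` are assumptions chosen to keep the example small.

```python
import math
from collections import Counter

# Training rows copied from the outputs above: five per class.
train_X = [
    [5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2], [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2], [5.0, 3.6, 1.4, 0.2],   # setosa (0)
    [7.0, 3.2, 4.7, 1.4], [6.4, 3.2, 4.5, 1.5], [6.9, 3.1, 4.9, 1.5],
    [5.5, 2.3, 4.0, 1.3], [6.5, 2.8, 4.6, 1.5],   # versicolor (1)
    [6.3, 3.3, 6.0, 2.5], [5.8, 2.7, 5.1, 1.9], [7.1, 3.0, 5.9, 2.1],
    [6.3, 2.9, 5.6, 1.8], [6.5, 3.0, 5.8, 2.2],   # virginica (2)
]
train_y = [0] * 5 + [1] * 5 + [2] * 5


def knn_predict(query, X, y, k=3):
    """Hypothetical helper: majority vote among the k nearest rows."""
    # Pair every training row's distance to the query with its label,
    # then sort so the closest rows come first.
    neighbours = sorted((math.dist(query, row), label) for row, label in zip(X, y))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


print(knn_predict([4.8, 3.0, 1.5, 0.3], train_X, train_y))  # prints 0 (setosa)
```

The three nearest rows are all setosa, so the vote agrees with Out[21]. The fitted classifier above votes over 10 neighbours drawn from 120 training rows, so its neighbours differ from this toy set even though the idea is the same.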

Plot Confusion Matrix

In [22]: from sklearn.metrics import confusion_matrix
         y_pred = knn.predict(X_test)
         cm = confusion_matrix(y_test, y_pred)
         cm

Out[22]: array([[11,  0,  0],
                [ 0, 12,  1],
                [ 0,  0,  6]], dtype=int64)

In [23]: %matplotlib inline
         import matplotlib.pyplot as plt
         import seaborn as sn
         plt.figure(figsize=(7,5))
         sn.heatmap(cm, annot=True)
         plt.xlabel('Predicted')
         plt.ylabel('Truth')

Out[23]: Text(42.0, 0.5, 'Truth')
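
The score returned by `knn.score` can be read straight off the confusion matrix: rows are the true classes, columns are the predicted classes, so the diagonal counts the correctly classified test samples. A small check using the matrix printed in Out[22], copied here as a plain list (the variable names are illustrative):

```python
# Confusion matrix copied from Out[22]: rows = truth, columns = prediction.
cm = [[11, 0, 0],
      [0, 12, 1],
      [0, 0, 6]]

correct = sum(cm[i][i] for i in range(len(cm)))  # diagonal: 11 + 12 + 6
total = sum(sum(row) for row in cm)              # all test samples
accuracy = correct / total

print(correct, total, round(accuracy, 4))  # prints 29 30 0.9667
```

The single off-diagonal entry, row 1 and column 2, is one versicolor sample predicted as virginica. That one miss accounts for the gap between 29 and 30.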

Print classification report for precision, recall and f1-score for each class

In [24]: from sklearn.metrics import classification_report

         print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        11
           1       1.00      0.92      0.96        13
           2       0.86      1.00      0.92         6

    accuracy                           0.97        30
   macro avg       0.95      0.97      0.96        30
weighted avg       0.97      0.97      0.97        30
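
The per-class figures in the report follow from the confusion matrix. For class k, precision is the diagonal entry divided by its column sum (everything predicted as k), recall is the diagonal entry divided by its row sum (everything that truly is k), and F1 is their harmonic mean. A sketch using the matrix from Out[22]: `per_class_metrics` is a hypothetical helper, and it assumes every class is predicted at least once so that no division by zero occurs.

```python
# Confusion matrix copied from Out[22]: rows = truth, columns = prediction.
cm = [[11, 0, 0],
      [0, 12, 1],
      [0, 0, 6]]


def per_class_metrics(cm, k):
    """Hypothetical helper: (precision, recall, f1) for class k."""
    tp = cm[k][k]                            # correctly predicted as k
    predicted_k = sum(row[k] for row in cm)  # column sum
    actual_k = sum(cm[k])                    # row sum (the "support")
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


for k in range(len(cm)):
    p, r, f = per_class_metrics(cm, k)
    print(k, round(p, 2), round(r, 2), round(f, 2))
```

Rounded to two decimals, the printed values match the precision, recall and f1-score columns of the report: class 1 loses recall (12 of 13 found) and class 2 loses precision (6 of 7 predictions correct) because of the same misclassified sample.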
