Lab 10 - Manual and Assignment On KNN

The document outlines a lab exercise for implementing the k-Nearest Neighbours (k-NN) algorithm using the Iris dataset. It includes steps for loading the dataset, coding the k-NN classifier from scratch, and evaluating model accuracy through train-test split. Additionally, it poses questions regarding the impact of various parameters on model performance and classification outcomes.


AI and ML Laboratory (CS2672)

Lab 10: Classification with k-NN

Objective:
Understand and implement k-Nearest Neighbours (k-NN).

Task:
- Load the Iris dataset using Scikit-learn.

- Write a k-NN classifier from scratch.

- Evaluate the model's accuracy using train-test split.

Introduction to k-Nearest Neighbours (k-NN):


k-NN is a simple, non-parametric, lazy (instance-based) learning algorithm used for classification and regression. It classifies a data point by a majority vote among the classes of its k closest training points.
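The majority-vote idea can be illustrated with a toy example (the labels below are made up for illustration):

```python
from collections import Counter

# Suppose the 3 nearest neighbours of a query point carry these labels:
nearest_labels = ['setosa', 'versicolor', 'setosa']

# The query point is assigned the most common label among its neighbours.
predicted = Counter(nearest_labels).most_common(1)[0][0]
print(predicted)  # setosa
```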

Step-by-Step Implementation:

Step 1: Load the Iris Dataset


The Iris dataset is a standard dataset included in Scikit-learn, containing three classes of iris
flowers (Setosa, Versicolor, and Virginica) with four features each. Use
`sklearn.datasets.load_iris()` to load the dataset.

```python
from sklearn import datasets
import numpy as np

data = datasets.load_iris()
X = data.data
y = data.target
```
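Before implementing the classifier, it is worth inspecting what was loaded: 150 samples, 4 features, and 3 classes.

```python
from sklearn import datasets

data = datasets.load_iris()
print(data.data.shape)          # (150, 4): 150 flowers, 4 measurements each
print(list(data.target_names))  # ['setosa', 'versicolor', 'virginica']
print(data.feature_names)       # the four measured features, in cm
```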

Step 2: Implement k-NN from Scratch


Compute the Euclidean distance between test points and training points. Identify the `k`
nearest neighbors. Assign the most common class among the `k` neighbors to the test point.

```python
import numpy as np
from collections import Counter

def euclidean_distance(x1, x2):
    # Straight-line distance between two feature vectors.
    return np.sqrt(np.sum((x1 - x2) ** 2))

class KNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X_train, y_train):
        # k-NN is a lazy learner: "fitting" just stores the training data.
        self.X_train = X_train
        self.y_train = y_train

    def predict(self, X_test):
        predictions = [self._predict(x) for x in X_test]
        return np.array(predictions)

    def _predict(self, x):
        # Distances from x to every training point.
        distances = [euclidean_distance(x, x_train) for x_train in self.X_train]
        # Indices of the k closest training points.
        k_indices = np.argsort(distances)[:self.k]
        k_nearest_labels = [self.y_train[i] for i in k_indices]
        # Majority vote among the k nearest labels.
        most_common = Counter(k_nearest_labels).most_common(1)
        return most_common[0][0]
```
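The core of `_predict` (sort distances, take the k smallest, vote) can be sanity-checked in isolation; the distances and labels below are hand-made for illustration:

```python
import numpy as np
from collections import Counter

distances = [4.0, 0.5, 2.0, 0.1, 3.0]  # distance from the query to each training point
labels = [0, 1, 0, 1, 2]               # class label of each training point

k_indices = np.argsort(distances)[:3]            # indices of the 3 smallest distances
k_labels = [labels[i] for i in k_indices]        # labels at indices 3, 1, 2 -> [1, 1, 0]
print(Counter(k_labels).most_common(1)[0][0])    # majority label: 1
```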

Step 3: Evaluate the Model


Split the dataset into training and testing sets. Train the model and predict the test set
labels. Compute the accuracy.

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

knn = KNN(k=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy * 100:.2f}%')
```
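As a cross-check on the scratch implementation, Scikit-learn's built-in `KNeighborsClassifier` (which defaults to the Euclidean metric) can be run with the same k on the same split; the two should agree:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

data = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42)

clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)
print(f'sklearn accuracy: {accuracy_score(y_test, clf.predict(X_test)) * 100:.2f}%')
```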

Assignment:
1. What is the impact of choosing different values of k on the model's accuracy?

2. How does k-NN handle multi-class classification problems?

3. How does feature scaling affect the performance of k-NN?

4. What other distance metrics can be used instead of Euclidean distance, and how do
they impact classification?

5. How does the choice of train-test split ratio affect the model’s performance?

6. What happens when k is too small or too large?
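As a starting point for question 1, accuracy can be measured over a range of k values using the built-in classifier (the range 1 to 15 is an arbitrary choice for this sketch):

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

data = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42)

for k in range(1, 16, 2):  # odd k values avoid some voting ties
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f'k={k:2d}  accuracy={acc * 100:.2f}%')
```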
