PML Lab Exp 11
Aim
To implement the K-Nearest Neighbor (KNN) algorithm to classify a dataset and evaluate the classification
accuracy.
Algorithm
1. Read the dataset from a CSV file and load it into a pandas DataFrame.
3. Normalize the feature data to ensure all features are on the same scale.
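The normalization step can be illustrated with a minimal sketch. It assumes standardization (zero mean, unit variance per feature), which is what StandardScaler computes; the small X_demo matrix is hypothetical and is not taken from the lab dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: columns stand for study hours and attendance (%)
X_demo = np.array([[2.0, 60.0], [4.0, 75.0], [6.0, 90.0], [8.0, 95.0]])

# Standardize by hand: subtract each column's mean, divide by its standard deviation
manual = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)

# StandardScaler performs the same computation (population standard deviation)
scaled = StandardScaler().fit_transform(X_demo)

print(np.allclose(manual, scaled))
print(scaled.mean(axis=0).round(6), scaled.std(axis=0).round(6))
```

Without this step, a feature with a large numeric range (such as attendance in percent) would dominate the Euclidean distances that KNN relies on.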
Program
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
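# Load the dataset and separate the features from the target labels
data = pd.read_csv('student_performance_knn.csv')
X = data[['StudyHours', 'Attendance']].values
# Encode the ordinal class labels as integers
y = data['Performance'].map({'Low': 0, 'Medium': 1, 'High': 2}).values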
AML311 - PML Lab
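# Split into training and test sets (test_size and random_state are assumed values)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Fit the scaler on the training set only, then apply it to both sets
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)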
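# Train a 5-nearest-neighbour classifier and predict the test labels
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
# Compute the metrics (macro averaging for precision and recall is an assumed choice)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='macro', zero_division=0)
recall = recall_score(y_test, y_pred, average='macro', zero_division=0)
conf_matrix = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")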
plt.figure(figsize=(6, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Low', 'Medium', 'High'], yticklabels=['Low', 'Medium', 'High'])
plt.title('Confusion Matrix')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.savefig('confusion_matrix_knn.png')
plt.close()
Result
The K-Nearest Neighbor (KNN) classifier was successfully implemented to classify the given dataset. The
accuracy, precision, and recall values were calculated.
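As a rough illustration of how these three metrics relate to a confusion matrix, the sketch below derives them by hand. The matrix cm is hypothetical (it is not the result of this experiment), and macro averaging over the three classes is an assumption.

```python
import numpy as np

# Hypothetical 3x3 confusion matrix: rows are actual classes, columns are predicted
# classes, in the order Low, Medium, High. These counts are illustrative only.
cm = np.array([[5, 1, 0],
               [2, 6, 1],
               [0, 1, 4]])

correct = np.diag(cm)                      # correctly classified samples per class
accuracy = correct.sum() / cm.sum()        # fraction of all samples classified correctly
precision = correct / cm.sum(axis=0)       # per class: correct / all predicted as that class
recall = correct / cm.sum(axis=1)          # per class: correct / all actually in that class

print(f"Accuracy: {accuracy:.2f}")
print(f"Macro precision: {precision.mean():.2f}")
print(f"Macro recall: {recall.mean():.2f}")
```

Reading precision down the columns and recall along the rows matches the heatmap layout used in the program, where the x-axis is Predicted and the y-axis is Actual.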
2 CEK
data. If K is too large, the model may oversimplify the decision boundary, leading to underfitting
and reduced accuracy.
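One way to observe the effect of K is to train the classifier for several values and compare training and test accuracy. The sketch below is only an illustration: it uses synthetic data from make_classification in place of the lab dataset, and the K values, split ratio, and random seeds are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# Synthetic three-class, two-feature data (stands in for the lab dataset)
X_syn, y_syn = make_classification(n_samples=300, n_features=2, n_informative=2,
                                   n_redundant=0, n_classes=3,
                                   n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.2, random_state=0)

# Scale using statistics from the training set only
scaler_syn = StandardScaler()
X_tr = scaler_syn.fit_transform(X_tr)
X_te = scaler_syn.transform(X_te)

# Record (train accuracy, test accuracy) for a range of K values
results = {}
for k in [1, 5, 15, 51, 151]:
    model = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    results[k] = (model.score(X_tr, y_tr), model.score(X_te, y_te))
    print(f"K={k:3d}  train={results[k][0]:.2f}  test={results[k][1]:.2f}")
```

With K=1 each training point is its own nearest neighbour, so the training accuracy is perfect regardless of how well the model generalizes. As K approaches the size of the training set, predictions drift toward the majority class, which is the underfitting behaviour described above.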