
K-NN and Perceptron

The document outlines a mini course on Machine Learning focusing on mathematical foundations and practical applications, specifically using a dataset for credit approval. It details the use of k-NN and Perceptron algorithms for classification, emphasizing the importance of accuracy measurement and data normalization. The course includes instructions for implementing these algorithms in MATLAB and evaluating their performance on training and testing datasets.

Uploaded by

dienb2203806

Can Tho University, Department of Mathematics

Mini Course:
Machine Learning - Mathematical Foundation and Practical Applications

The PDF file, CreditApproval.pdf, describes the features of the data set. Note that the
values in the data set were changed by the authors to protect the confidentiality of the data. Use the
first 500 data points for the training set and the rest for the testing set.
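
The split described above could be sketched as follows. This is a minimal illustration in Python rather than MATLAB; the function name `split_train_test` and the assumption that the data has already been loaded as a list of rows are hypothetical and not part of the assignment.

```python
def split_train_test(rows, n_train=500):
    """Return (training rows, testing rows).

    The first n_train rows form the training set and the remaining
    rows form the testing set. `rows` is assumed to be a list that
    has already been loaded from the data file (loading is not shown).
    """
    return rows[:n_train], rows[n_train:]
```

The rows are taken in file order, without shuffling, since the text asks for the first 500 data points.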

1. k-NN.
The training data set is used to determine the labels of the data in the testing set.

• Using k = n, where n is the number of data points in the training set, determine which of
the norms (Euclidean, Manhattan, or Mahalanobis) gives the best accuracy on the
testing set. The accuracy is defined as the number of data points that the algorithm
classifies correctly divided by the total number of data points. Do
this for two cases:
– Data is used unchanged.
– Data is normalized using the mean

$$\mu_j = \frac{1}{n} \sum_{i=1}^{n} x_j(i)$$

and the variance

$$\sigma_j^2 = \frac{1}{n} \sum_{i=1}^{n} \left( x_j(i) - \mu_j \right)^2$$

as follows:

$$x_j(i) = \frac{x_j(i) - \mu_j}{\sigma_j^2},$$

where $x_j(i)$ is the $j$th feature of the $i$th data point.
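
A minimal sketch of this normalization in plain Python is given below. It divides by the variance, as the formula above states, rather than by the standard deviation used in the more common z-score. Two choices are assumptions of this sketch and not part of the assignment: the statistics are computed from the training set only and then applied to any data set, and a feature with zero variance is only centered to avoid division by zero. The function names are hypothetical.

```python
def feature_stats(train):
    """Per-feature mean and variance (1/n convention) of the training rows."""
    n = len(train)
    d = len(train[0])
    means = [sum(row[j] for row in train) / n for j in range(d)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in train) / n for j in range(d)
    ]
    return means, variances


def normalize(data, means, variances):
    """Apply x_j <- (x_j - mu_j) / sigma_j^2 to every row of `data`.

    Assumption: a feature with zero variance is only centered.
    """
    return [
        [(x - m) / (v if v > 0 else 1.0) for x, m, v in zip(row, means, variances)]
        for row in data
    ]
```

For example, `means, variances = feature_stats(train)` followed by `normalize(test, means, variances)` rescales the testing set with the training statistics.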

Note: If you write your code in MATLAB, you can use the MATLAB routines sort (for
sorting; its second output, often named idx, keeps track of the data indices) and mode (for
finding the majority label among the nearest neighbors).
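
For readers not using MATLAB, the same steps (compute distances, sort while keeping indices, take the majority label) might look like the Python sketch below. It covers only the Euclidean and Manhattan distances; the Mahalanobis distance is left out because it also needs the inverse covariance matrix of the data. All function names are hypothetical, and ties in the vote are broken in favor of the label met first among the nearest neighbors, which may differ from MATLAB's mode.

```python
from collections import Counter


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k, dist):
    """Label of `query` by majority vote among its k nearest training points."""
    distances = [dist(row, query) for row in train_x]
    # Sorted indices play the role of the idx output of MATLAB's sort.
    idx = sorted(range(len(train_x)), key=lambda i: distances[i])
    nearest_labels = [train_y[i] for i in idx[:k]]
    # The most common label plays the role of MATLAB's mode.
    return Counter(nearest_labels).most_common(1)[0][0]


def knn_accuracy(train_x, train_y, test_x, test_y, k, dist):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_x, train_y, q, k, dist) == y
        for q, y in zip(test_x, test_y)
    )
    return correct / len(test_y)
```

Passing the distance as a function argument makes it easy to compare norms by calling `knn_accuracy` once per distance on the same split.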

2. Perceptron.
The training set is used to determine the best weights. Once the best weights
are determined, apply them to the testing set, which you can think of as future credit
card applications, except that the labels are known, so you can measure how well the
weights computed from the training set classify them by calculating the
accuracy. Set the number of epochs to 1000. For the output, report the accuracy on the
training data as well as on the testing data. Do this for the following cases:

• Data is used unchanged.

• Data is normalized as above.
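
One way to read the perceptron task is sketched below in plain Python. Several details are assumptions of this sketch, not requirements of the assignment: labels are encoded as +1 and -1, a bias term is stored in `w[0]`, the learning rate is 1, and "best weights" is taken to mean the weights with the highest training accuracy seen at the end of any epoch (a pocket-style variant of the classic perceptron). The function names are hypothetical.

```python
def predict(w, x):
    """Predicted label (+1 or -1); w[0] is the bias term."""
    s = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if s >= 0 else -1


def accuracy(w, xs, ys):
    """Fraction of points in (xs, ys) that the weights w classify correctly."""
    return sum(predict(w, x) == y for x, y in zip(xs, ys)) / len(ys)


def train_perceptron(xs, ys, epochs=1000):
    """Run the perceptron update rule and return the best weights seen."""
    w = [0.0] * (len(xs[0]) + 1)
    best_w, best_acc = list(w), accuracy(w, xs, ys)
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(xs, ys):
            if predict(w, x) != y:
                # Classic update: move the weights toward the misclassified point.
                w[0] += y
                for j, xj in enumerate(x):
                    w[j + 1] += y * xj
                mistakes += 1
        acc = accuracy(w, xs, ys)
        if acc > best_acc:
            best_w, best_acc = list(w), acc
        if mistakes == 0:
            break  # every training point is classified correctly
    return best_w
```

With the weights returned by `train_perceptron(train_x, train_y)`, calling `accuracy` on the training set and then on the testing set gives the two numbers the task asks to report.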
