Assignment 2

Question 1: What is the KNN algorithm?

(A) The KNN algorithm is non-parametric and does not make
assumptions about the underlying distribution of the data.
(B) KNN works by finding the K closest data points (neighbors) to the
query point and predicting the output based on the labels of these
neighbors.
(C) The KNN algorithm is a lazy machine learning algorithm for
classification and regression tasks. It can work well with both binary
and multi-class classification problems.
(D) All of the above
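
To make option (D) concrete, here is a minimal from-scratch sketch of KNN classification (NumPy only; the function name knn_predict and the toy data are illustrative, not part of the assignment):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    """Predict the label of x_query by majority vote of its k nearest neighbors."""
    # Lazy learning: there is no training step -- distances are measured at query time.
    dists = np.linalg.norm(X_train - x_query, axis=1)       # Euclidean distances
    nearest = np.argsort(dists)[:k]                         # indices of the k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]   # majority vote

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.1, 4.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))  # -> 0
```

Note that nothing is "fitted": the stored training data itself is the model, which is what makes KNN lazy and non-parametric.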

Question 2: Euclidean and Minkowski distances are the most commonly
used distance metrics in the KNN algorithm. What are the other
distance metrics used in the KNN algorithm?

(A) Cosine distance
(B) Haversine distance
(C) Manhattan distance
(D) All of the above
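
A sketch of the metrics named above, using SciPy where it provides them and a hand-rolled haversine (the helper name and the Berlin/Paris coordinates are illustrative assumptions):

```python
import numpy as np
from scipy.spatial.distance import euclidean, cityblock, cosine, minkowski

u, v = np.array([1.0, 2.0]), np.array([4.0, 6.0])
print(euclidean(u, v))       # sqrt(3^2 + 4^2) = 5.0
print(cityblock(u, v))       # |3| + |4| = 7.0 (Manhattan)
print(cosine(u, v))          # 1 - cosine similarity (angle-based)
print(minkowski(u, v, p=3))  # generalizes Manhattan (p=1) and Euclidean (p=2)

def haversine(lat1, lon1, lat2, lon2, r=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * np.arcsin(np.sqrt(a))

print(haversine(52.52, 13.40, 48.85, 2.35))  # Berlin to Paris, roughly 880 km
```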

Question 3: What are the disadvantages of using the KNN algorithm?

(A) As the number of dimensions increases, the distance between any
two points in the space becomes increasingly large, making it difficult
to find meaningful nearest neighbors.
(B) Computationally expensive, especially for large datasets, and
requires a large amount of memory to store the entire dataset.
(C) Sensitive to the choice of K and distance metric.
(D) All of the above
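
Option (A) describes the curse of dimensionality, which a quick experiment can illustrate: as the number of dimensions grows, the gap between the nearest and farthest neighbor shrinks relative to the distances themselves, so "nearest" carries less information (a sketch with random data, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    X = rng.random((1000, d))   # 1000 random points in the d-dimensional unit cube
    q = rng.random(d)           # a random query point
    dists = np.linalg.norm(X - q, axis=1)
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:4d}  relative contrast = {contrast:.3f}")  # shrinks as d grows
```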

Question 4: How do you choose the value of K (the number of neighbors
to consider) in the KNN algorithm? (Select two)

(A) A small value of K, for example, K=1, will result in a more flexible
model but may be prone to overfitting.
(B) A large value of K, for example, K=n, where n is the size of the
dataset, will result in a more stable model but may not capture the
local variations in the data.
(C) A large value of K, for example, K=n, where n is the size of the
dataset, will result in a more flexible model but may be prone to
overfitting.
(D) A small value of K, for example, K=1, will result in a more stable
model but may not capture the local variations in the data.
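
Options (A) and (B) describe the standard trade-off; in practice K is usually chosen empirically, for example by cross-validation. A sketch assuming scikit-learn and its bundled Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for k in (1, 5, 15, 50):
    # 5-fold cross-validated accuracy for each candidate K
    score = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print(f"k={k:3d}  mean CV accuracy = {score:.3f}")
```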

Question 5: How do you handle imbalanced data in the KNN algorithm?

(A) Weighted voting, where the vote of each neighbor is weighted by its
inverse distance to the query point. This gives more weight to the
closer neighbors and less weight to the farther neighbors, which can
help to reduce the effect of the majority class.
(B) Oversample the minority class.
(C) Undersample the majority class.
(D) All of the above.
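
Option (A)'s weighted voting is built into scikit-learn as weights="distance". The toy data below (a hypothetical 9-vs-3 class imbalance) shows uniform voting being swamped by the majority class while inverse-distance weighting is not:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Imbalanced 1-D toy data: 9 majority-class points near 0-4, 3 minority points near 9-10.
X = np.array([[0.0], [0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0],
              [9.0], [9.5], [10.0]])
y = np.array([0] * 9 + [1] * 3)

uniform = KNeighborsClassifier(n_neighbors=7, weights="uniform").fit(X, y)
weighted = KNeighborsClassifier(n_neighbors=7, weights="distance").fit(X, y)

# Query near the minority cluster: with k=7 the uniform vote includes four
# majority points and flips to class 0; distance weighting keeps it at 1.
print(uniform.predict([[8.5]]), weighted.predict([[8.5]]))  # [0] [1]
```

Oversampling and undersampling (options (B) and (C)) are typically done as a preprocessing step before fitting, e.g. with the third-party imbalanced-learn package.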

Question 6: How would you choose the distance metric in KNN?

(A) Euclidean distance is a good default choice for continuous data. It
works well when the data is dense and the differences between features
are important.
(B) Manhattan distance is a good choice when the data has many
outliers or when the scale of the features is different. For example, if we
are comparing distances between two cities, the distance metric should
not be affected by the difference in elevation or terrain between the
cities.
(C) Minkowski distance with p=1 is equivalent to Manhattan distance,
and Minkowski distance with p=2 is equivalent to Euclidean distance.
Minkowski distance allows you to control the order of the distance
metric based on the nature of the problem.
(D) All of the above
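
Option (C)'s equivalences are easy to verify, and they map directly onto scikit-learn's KNeighborsClassifier, whose default metric is Minkowski with p=2 (a quick check using SciPy):

```python
import numpy as np
from scipy.spatial.distance import cityblock, euclidean, minkowski

u, v = np.array([1.0, 2.0, 3.0]), np.array([4.0, 0.0, 3.0])
print(minkowski(u, v, p=1), cityblock(u, v))  # both 5.0 (Manhattan)
print(minkowski(u, v, p=2), euclidean(u, v))  # both sqrt(13), about 3.606
```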

Question 7: What are the ideal use cases for KNN?

(A) KNN is best suited for small to medium-sized datasets with
relatively low dimensionality. It can be useful in situations where the
decision boundary is linear. It can be effective in cases where the data
is clustered or has distinct groups.
(B) KNN is best suited for large datasets with relatively high
dimensionality. It can be useful when the decision boundary is highly
irregular or nonlinear. It can be effective in cases where the data is
clustered or has distinct groups.
(C) KNN is best suited for small to medium-sized datasets with
relatively low dimensionality. It can be useful when the decision
boundary is highly irregular or nonlinear. It can be effective in cases
where the data is clustered or has distinct groups.
(D) KNN is best suited for small to medium-sized datasets with
relatively low dimensionality. It can be useful when the decision
boundary is highly irregular or nonlinear. It can be effective in cases
where the data is not clustered or doesn’t have distinct groups.

Question 8: How does the KNN algorithm work? (Select two)

(A) KNN works by calculating the distance between a data point and all
other points in the dataset. Then, KNN selects the k-nearest neighbors.
For regression, the most common class among the ‘k’ neighbors is
assigned as the predicted class for the new data point.
(B) KNN works by calculating the distance between a data point and all
other points in the dataset. Then, KNN selects the k-nearest neighbors.
For classification, it averages the values of the most common class
among the ‘k’ neighbors to predict the target data point.
(C) KNN works by calculating the distance between a data point and all
other points in the dataset. Then, KNN selects the k-nearest neighbors.
For classification, the most common class among the ‘k’ neighbors is
assigned as the predicted class for the new data point.
(D) KNN works by calculating the distance between a data point and all
other points in the dataset. Then, KNN selects the k-nearest neighbors.
For regression tasks, instead of a majority vote, the algorithm takes the
average of the ‘k’ nearest neighbors’ values as the prediction.
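
Of these options, (C) and (D) match how KNN actually behaves: the neighbor search is identical, and only the aggregation differs. A sketch with scikit-learn's classifier and regressor on toy data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_class = np.array([0, 0, 0, 1, 1, 1])
y_value = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
reg = KNeighborsRegressor(n_neighbors=3).fit(X, y_value)
print(clf.predict([[2.5]]))  # majority vote of labels {0, 0, 0} -> 0
print(reg.predict([[2.5]]))  # mean of values {1.0, 2.0, 3.0} -> 2.0
```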

Question 9: What’s the bias and variance trade-off for KNN? (Select two)

(A) A small ‘k’ results in a low bias but high variance (the model is
sensitive to noise).
(B) A large ‘k’ results in a low bias but high variance (the model is
sensitive to noise).
(C) A large ‘k’ leads to high bias but low variance (smoothing over the
data).
(D) A small ‘k’ leads to high bias but low variance (smoothing over the
data).
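
Options (A) and (C) state the trade-off correctly, and it shows up directly in train/test accuracy (a sketch assuming scikit-learn's Iris dataset):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
for k in (1, 5, 25, 100):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    # k=1: perfect train accuracy (high variance); large k: both scores sag (high bias)
    print(f"k={k:3d}  train={clf.score(X_tr, y_tr):.3f}  test={clf.score(X_te, y_te):.3f}")
```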

Question 10: Which options are correct about instance-based learning,
model-based learning, and online learning? (Select two)

(A) KNN is an instance-based learning algorithm, meaning it
memorizes the entire training dataset and makes predictions based on
similarity to instances. That’s why KNN is not naturally suited for
online learning because it memorizes the entire training dataset. When
new data is added, the entire model needs to be recalculated.
(B) Model-based learning involves learning a mapping from inputs to
outputs and generalizing to new, unseen data. For example, SVM,
Decision Trees, etc.
(C) KNN is a model-based learning algorithm, meaning it memorizes
the entire training dataset and makes predictions based on similarity to
instances. That’s why KNN is not naturally suited for online learning
because it memorizes the entire training dataset. When new data is
added, the entire model needs to be recalculated.
(D) Instance-based learning involves learning a mapping from inputs
to outputs and generalizing to new, unseen data. For example, SVM,
Decision Trees, etc.
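
One way to see the distinction in options (A) and (B) is through scikit-learn's API: models that support online learning expose partial_fit for incremental updates, while KNeighborsClassifier does not (its fit essentially just stores the training set):

```python
from sklearn.linear_model import SGDClassifier
from sklearn.neighbors import KNeighborsClassifier

print(hasattr(KNeighborsClassifier(), "partial_fit"))  # False: instance-based, refit to add data
print(hasattr(SGDClassifier(), "partial_fit"))         # True: supports incremental (online) updates
```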
