
K-Nearest Neighbors (KNN) Regression Explained

KNN regression is a non-parametric method used for estimating the continuous value of a
data point based on the values of its k-nearest neighbors in the feature space. Unlike
parametric methods that assume a specific functional form for the relationship between
features and the target variable, KNN regression makes no such assumptions. It relies directly
on the proximity of data points.

Here's a step-by-step breakdown of how KNN regression works:

1. Choose the Number of Neighbors (K):

• The first crucial step is to select the value of 'k', which represents the number of
nearest neighbors to consider for making a prediction.
• The choice of 'k' significantly impacts the model's performance.
o Small k (e.g., k=1): The prediction is highly influenced by the single closest
neighbor. This can lead to noisy predictions that are sensitive to local data
variations and outliers (high variance, low bias).
o Large k (e.g., k close to the number of data points): The prediction tends to be
smoother as it averages over a larger neighborhood. This can smooth out noise but
might also average out important local patterns, potentially leading to underfitting
(low variance, high bias).
• The optimal 'k' is usually found through experimentation and techniques like cross-
validation.
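
For illustration, 'k' can be tuned with k-fold cross-validation. The sketch below uses
scikit-learn's KNeighborsRegressor with cross_val_score; the dataset and the candidate
values of k are made up for this illustration only.

import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_score

# Assumed toy data: temperature feature (°C) and efficiency target
X = np.array([[50], [100], [150], [200], [250], [300]])
y = np.array([0.40, 0.60, 0.55, 0.70, 0.72, 0.68])

best_k, best_score = None, -np.inf
for k in (1, 2, 3):                          # candidate neighbor counts
    model = KNeighborsRegressor(n_neighbors=k)
    # 3-fold cross-validation, scored by negative mean squared error
    score = cross_val_score(model, X, y, cv=3,
                            scoring="neg_mean_squared_error").mean()
    if score > best_score:
        best_k, best_score = k, score

print("best k:", best_k)

The value of k with the smallest average validation error is kept.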

2. Calculate Distances:

• When you have a new data point for which you want to make a prediction, the KNN
algorithm calculates the distance between this new point and all the data points in
your training set.
• Common distance metrics include the Euclidean distance and the Manhattan
distance; for a single numeric feature, both reduce to the absolute difference
|x₁ - x₂|.
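
As a quick illustration with two made-up points, both distances can be computed directly
with NumPy; in one dimension they coincide with the absolute difference used in the
worked example below.

import numpy as np

a = np.array([135.0])   # query point, e.g. an engine temperature in °C
b = np.array([150.0])   # one training point

euclidean = np.sqrt(np.sum((a - b) ** 2))   # 15.0
manhattan = np.sum(np.abs(a - b))           # 15.0 (identical in one dimension)
print(euclidean, manhattan)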



3. Find the K-Nearest Neighbors:

• After calculating the distances to all training data points, the algorithm identifies the
'k' data points in the training set that have the smallest distances to the new data point.
These are the k-nearest neighbors.

4. Aggregate the Target Values of the Neighbors:

• Once the k-nearest neighbors are identified, their corresponding target (dependent
variable) values are used to make a prediction for the new data point.
• The most common aggregation method for regression is:
o Simple Averaging: The predicted value is the mean of the target values of
the k-nearest neighbors (a minimal sketch of this prediction step follows).
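
Putting steps 2-4 together, the whole prediction fits in a few lines. The function below
is an illustrative NumPy sketch (names such as knn_regress are invented for this
example), not the implementation of any particular library.

import numpy as np

def knn_regress(X_train, y_train, x_query, k=2):
    """Predict a continuous target as the mean of the k nearest neighbors."""
    # Step 2: Euclidean distance from the query point to every training point
    distances = np.sqrt(np.sum((X_train - x_query) ** 2, axis=1))
    # Step 3: indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Step 4: simple average of the neighbors' target values
    return y_train[nearest].mean()

Because prediction only averages stored training points, KNN regression has no training
phase beyond memorizing the data; all the work happens at prediction time.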

Problem: Predict the efficiency at 135°C using the K-Nearest Neighbors (KNN)
algorithm.

Data:

Engine Temperature (°C)    Efficiency
50                         0.4
100                        0.6
150                        0.55
200                        0.7

Steps:
1. Choose a value for K:
o K represents the number of nearest neighbors we'll consider. Let's start with K
= 2 for this example.

2. Calculate the distance between the query point (135°C) and each data
point:
o For this one-dimensional data, the Euclidean distance reduces to the
absolute difference:
▪ Distance = |x₁ - x₂|

o Distance from 135°C to:
▪ 50°C: |135 - 50| = 85
▪ 100°C: |135 - 100| = 35
▪ 150°C: |135 - 150| = 15
▪ 200°C: |135 - 200| = 65
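
These four absolute differences can be checked in one line of NumPy (illustrative):

import numpy as np
print(np.abs(np.array([50, 100, 150, 200]) - 135))   # [85 35 15 65]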

3. Identify the K-Nearest Neighbors:


o The distances are: 85, 35, 15, and 65.
o The two smallest distances are 15 and 35.
o Therefore, the 2-nearest neighbors to 135°C are:
▪ 150°C with 0.55 efficiency
▪ 100°C with 0.6 efficiency

4. Calculate the predicted efficiency:


o Since we're using K = 2, we average the efficiency values of the two nearest
neighbors:
▪ Predicted efficiency = (0.55 + 0.6) / 2 = 1.15 / 2 = 0.575

Result:
• Using KNN with K = 2, the predicted efficiency at 135°C is 0.575.
• If we had chosen K = 1, the nearest neighbor would be 150°C, and the predicted
efficiency would be 0.55.
• If we had chosen K = 3, the three nearest neighbors would be 100°C, 150°C,
and 200°C. The predicted efficiency would be (0.6 + 0.55 + 0.7) / 3 ≈ 0.6167.
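
The same numbers can be reproduced with scikit-learn's KNeighborsRegressor, which
implements this neighbor-averaging scheme directly; the short check below uses only the
four data points from the table.

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

X = np.array([[50], [100], [150], [200]])   # engine temperature (°C)
y = np.array([0.4, 0.6, 0.55, 0.7])         # efficiency

for k in (1, 2, 3):
    model = KNeighborsRegressor(n_neighbors=k).fit(X, y)
    print(k, model.predict([[135]])[0])     # 0.55, 0.575, then ~0.6167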

