
Unit-2

Supervised Learning
Syllabus
• Supervised Learning:
• Learning a Class from Examples, Learning Multiple Classes, Regression, Model Selection and Generalization,
Dimensions of Supervised Machine Learning Algorithm,
• Non-Parametric Methods: Histogram Estimator, Kernel Estimator, K-Nearest Neighbor Estimator
Learning a class from examples
• Class learning is finding a description that is shared by all the
positive examples.
• For example, given a car that we have not seen before, by checking it
against the learned description we can say whether or not it is a
family car.
Learning a Class from Examples
• Class C of a “family car”
– Prediction: Is car x a family car?
– Knowledge extraction: What do people expect from a family car?
• Output:
Positive (+) and negative (–) examples
• Input representation:
x₁: price, x₂: engine power

Training set X

$$\mathcal{X} = \{\mathbf{x}^t, r^t\}_{t=1}^{N}$$

$$r = \begin{cases} 1 & \text{if } \mathbf{x} \text{ is a positive example} \\ 0 & \text{if } \mathbf{x} \text{ is a negative example} \end{cases}$$

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
Class C

$$(p_1 \le \text{price} \le p_2) \;\text{AND}\; (e_1 \le \text{engine power} \le e_2)$$
Hypothesis class H

$$h(\mathbf{x}) = \begin{cases} 1 & \text{if } h \text{ classifies } \mathbf{x} \text{ as positive} \\ 0 & \text{if } h \text{ classifies } \mathbf{x} \text{ as negative} \end{cases}$$

Error of h on X

$$E(h \mid \mathcal{X}) = \sum_{t=1}^{N} \mathbb{1}\left(h(\mathbf{x}^t) \ne r^t\right)$$
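A minimal Python sketch of this error count. The rectangle hypothesis h and its thresholds (p1, p2, e1, e2) are illustrative assumptions, not values from the slides:

```python
def empirical_error(h, X, r):
    """E(h|X): number of training instances where h disagrees with the label r^t."""
    return sum(1 for x_t, r_t in zip(X, r) if h(x_t) != r_t)

# Hypothetical axis-aligned rectangle hypothesis; thresholds are illustrative only.
def h(x, p1=15000, p2=30000, e1=100, e2=200):
    price, power = x
    return 1 if (p1 <= price <= p2) and (e1 <= power <= e2) else 0

X = [(20000, 150), (40000, 250), (18000, 120)]  # (price, engine power)
r = [1, 0, 0]
print(empirical_error(h, X, r))  # -> 1 (the third instance is misclassified by h)
```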
Learning Multiple Classes
• In the example discussed earlier, the positive examples belong to the
family-car class and the negative examples to all other cars.
• This is a two-class problem.
• In the general case we have K classes, denoted Cᵢ, i = 1, …, K, and an
input instance belongs to one and exactly one of them.
• The next slide shows an example with instances from 3 classes: family
car, sports car and luxury sedan.
Multiple Classes, Cᵢ, i = 1, …, K
K is the number of classes and N the number of training instances.

$$\mathcal{X} = \{\mathbf{x}^t, \mathbf{r}^t\}_{t=1}^{N}$$

$$r_i^t = \begin{cases} 1 & \text{if } \mathbf{x}^t \in C_i \\ 0 & \text{if } \mathbf{x}^t \in C_j,\ j \ne i \end{cases}$$

Train hypotheses hᵢ(x), i = 1, …, K:

$$h_i(\mathbf{x}^t) = \begin{cases} 1 & \text{if } \mathbf{x}^t \in C_i \\ 0 & \text{if } \mathbf{x}^t \in C_j,\ j \ne i \end{cases}$$
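As an illustration, here is a small Python sketch (not from the slides) that builds the K binary label vectors rᵢᵗ from class indices, one vector per class:

```python
def one_vs_all_labels(labels, K):
    """Convert class indices (0..K-1) into K binary label vectors r_i^t:
    r_i^t = 1 if x^t belongs to class C_i, else 0."""
    return [[1 if y == i else 0 for y in labels] for i in range(K)]

# Three classes: 0 = family, 1 = sports, 2 = luxury sedan (illustrative labels)
labels = [0, 2, 1, 0]
for i, r_i in enumerate(one_vs_all_labels(labels, K=3)):
    print(f"class {i}: {r_i}")
# class 0: [1, 0, 0, 1]
# class 1: [0, 0, 1, 0]
# class 2: [0, 1, 0, 0]
```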
Regression
• Regression models are used to predict a continuous value.
• Predicting the price of a house given features of the house, such as its
size and location, is one of the common examples of regression. It is a
supervised technique.
• Classification is applied to discrete values, regression to continuous
values.
• Examples: age vs. height, temperature of a city, house price.
Regression vs classification
Supervised vs. unsupervised learning
Underfitting vs. Overfitting
Model Selection and Generalization
• What is a model? An ML algorithm trained with training data is usually
referred to as a hypothesis or model.
• For example: the Linear Regression algorithm.
Other examples: Decision Tree, Neural Network
Model Selection?
• It is the process of selecting the optimal model from a set of candidate
models for the given data.
• Data used in learning:
– Training set (usually 60%)
– Validation set (20%): used for cross-validation, i.e. comparing
different models trained on the same training set
– Test data (20%): unseen data
• Model selection is finding the optimal model, which minimizes both bias
and variance.
• Here, bias is the error during training and variance is the error during
testing. A sketch of such a data split follows.
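A minimal sketch of the 60/20/20 split described above, assuming the data fits in a plain Python list (the function name and seed are illustrative):

```python
import random

def split_dataset(data, train=0.6, val=0.2, seed=42):
    """Shuffle and split data into train/validation/test subsets
    (60/20/20 by default, matching the proportions above)."""
    data = data[:]                       # copy so the caller's list is untouched
    random.Random(seed).shuffle(data)
    n_train = int(len(data) * train)
    n_val = int(len(data) * val)
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])

train_set, val_set, test_set = split_dataset(list(range(100)))
print(len(train_set), len(val_set), len(test_set))  # 60 20 20
```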
Inductive bias
• The set of assumptions we make to make learning possible is called the
inductive bias of the learning algorithm.
• Model selection is also about choosing the right inductive bias.
Generalization
• Generalization is how well a model trained on the training set predicts
the right output for new instances.
• The model with the best generalization should be selected. It should
avoid two problems:
– Underfitting
– Overfitting
Underfitting – overfitting – just-right model – generalization
• In underfitting, too simple a model is selected, so the bias (training
error) is very high.
• The variance (testing error) is also high.
• Overfitting is trying to fit each and every data sample:
• Bias is low.
• The model is complex.
• Variance is expected to be very high.
• A just-right model is close to all the data samples; because of this the
bias is minimal and the variance is also low. This is called
generalization.
Dimensions of a Supervised Machine Learning Algorithm
Consider that we have N samples and K classes.
Non-Parametric Estimation
• Statistical estimation:
– The population is the entire data, and a smaller set of data drawn from
the population is called a sample (e.g. election result analysis via
exit polls).
– Since it is usually very difficult to estimate over the entire
population, estimation is performed on a sample.
• Classification of statistical estimation:
– Parametric: assumes a distribution described by parameters such as the
mean and standard deviation (SD).
– Non-parametric: makes no such assumption and does not use the mean and
standard deviation.
Non-Parametric Methods: Histogram Estimator
• A histogram is a convenient way to describe data.
• To form a histogram, the data from a single class are grouped into
intervals (bins).
• Over each interval a rectangle is drawn, with height proportional to the
number of data points falling in that interval. In the example, each
interval has a width of two units.
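A minimal histogram density estimator sketch in Python, assuming bins of width h anchored at an origin. The sample values are illustrative, chosen to be consistent with the six-point example further below:

```python
import numpy as np

def histogram_estimator(data, x, origin=0.0, h=2.0):
    """Histogram density estimate: p(x) = (# samples in x's bin) / (N * h),
    with bins [origin + k*h, origin + (k+1)*h)."""
    data = np.asarray(data)
    k = np.floor((x - origin) / h)                 # index of the bin containing x
    in_bin = (np.floor((data - origin) / h) == k)  # samples falling in that bin
    return in_bin.sum() / (len(data) * h)

samples = [-3.0, -1.5, -0.5, 1.0, 4.2, 6.7]        # six illustrative points
print(round(histogram_estimator(samples, x=-3.0), 3))  # 1/(6*2) ≈ 0.083
```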
Kernel and Window Estimators
• A histogram is a good representation for discrete data: it shows a spike
for each bin.
• But it may not suit continuous data. In that case we use a kernel
(function) for each of the data points, and the total density is
estimated by the kernel density function.
• It is useful for applications like audio density estimation.
• A spiky, delta-function approximation to a continuous density is not
useful for decision making.
• Each delta function is therefore replaced by a kernel function, such as a
rectangle, a triangle or a normal density, scaled so that the combined
area equals one.
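A minimal kernel density estimation sketch in Python with a Gaussian (normal) kernel, assuming a bandwidth h; the sample values are the same illustrative points as above:

```python
import numpy as np

def kde(data, x, h=1.0):
    """Kernel density estimate with a Gaussian kernel:
    p(x) = 1/(N*h) * sum_t K((x - x^t)/h), where K is the standard normal pdf."""
    data = np.asarray(data)
    u = (x - data) / h
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)   # kernel value for each sample
    return K.sum() / (len(data) * h)

samples = [-3.0, -1.5, -0.5, 1.0, 4.2, 6.7]
xs = np.linspace(-8, 10, 5)
print([round(kde(samples, x), 3) for x in xs])     # smooth density estimates
```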
Kernel density function

Example with six data points and bin width h = 2. Bin counts: [−4, −2] → 1,
[−2, 0] → 2, [0, 2] → 1, [2, 4] → 0, [4, 6] → 1, [6, 8] → 1.

Height of a bar = count / (N·h); for the first bin, 1/(6×2) ≈ 0.083, and so on.
KERNEL DENSITY ESTIMATION
Similarity and Dissimilarity
Distances are used to measure similarity.

There are many ways to measure the distance between two instances.
Distance and similarity measures are essential in solving many pattern
recognition problems such as classification and clustering. Various
distance/similarity measures are available in the literature to compare two
data distributions.
As the name suggests, a similarity measure indicates how close two
distributions are.

For algorithms like k-nearest neighbors and k-means, it is essential to
measure the distance between the data points.
• In KNN we calculate the distance between points to find the nearest
neighbors.
• In k-means we find the distance between points to group data points into
clusters based on similarity.
• It is vital to choose the right distance measure, as it impacts the
results of our algorithm.
Euclidean Distance
• We are most likely to use Euclidean distance when calculating the distance between two rows
of data that have numerical values, such as floating-point or integer values.
• If columns have values with differing scales, it is common to normalize or standardize the
numerical values across all columns prior to calculating the Euclidean distance. Otherwise,
columns that have large values will dominate the distance measure.

$$\text{dist}(p, q) = \sqrt{\sum_{k=1}^{n} (p_k - q_k)^2}$$

• Where n is the number of dimensions (attributes) and pₖ and qₖ are, respectively, the kth
attributes (components) of data objects p and q.
• Euclidean distance is also known as the L2 norm of a vector.
Compute the Euclidean distance between the following data sets
• D1 = [10, 20, 15, 10, 5]
• D2 = [12, 24, 18, 8, 7]
Apply the Pythagorean theorem for the Euclidean distance; a worked sketch
follows.
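A worked sketch in Python (the squared differences are 4 + 16 + 9 + 4 + 4 = 37, so the distance is √37 ≈ 6.08):

```python
import math

D1 = [10, 20, 15, 10, 5]
D2 = [12, 24, 18, 8, 7]

# Squared differences: 4 + 16 + 9 + 4 + 4 = 37
dist = math.sqrt(sum((p - q) ** 2 for p, q in zip(D1, D2)))
print(round(dist, 2))  # 6.08
```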
Manhattan distance:
Manhattan distance is a metric in which the distance between two points is
the sum of the absolute differences of their Cartesian coordinates. Put
simply, it is the total of the differences between the x-coordinates and
the y-coordinates.
Formula: in a plane with p1 at (x1, y1) and p2 at (x2, y2),

$$d(p_1, p_2) = |x_1 - x_2| + |y_1 - y_2|$$

• The Manhattan distance is related to the L1 vector norm.
• In general, $\text{ManhattanDistance}(v_1, v_2) = \sum_{i=1}^{N} |v_1[i] - v_2[i]|$.
Compute the Manhattan distance for the following
• D1 = [10, 20, 15, 10, 5]
• D2 = [12, 24, 18, 8, 7]
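A worked sketch in Python (the absolute differences are 2 + 4 + 3 + 2 + 2 = 13):

```python
D1 = [10, 20, 15, 10, 5]
D2 = [12, 24, 18, 8, 7]

# Absolute differences: 2 + 4 + 3 + 2 + 2 = 13
dist = sum(abs(p - q) for p, q in zip(D1, D2))
print(dist)  # 13
```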
Manhattan distance:
• Also popularly called city-block distance.
• Euclidean distance is like flying distance.
• Manhattan distance is like travelling by car.
Minkowski Distance
• It calculates the distance between two real-valued vectors.
• It is a generalization of the Euclidean and Manhattan distance measures
and adds a parameter, called the “order” or “r”, that allows different
distance measures to be calculated.
• The Minkowski distance measure is calculated as follows:

$$\text{dist}(p, q) = \left( \sum_{k=1}^{n} |p_k - q_k|^r \right)^{1/r}$$

where r is a parameter, n is the number of dimensions (attributes) and pₖ
and qₖ are, respectively, the kth attributes (components) of data objects p
and q.
Minkowski is called a generalization of Manhattan and Euclidean:
• Manhattan distance is called the L1 norm, and
• Euclidean distance is called the L2 norm.
• Minkowski is called Lp, where p is typically 1 or 2 (see the sketch
below).
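A minimal Minkowski distance sketch in Python; with r = 1 and r = 2 it reproduces the Manhattan and Euclidean values for the points p1 = (0, 2) and p2 = (2, 0) used in the distance matrices further below:

```python
def minkowski(p, q, r=2):
    """Minkowski distance: (sum_k |p_k - q_k|^r)^(1/r).
    r=1 gives Manhattan (L1), r=2 gives Euclidean (L2)."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1 / r)

p, q = [0, 2], [2, 0]
print(minkowski(p, q, r=1))            # 4.0   (Manhattan)
print(round(minkowski(p, q, r=2), 3))  # 2.828 (Euclidean)
```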


Cosine Similarity
(widely used in recommendation systems and NLP)
– Let A and B be two document vectors.
– Cosine similarity ranges from −1 to +1:
– −1 indicates not at all close and +1 indicates very high similarity.
– In cosine similarity, data objects are treated as vectors.
– It is measured by the cosine of the angle between two vectors and
determines whether two vectors are pointing in roughly the same
direction. It is often used to measure document similarity in text
analysis.
– Cosine distance = 1 − cosine similarity.
cos(A, B) = 1: exactly the same
0: orthogonal
−1: exactly opposite
Formula for Cosine Similarity
• The cosine similarity between two vectors is based on the angle θ
between them.
• If θ = 0°, the x and y vectors overlap, showing that they are similar.
• If θ = 90°, the x and y vectors are dissimilar.
• Two points may lie on the same vector (the same direction from the
origin):
• In the example given, P1 and P2 are on the same vector, so the angle
between them is 0 and cos(0°) = 1, indicating high similarity.
• In this example, two points P1 and P2 are separated by 45 degrees, so
the cosine similarity is cos(45°) ≈ 0.71.
• In this example, P1 and P2 are separated by 90 degrees, so the cosine
similarity is cos(90°) = 0.
If P1 and P2 are on opposite sides
• If P1 and P2 are on opposite sides, the angle between them is 180
degrees, and cos(180°) = −1.
• At 270° the cosine is again 0, and at 360° (or 0°) it is 1.
Cosine Similarity
Advantages of Cosine Similarity
• Cosine similarity is beneficial because two similar data objects that
are far apart in Euclidean distance (because of their size) can still
have a small angle between them. The smaller the angle, the higher the
similarity.
• When plotted in a multi-dimensional space, cosine similarity captures
the orientation (the angle) of the data objects and not the magnitude.
Example 1 for computing cosine similarity
Consider an example to find the similarity between two vectors, x and y,
using cosine similarity (when the angle cannot be estimated directly):

x = {3, 2, 0, 5}
y = {1, 0, 0, 0}

The formula for the cosine similarity is cos(x, y) = (x · y) / (‖x‖ ‖y‖).

x · y = 3×1 + 2×0 + 0×0 + 5×0 = 3

‖x‖ = √(3² + 2² + 0² + 5²) = √38 ≈ 6.16
‖y‖ = √(1² + 0² + 0² + 0²) = 1

∴ cos(x, y) = 3 / (6.16 × 1) ≈ 0.49
Example 2 for computing cosine distance
d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]; d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]

d1 · d2 = 3×1 + 2×0 + 0×0 + 5×0 + 0×0 + 0×0 + 0×0 + 2×1 + 0×0 + 0×2 = 5

‖d1‖ = (3² + 2² + 0² + 5² + 0² + 0² + 0² + 2² + 0² + 0²)^0.5 = (42)^0.5 ≈ 6.481
(square root of the sum of squares of all the elements)
‖d2‖ = (1² + 0² + 0² + 0² + 0² + 0² + 0² + 1² + 0² + 2²)^0.5 = (6)^0.5 ≈ 2.449

So cosine similarity = cos(d1, d2) = (d1 · d2) / (‖d1‖ ‖d2‖)
= 5 / (6.481 × 2.449) ≈ 0.3150

Cosine distance (or dissimilarity) = 1 − cos(d1, d2) = 1 − 0.3150 = 0.6850
Find the cosine distance between
D1 = [5, 3, 8, 1, 9, 6, 0, 4, 2, 1] and D2 = [1, 0, 3, 6, 4, 5, 2, 0, 0, 1]
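A minimal cosine similarity sketch in Python applied to this exercise (the printed values are rounded):

```python
import math

def cosine_similarity(a, b):
    """cos(a, b) = (a . b) / (||a|| * ||b||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

D1 = [5, 3, 8, 1, 9, 6, 0, 4, 2, 1]
D2 = [1, 0, 3, 6, 4, 5, 2, 0, 0, 1]
sim = cosine_similarity(D1, D2)
print(round(sim, 2), round(1 - sim, 2))  # 0.69 0.31 (similarity, distance)
```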
When to use Cosine Similarity
• Cosine similarity looks at the angle between two vectors, Euclidean
distance at the distance between two points. Hence cosine similarity is
very popular for NLP applications.
• Say you are in an e-commerce setting and you want to compare users for
product recommendations:
• User 1 bought 1x eggs, 1x flour and 1x sugar.
• User 2 bought 100x eggs, 100x flour and 100x sugar.
• User 3 bought 1x eggs, 1x vodka and 1x Red Bull.
• By cosine similarity, user 1 and user 2 are more similar. By Euclidean
distance, user 3 is more similar to user 1.
JACCARD SIMILARITY AND DISTANCE:
In Jaccard similarity, instead of vectors we use sets.
It is used to find the similarity between two sets.
Jaccard similarity is defined as the size of the intersection of the sets
divided by the size of their union (a count):

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

A simple example using set notation: how similar are these two sets?
A = {0, 1, 2, 5, 6}
B = {0, 2, 3, 4, 5, 7, 9}
J(A, B) = |{0, 2, 5}| / |{0, 1, 2, 3, 4, 5, 6, 7, 9}| = 3/9 ≈ 0.33
Jaccard similarity is given by overlapping vs. total items.
• The Jaccard similarity value ranges from 0 to 1:
• 1 indicates the highest similarity,
• 0 indicates no similarity.
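A minimal Jaccard similarity sketch in Python, reproducing the set example above:

```python
def jaccard_similarity(a, b):
    """|A ∩ B| / |A ∪ B| for two sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

A = {0, 1, 2, 5, 6}
B = {0, 2, 3, 4, 5, 7, 9}
print(round(jaccard_similarity(A, B), 2))  # 3/9 ≈ 0.33
```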
Application of Jaccard Similarity
• Language processing is one example where Jaccard similarity is used.
• In this example it is 4/12 ≈ 0.33.
Jaccard similarity is also popularly used for ML model performance
analysis:
• In this example, a table is built of actual vs. predicted labels.
This gives an idea of how our algorithm is working.
• The example shows the overlapping positives vs. the total positives,
including actual and predicted.
Common Properties of a Distance
• Distances, such as the Euclidean distance, have some well-known
properties:
1. d(p, q) ≥ 0 for all p and q, and d(p, q) = 0 only if
p = q. (Positive definiteness)
2. d(p, q) = d(q, p) for all p and q. (Symmetry)
3. d(p, r) ≤ d(p, q) + d(q, r) for all points p, q and r.
(Triangle inequality)
where d(p, q) is the distance (dissimilarity) between points (data
objects) p and q.
• A distance that satisfies these properties is a metric, and a space
equipped with such a metric is called a metric space.
Distance Metrics Continued
• Dist(x, y) ≥ 0
• Dist(x, y) = Dist(y, x) (symmetric)
• Detours cannot shorten distance:
Dist(x, z) ≤ Dist(x, y) + Dist(y, z)
Euclidean Distance

point  x  y
p1     0  2
p2     2  0
p3     3  1
p4     5  1

Distance matrix:

      p1     p2     p3     p4
p1    0      2.828  3.162  5.099
p2    2.828  0      1.414  3.162
p3    3.162  1.414  0      2
p4    5.099  3.162  2      0
Minkowski Distance

point  x  y
p1     0  2
p2     2  0
p3     3  1
p4     5  1

L1 (Manhattan) distance matrix:

      p1  p2  p3  p4
p1    0   4   4   6
p2    4   0   2   4
p3    4   2   0   2
p4    6   4   2   0

L2 (Euclidean) distance matrix:

      p1     p2     p3     p4
p1    0      2.828  3.162  5.099
p2    2.828  0      1.414  3.162
p3    3.162  1.414  0      2
p4    5.099  3.162  2      0

L∞ (r → ∞) distance matrix:

      p1  p2  p3  p4
p1    0   2   3   5
p2    2   0   1   3
p3    3   1   0   2
p4    5   3   2   0
Summary of Distance Metrics
• Manhattan distance: $|x_1 - x_2| + |y_1 - y_2|$
• Euclidean distance: $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
Nearest Neighbors Classifiers
• Basic idea:
– If it walks like a duck and quacks like a duck, then it’s probably a
duck.
• Given a test record, compute its distance to all training records and
choose the k “nearest” records.
K-Nearest Neighbors (KNN): ML algorithm
• Simple, but a very powerful classification algorithm
• Classifies based on a similarity measure
• This algorithm does not build a model
• Does not “learn” until a test example is submitted for classification
• Whenever we have new data to classify, we find its k nearest neighbors
in the training data
• Classification is by “majority vote” of the neighbors’ classes
• The example is assigned to the most common class among its k nearest
neighbors (by measuring the distance between data points)
• In practice, k is usually chosen to be odd, so as to avoid ties
• The k = 1 rule is generally called the “nearest-neighbor classification”
rule
K-Nearest Neighbors (KNN)
• K-Nearest Neighbors is one of the simplest machine learning algorithms,
based on the supervised learning technique.
• The K-NN algorithm assumes similarity between the new case/data point
and the available cases, and puts the new case into the category that is
most similar to the available categories.
• The K-NN algorithm can be used for regression as well as for
classification, but mostly it is used for classification problems.
• K-NN is a non-parametric algorithm, which means it does not make any
assumption about the underlying data.
• The K-NN algorithm stores all the available data and classifies a new
data point based on similarity. This means that when new data appears,
it can easily be classified into a well-suited category using K-NN.
• It is also called a lazy learner algorithm because it does not learn
from the training set immediately; instead it stores the dataset and, at
classification time, performs an action on the dataset.
• At the training phase the KNN algorithm just stores the dataset, and
when it gets new data it classifies that data into a category that is
most similar to the new data.
Illustrative Example for KNN
Data collected over the past few years (training data).
With K = 1, the class of the test point is that of its single nearest
neighbor: here it belongs to the class Africa.
With K = 3, two of the three nearest neighbors are close to North/South
America, so the new (test) data point belongs to that class.
In this case K = 3 is still not an appropriate value to classify with,
hence a new value of K should be selected.
Algorithm
• Step 1: Select the number K of neighbors.
• Step 2: Calculate the Euclidean distance to all the training data
points.
• Step 3: Take the K nearest neighbors as per the calculated Euclidean
distance.
• Step 4: Among these K neighbors, apply a voting algorithm.
• Step 5: Assign the new data point to the category for which the number
of neighbors is maximum.
• Step 6: Our model is ready. (A minimal sketch of these steps follows.)
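A minimal Python sketch of these steps, assuming Euclidean distance and majority voting; the function name is illustrative, and the training pairs anticipate the pharmaceutical example that follows:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training
    points (Euclidean distance). `train` is a list of
    (feature_vector, label) pairs."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Pharmaceutical example from the next slide
train = [((7, 7), "BAD"), ((7, 4), "BAD"), ((3, 4), "GOOD"), ((1, 4), "GOOD")]
print(knn_classify(train, (3, 7), k=3))  # GOOD (nearest: P3, P4, P1)
```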
Consider the following data set of a pharmaceutical company with assigned
class labels. Using the K-nearest-neighbour method, classify a new unknown
sample using k = 3 and k = 2.

Points  X1 (Acid Durability)  X2 (Strength)  Y (Classification)
P1      7                     7              BAD
P2      7                     4              BAD
P3      3                     4              GOOD
P4      1                     4              GOOD
P5      3                     7              ?

The new pattern has X1 = 3 and X2 = 7. Identify the class.
KNN

Euclidean distance of P5 (3,7) from each training point:

P1 (7,7): √((7−3)² + (7−7)²) = √16 = 4       Class: BAD
P2 (7,4): √((7−3)² + (4−7)²) = √25 = 5       Class: BAD
P3 (3,4): √((3−3)² + (4−7)²) = √9 = 3        Class: GOOD
P4 (1,4): √((1−3)² + (4−7)²) = √13 ≈ 3.61    Class: GOOD

With k = 3, the nearest neighbors are P3 (GOOD), P4 (GOOD) and P1 (BAD),
so by majority vote P5 is classified GOOD. With k = 2, the two nearest are
P3 and P4, both GOOD, so P5 is again classified GOOD.
A new customer named 'Mary' has height 161 cm and weight 61 kg. Suggest
the T-shirt size with K = 3 and K = 5, using Euclidean distance and also
Manhattan distance. (A sketch reusing the KNN function above follows.)

Height (in cm)  Weight (in kg)  T-Shirt Size
158             58              M
158             59              M
158             63              M
160             59              M
160             60              M
163             60              M
163             61              M
160             64              L
163             64              L
165             61              L
165             62              L
165             65              L
168             62              L
168             63              L
168             66              L
170             63              L
170             64              L
170             68              L
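A usage sketch for Mary's case, reusing the knn_classify function from the earlier sketch (Euclidean distance; a Manhattan variant would swap in a different distance function):

```python
# Requires knn_classify from the earlier sketch.
heights = [158, 158, 158, 160, 160, 163, 163, 160, 163, 165,
           165, 165, 168, 168, 168, 170, 170, 170]
weights = [58, 59, 63, 59, 60, 60, 61, 64, 64, 61,
           62, 65, 62, 63, 66, 63, 64, 68]
sizes   = ["M", "M", "M", "M", "M", "M", "M", "L", "L", "L",
           "L", "L", "L", "L", "L", "L", "L", "L"]

train = list(zip(zip(heights, weights), sizes))
print(knn_classify(train, (161, 61), k=3))  # M (three nearest are all M)
print(knn_classify(train, (161, 61), k=5))  # M (four of five nearest are M)
```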
There is a car manufacturer that has produced a new SUV. The company wants
to show ads to the users who are interested in buying that SUV. For this
problem we have a dataset that contains multiple users' information from a
social network. The dataset contains a lot of information, but we will
take Estimated Salary and Age as the independent variables and Purchased
as the dependent variable. The dataset is as shown in the table. Using
K = 5, classify the new sample.
• There is no particular way to determine the best value for K, so we need
to try several values to find the best of them. The most preferred value
for K is 5.
• A very low value of K, such as K = 1 or K = 2, can be noisy and subject
to the effects of outliers in the model.
• Large values of K are good, but finding the best one may be difficult.

• Advantages of the KNN algorithm:
• It is simple to implement.
• It is robust to noisy training data.
• It can be more effective if the training data is large.

• Disadvantages of the KNN algorithm:
• The value of K always needs to be determined, which may sometimes be
complex.
• The computation cost is high because the distance to all the training
samples must be calculated.
Another example: solve
• Because the distance function used to find the k nearest neighbors is
not linear, KNN usually does not lead to a linear decision boundary.
End of Unit - 2
