ML Intro Numericals

Uploaded by

sahanawaz9199
Copyright
© All Rights Reserved

Getting Started with Machine Learning: Numericals
Gir House Help Resource

May Indra and Agni be favourable!


Credits :

Prepared by Vighnesh Mishra, Secretary of Gir House (September 23 - August 24)


Edited by Niraj Kumar (LinkedIn)

I would like to acknowledge the contribution of Acegrade.in for the PYQs.

Preface :

I do not like Mathematics. I was a humanities student in classes 11th and 12th and really
hated Maths in classes 9th and 10th. I do not like complex symbols. I like simple addition
and subtraction and good marks and good grades. If you're like me, this is for you.
Contents:

Slide 4: Random Things you should know


Slide 5: Norm of a Vector
Slides 6-7: Loss — Concept and Explanation
Slide 8: Revision of Theory
Slide 9: Loss Formula at a glance
Slides 10-12: Regression – Solved Example
Slide 13: Regression – Do it Yourself
Slides 14-16: Dimensionality Reduction– Solved Example
Slide 17: Dimensionality Reduction – Do it Yourself
Slides 18-25: Classification – Solved Example
Slide 26: Classification – Do it Yourself
Slide 27: Endnote
Random stuff you should know that I didn’t

1. Vector means some numbers written between brackets like [1, 2, 3] or [4, 3]
or [-3, 4, 5, -34] or (3, 7)
2. Rˣ means a vector with x number of items.
Example: [2, 34, 17, -84] is in R⁴
3. Σ exists only to scare you. You should add everything when you see the Σ
thing.
4. For some weird reason, they call numbers scalars. Hence, a vector is
technically scalars within brackets separated by commas.
Example: in the vector (2, 6), the numbers 2 and 6 are scalars.
5. Norm of a vector is a scalar. Next page tells you how to calculate norm.
Calculating Norm of a Vector

Let [3, 4, -2] be a vector named v. Then the norm of v is: (Remember Python variable assignment?)

||v|| = √(3² + 4² + (-2)²) = √(9 + 16 + 4) = √29 ≈ 5.39

[use calculator to calculate square root.]

Q1. Calculate ||(2, 4)|| as an exercise.


Note: Whenever you see any vector between two || thingies, it means you have to calculate the Euclidean
Norm, in this course.
Please use ChatGPT to find out more about it.
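
If you prefer Python to a calculator, here is a minimal sketch of the same norm calculation. It uses the components from the calculation above (3, 4 and -2). The function name euclidean_norm is made up for this example and is not from any library.

```python
import math

def euclidean_norm(vector):
    """Square every component, add the squares, take the square root."""
    return math.sqrt(sum(component ** 2 for component in vector))

v = [3, 4, -2]            # same components as in the worked calculation
norm_v = euclidean_norm(v)

print(len(v))             # number of items in v, so v is in R^3
print(norm_v)             # square root of 29, roughly 5.385
```

The result is a single number (a scalar), which matches point 5 on the previous slide.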
Concept of Loss:

Imagine playing a game where you guess how many candies are in a jar. You
lose if your guess is too high or too low. In machine learning, "loss" is a score that
tells us how wrong our guess (or prediction) was. The higher the score, the more
the loss.

In machine learning, we teach computers to make predictions, like predicting the
temperature tomorrow. The "loss" shows how far off our computer's guess was
from the actual temperature. If the loss is small, our guess is close; if
it's large, our guess is far off. The goal is to make the loss as small as possible by
improving the computer's prediction capability.
Technical Explanation of Loss:

❖ Loss in machine learning is a measure of how well or poorly a model’s
predictions match the actual data. It is a crucial component in training
models.
❖ The loss function quantifies the difference between the predicted values and
the actual values (also known as the ground truth).
❖ Common loss functions include Mean Squared Error (MSE) for regression
tasks and Cross-Entropy Loss for classification tasks.
❖ The process of training a model involves minimizing this loss function,
typically using optimization algorithms like gradient descent. Reducing the
loss helps improve the accuracy and reliability of the model’s predictions.

*Feel free to skip if this is too complex


Revise!

Machine Learning
❖ Supervised Learning: Regression, Classification
❖ Unsupervised Learning: Dimensionality Reduction, Density Estimation
Formula of Loss in Different Algorithms:

No. | Name | Used for
1 | Negative Log Likelihood | Density Estimation
2 | Reconstruction Error | Dimensionality Reduction
3 | Mean or Average Square Error | Regression
4 | Misclassification Error | Classification
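
As a rough sketch, the four losses can be written out as below. The last three follow the step-by-step explanations given later in this resource. The first is the usual textbook form of negative log likelihood and is an assumption here, because this resource never works through it. The symbols are also assumptions for illustration: n is the number of data points, P is the probability the model gives to a data point, f is the model (or the encoder), g is the decoder, and in the last line f(x_i) means the predicted sign.

```latex
\begin{align*}
\text{1. Negative log likelihood:} \quad
  & \frac{1}{n}\sum_{i=1}^{n} -\log P(x_i) \\
\text{2. Reconstruction error:} \quad
  & \frac{1}{n}\sum_{i=1}^{n} \lVert g(f(x_i)) - x_i \rVert^{2} \\
\text{3. Mean squared error:} \quad
  & \frac{1}{n}\sum_{i=1}^{n} \bigl(f(x_i) - y_i\bigr)^{2} \\
\text{4. Misclassification error:} \quad
  & \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\bigl(f(x_i) \neq y_i\bigr)
\end{align*}
```

Every formula has the same shape: do something to each data point, add everything up, and divide by n.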
Loss Algorithm: Regression

Explanation:
1. You will be given a list of X and Y value pairs.
2. Both X and Y will be a list of scalars.
3. f(x): X will have to undergo some formula or function.
4. Subtract corresponding Y from f(x).
5. Calculate square of the result.
6. Add all values obtained from the above process.
7. Divide by the total number of X-Y value pairs. That is the loss.
Solved Example: Regression

x1 | x2 | h(x1, x2) = (x1+x2)/2 | Y | h(x1, x2) - Y | Squaring
1 | 2 | (1+2)/2 = 3/2 = 1.5 | 2 | 1.5-2 = -0.5 | (-0.5)² = 0.25
2 | 3 | (2+3)/2 = 5/2 = 2.5 | 3 | 2.5-3 = -0.5 | (-0.5)² = 0.25
3 | 4 | (3+4)/2 = 7/2 = 3.5 | 4 | 3.5-4 = -0.5 | (-0.5)² = 0.25
4 | 5 | (4+5)/2 = 9/2 = 4.5 | 5 | 4.5-5 = -0.5 | (-0.5)² = 0.25

Sum of squares = 1.00
Loss = 1.00 / 4 = 0.25 (Answer)
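
The regression steps can also be checked with a few lines of Python. This is only a sketch of the solved example above: the function h and the four rows come from the table, while the variable names are made up for illustration.

```python
def h(x1, x2):
    """Model from the solved example: the average of x1 and x2."""
    return (x1 + x2) / 2

# Each row of the table as (x1, x2, Y)
data = [(1, 2, 2), (2, 3, 3), (3, 4, 4), (4, 5, 5)]

# Steps 3 to 5: apply the function, subtract Y, square the result
squared_errors = [(h(x1, x2) - y) ** 2 for x1, x2, y in data]

# Steps 6 and 7: add everything and divide by the number of pairs
loss = sum(squared_errors) / len(data)

print(squared_errors)  # [0.25, 0.25, 0.25, 0.25]
print(loss)            # 0.25
```

To try a different model, change only the body of h and keep the rest as it is.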


Do it Yourself (Regression)
Find the best fit line for the dataset given below using the least squares method.
(September, 2023)

x | 1 | 3 | 4 | 5 | 7
y | -3 | 7 | 1 | 0 | 4

Options:
1. ŷ = 0.7x - 1
2. ŷ = x + 0.5
3. ŷ = x + 0.2
4. ŷ = 0.2x + 1

Hint: The best fit line has the least loss.
P.S.: This hint won't be there in the Quiz.
Loss Algorithm: Dimensionality Reduction

Explanation:
1. You will be given X, an encoder function and a decoder function.
2. X will be a list of vectors.
3. g(f(X)): First apply the encoder function on X and then the decoder function.
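4. Subtract g(f(X)) from each original X vector. (Subtracting the other way round gives the same loss, because the result is squared later.)
5. Calculate norm of each result. [Refer slide 5]
6. Square each norm.
7. Add all the squared norms and divide the sum by the number of vectors in X. That is the Loss.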
Solved Example: Dimensionality Reduction

Check slide 5 if you forgot how to calculate the Norm of a Vector.

Here u = f(xi) = (x1+x2+x3)/2 and g(f(xi)) = [u, u, u].

xi | u = f(xi) | g(f(xi)) | xi - g(f(xi)) | Norm² or ||xi-g(f(xi))||²
[1, 2, 3] | (1+2+3)/2 = 6/2 = 3 | [3, 3, 3] | [1,2,3] - [3,3,3] = [-2,-1,0] | ||[-2,-1,0]||² = 5
[2, 4, 6] | (2+4+6)/2 = 12/2 = 6 | [6, 6, 6] | [2,4,6] - [6,6,6] = [-4,-2,0] | ||[-4,-2,0]||² = 20
[3, 6, 9] | (3+6+9)/2 = 18/2 = 9 | [9, 9, 9] | [3,6,9] - [9,9,9] = [-6,-3,0] | ||[-6,-3,0]||² = 45
[4, 8, 12] | (4+8+12)/2 = 24/2 = 12 | [12, 12, 12] | [4,8,12] - [12,12,12] = [-8,-4,0] | ||[-8,-4,0]||² = 80


Solved Example: Dimensionality Reduction (Contd.)

The number of vectors in this dataset is 4, so n = 4 and i runs from 1 to 4.

Loss = (sum of ||xi-g(f(xi))||²) / n = (5 + 20 + 45 + 80) / 4 = 150 / 4 = use calculator to get the answer please 🥰
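
If you would rather let Python be the calculator, here is a minimal sketch of the same solved example. The encoder and decoder follow the table above. The names encoder, decoder and squared_norm are made up for this example.

```python
def encoder(x):
    """f from the solved example: (x1 + x2 + x3) / 2, a single number u."""
    return sum(x) / 2

def decoder(u):
    """g from the solved example: repeat u three times."""
    return [u, u, u]

def squared_norm(vector):
    """Norm squared: just add the squares, no square root needed."""
    return sum(component ** 2 for component in vector)

X = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [4, 8, 12]]

squared_norms = []
for x in X:
    reconstruction = decoder(encoder(x))
    difference = [a - b for a, b in zip(x, reconstruction)]
    squared_norms.append(squared_norm(difference))

# Add all squared norms and divide by the number of vectors
loss = sum(squared_norms) / len(X)

print(squared_norms)  # [5.0, 20.0, 45.0, 80.0]
print(loss)           # 37.5
```

Since the norm gets squared anyway, the code skips the square root and adds the squares directly.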
Do it Yourself (Dimensionality Reduction)

Give the reconstruction error for this encoder decoder pair. The reconstruction
error is the mean of the squared distance between the reconstructed input and
input.

May, 2022
Basic Explanation: Classification

In my opinion, the calculation of Loss in Classification is the easiest to solve. However,
before even looking at the formula, you should know how the 1(statement) = 1 or 0 notation
works. Look at the explanation below:

1 (true statement) = 1
1 (false statement) = 0

Notice how any statement in between the 1 followed by brackets gives out 1 if true and 0 if false.

Examples:
1 (India is in Asia) = 1
1 (India is in Africa) = 0
1 (2 is even) = 1
1 (10%3=0) = 0
1 (Gir is the Best) = 1
1 (2 ∈ (2,3,4)) = 1

∈ symbol is the same as ‘belongs to’. The statement ‘Parrot ∈ the set of birds’ is the same as
the statement ‘Parrot belongs to the set of birds’.
On the other hand, ∉ notation means ‘does not belong to’.
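
The 1(statement) notation behaves like a tiny Python function. This is only a sketch. The name indicator is an assumption made for this example, and Python's `in` and `not in` stand in for ∈ and ∉.

```python
def indicator(statement):
    """1(statement): gives 1 if the statement is true, 0 if it is false."""
    return 1 if statement else 0

print(indicator(2 % 2 == 0))          # 2 is even, so 1
print(indicator(10 % 3 == 0))         # 10 % 3 is 1, not 0, so 0
print(indicator(2 in (2, 3, 4)))      # 2 belongs to (2, 3, 4), so 1
print(indicator(5 not in (2, 3, 4)))  # 5 does not belong to (2, 3, 4), so 1
```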
Loss Algorithm: Classification

Explanation:
1. You will be given a list of X and Y value pairs. Y will be a list of Signs (+1 or -1).
2. X list will have vectors (usually in R² like (3,2)).
3. f(x): All X values will have to undergo some formula or function.
4. Sign of f(x): If the result of f(x) is zero or positive, write +1. Else, write -1.
5. If Y and the Sign of f(x) are equal, write zero. If they are not equal, write one.
6. At the end, add all ones (and zeroes lol)
7. Divide the sum by number of X and Y pairs. That is the loss.
Solved Example: Classification
Which among the following models has the least misclassification error? [September 2023, Quiz 1]

X | Y
[0,1] | +1
[0,2] | +1
[1,3] | +1
[3,4] | -1
[4,5] | -1
[7,6] | -1

Options:
1. Sign(x1 − 2)
2. Sign(x2 + 2)
3. Sign(x1 + x2)
4. Sign(x2 − x1)

Note:
1) Misclassification Error is the same thing as Loss.
2) Sign (value) works in a similar way to 1 (statement): it gives +1 if the value is zero or positive and -1 if the value is negative.
3) This question could also have been rephrased as ‘which of the following models is the best?’
Solved Example: Classification (Contd.)
Option 1: Sign(x1 − 2)

In X = [x1, x2], the first number is x1 and the second number is x2.

X | Y | f(x) = x1 − 2 | Sign | Error
[0,1] | +1 | 0-2 = -2 | -1 | 1
[0,2] | +1 | 0-2 = -2 | -1 | 1
[1,3] | +1 | 1-2 = -1 | -1 | 1
[3,4] | -1 | 3-2 = 1 | +1 | 1
[4,5] | -1 | 4-2 = 2 | +1 | 1
[7,6] | -1 | 7-2 = 5 | +1 | 1
Total errors = 6

Step 1: First, we have to calculate f(x) for all vectors in the X column. The formula is (x1 - 2). This means that if
X is [0,1], we will subtract 2 from 0.
Step 2: Then we have to take the Sign value. If f(x) is more than or equal to zero ( f(x) >= 0), we will write +1
else, we will write -1 in the Sign column.
Step 3: Compare Sign with Y. If Sign=Y, write 0. Else, if Sign≠Y, write 1 in the error column.
Step 4: Add all 1s and 0s from step 3 and then divide the sum by the number of X-Y pairs.

Loss: 6/6 = 1
Solved Example: Classification (Contd.)
Option 2: Sign(x2 + 2)

X | Y | f(x) = x2 + 2 | Sign | Error
[0,1] | +1 | 1+2 = 3 | +1 | 0
[0,2] | +1 | 2+2 = 4 | +1 | 0
[1,3] | +1 | 3+2 = 5 | +1 | 0
[3,4] | -1 | 4+2 = 6 | +1 | 1
[4,5] | -1 | 5+2 = 7 | +1 | 1
[7,6] | -1 | 6+2 = 8 | +1 | 1
Total errors = 3

Step 1: First, we have to calculate f(x) for all vectors in the X column. The formula is (x2 + 2). This means that if
X is [0,1], we will add 2 to 1.
Step 2: Then we have to take the Sign value. If f(x) is more than or equal to zero ( f(x) >= 0), we will write +1
else, we will write -1 in the Sign column.
Step 3: Compare Sign with Y. If Sign=Y, write 0. Else, if Sign≠Y, write 1 in the error column.
Step 4: Add all 1s and 0s from step 3 and then divide the sum by the number of X-Y pairs.

Loss: 3/6 = 0.5
Solved Example: Classification (Contd.)
Option 3: Sign(x1 + x2)

X | Y | f(x) = x1 + x2 | Sign | Error
[0,1] | +1 | 0+1 = 1 | +1 | 0
[0,2] | +1 | 0+2 = 2 | +1 | 0
[1,3] | +1 | 1+3 = 4 | +1 | 0
[3,4] | -1 | 3+4 = 7 | +1 | 1
[4,5] | -1 | 4+5 = 9 | +1 | 1
[7,6] | -1 | 7+6 = 13 | +1 | 1
Total errors = 3

Step 1: First, we have to calculate f(x) for all vectors in the X column. The formula is (x1 + x2). This means that if
X is [0,1], we will add 0 and 1.
Step 2: Then we have to take the Sign value. If f(x) is more than or equal to zero ( f(x) >= 0), we will write +1
else, we will write -1 in the Sign column.
Step 3: Compare Sign with Y. If Sign=Y, write 0. Else, if Sign≠Y, write 1 in the error column.
Step 4: Add all 1s and 0s from step 3 and then divide the sum by the number of X-Y pairs.

Loss: 3/6 = 0.5
Solved Example: Classification (Contd.)
Option 4: Sign(x2 - x1)

X | Y | f(x) = x2 - x1 | Sign | Error
[0,1] | +1 | 1-0 = 1 | +1 | 0
[0,2] | +1 | 2-0 = 2 | +1 | 0
[1,3] | +1 | 3-1 = 2 | +1 | 0
[3,4] | -1 | 4-3 = 1 | +1 | 1
[4,5] | -1 | 5-4 = 1 | +1 | 1
[7,6] | -1 | 6-7 = -1 | -1 | 0
Total errors = 2

Step 1: First, we have to calculate f(x) for all vectors in the X column. The formula is (x2 - x1). This means that if
X is [0,1], we will subtract 0 from 1.
Step 2: Then we have to take the Sign value. If f(x) is more than or equal to zero ( f(x) >= 0), we will write +1
else, we will write -1 in the Sign column.
Step 3: Compare Sign with Y. If Sign=Y, write 0. Else, if Sign≠Y, write 1 in the error column.
Step 4: Add all 1s and 0s from step 3 and then divide the sum by the number of X-Y pairs.

Loss: 2/6 = 0.33
Solved Example: Classification (Contd.)
Let us see which option gives us the lowest loss/misclassification error:
1. 1.0
2. 0.5
3. 0.5
4. 0.33

Obviously Option 4 has the lowest value of Loss and is the best among the given models and the correct
answer.
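
The four tables can be double-checked with a short Python sketch. The data and the four models come from the question above. The helper names sign and misclassification_error are made up for this example.

```python
def sign(value):
    """+1 if the value is zero or positive, otherwise -1."""
    return 1 if value >= 0 else -1

X = [(0, 1), (0, 2), (1, 3), (3, 4), (4, 5), (7, 6)]
Y = [1, 1, 1, -1, -1, -1]

# f(x) for each option, before taking the sign
models = {
    "Option 1: Sign(x1 - 2)": lambda x1, x2: x1 - 2,
    "Option 2: Sign(x2 + 2)": lambda x1, x2: x2 + 2,
    "Option 3: Sign(x1 + x2)": lambda x1, x2: x1 + x2,
    "Option 4: Sign(x2 - x1)": lambda x1, x2: x2 - x1,
}

def misclassification_error(f):
    """Write 1 for every wrong sign, 0 otherwise, then take the average."""
    errors = [1 if sign(f(x1, x2)) != y else 0 for (x1, x2), y in zip(X, Y)]
    return sum(errors) / len(errors)

losses = {name: misclassification_error(f) for name, f in models.items()}
for name, loss in losses.items():
    print(name, round(loss, 2))

# The best model is the one with the smallest loss
best = min(losses, key=losses.get)
print(best)
```

Picking the smallest value with min is the same comparison done by eye in the list above.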

Do it Yourself (Classification)

The question on the left was asked in January 23, Quiz 1. Try solving it on your own. Use paper and pen, my lazy
NOTE :

This concludes week 1.


I won't add anything on Density Estimation because they never ask us to calculate
loss of that.

Sometimes, you might be given a very different loss function to solve the questions.
Pay attention.

Please write to 22f3001240@ds.study.iitm.ac.in for any suggestions / feedback or if
you notice any errors.
