Midterm ML Makeup Solutions (Updated)

(EC-2 Makeup)

Q.1 Consider the following dataset with 4 records. [4+2 = 6 Marks]

Input X    Output Y
1          exp(2)
2          exp(4)
3          exp(6.3)
4          exp(9.2)

Assume the output has the form y = e^(α·x). Using linear regression,


(a) Find the best value of α.
(b) Find the optimal total sum of squared errors.
Solution:
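(A worked sketch, assuming the intended approach is to linearize by taking logs, so that ln y = α·x becomes a regression through the origin.)

(a) Minimizing Σᵢ (ln yᵢ − α·xᵢ)² gives:
α = (Σᵢ xᵢ·ln yᵢ) / (Σᵢ xᵢ²) = (1·2 + 2·4 + 3·6.3 + 4·9.2) / (1 + 4 + 9 + 16) = 65.7 / 30 = 2.19

(b) The optimal total sum of squared errors (in log space, with residuals ln yᵢ − α·xᵢ):
SSE = (2 − 2.19)² + (4 − 4.38)² + (6.3 − 6.57)² + (9.2 − 8.76)²
    = 0.0361 + 0.1444 + 0.0729 + 0.1936 ≈ 0.447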
Q.2 Consider inputs xᵢ, which are real-valued attributes, and outputs yᵢ, which are real-valued and of the form yᵢ = f(xᵢ) + eᵢ, where f(xᵢ) is the true function and eᵢ is a random variable representing Laplacian noise with PDF given by

f(yᵢ | μ) = (1/(2b)) · e^(−|yᵢ − μ| / b)

Implementing a linear regression model of the form h(xᵢ) = Σ_{j=0}^{n} θⱼ·xᵢⱼ, with μ = h(xᵢ), find the maximum likelihood estimator of θ. Comment on the loss function. [4+1 Marks]
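Solution (a sketch of the derivation, assuming i.i.d. noise and the Laplacian PDF above):
The likelihood of the data is L(θ) = Πᵢ (1/(2b)) · e^(−|yᵢ − h(xᵢ)| / b), so the log-likelihood is
ℓ(θ) = −n·ln(2b) − (1/b) · Σᵢ |yᵢ − h(xᵢ)|
Maximizing ℓ(θ) is therefore equivalent to minimizing Σᵢ |yᵢ − h(xᵢ)|, i.e.
θ_ML = argmin_θ Σᵢ |yᵢ − h(xᵢ)|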

Comment on loss function: under Laplacian noise the maximum likelihood hypothesis minimizes the absolute error rather than the squared error, so MAE (not MSE) is the appropriate loss function.
Q.3 Consider a result prediction system where a student's effort is encoded as the percent of time the student has spent studying out of the total available time.
● The input X has just one feature, representing the student's effort, which takes only four discrete values (25%, 50%, 75%, and 100%).
● The output Y has 3 classes (First class, Second class, Fail).
● The priors for each class are: P(Y = Fail) = 0.5, P(Y = Second class) = 0.3,
and P(Y = First class) = 0.2.
● Based on past data, the estimated class-conditional probabilities
P(X|Y) are shown in the following table.
Student's efforts   P(X|Y=Fail)   P(X|Y=Second class)   P(X|Y=First class)
25%                 0.7           0.4                   0.1
50%                 0.2           0.3                   0.1
75%                 0.1           0.2                   0.3
100%                0             0.1                   0.7

Consider the following loss function λ(ŷ, y), where ŷ is the predicted class label and y is the true class label:

λ(ŷ, y) = 0  if ŷ = y
λ(ŷ, y) = 1  if ŷ = Fail and ŷ ≠ y
λ(ŷ, y) = 2  if ŷ = Second class and ŷ ≠ y
λ(ŷ, y) = 4  if ŷ = First class and ŷ ≠ y
Consider the modified Naïve Bayes hypothesis function:

Ŷ ← argmax_{yk} λ(ŷ = yk, y) · P(Y = yk) · Πᵢ P(Xᵢ | Y = yk)

where λ(ŷ = yk, y) is taken as the cost of predicting class yk incorrectly (1, 2, or 4 as defined above).

Use this modified hypothesis function to classify each of the examples in the given
table. [5 Marks]
Solution:

Step 1: compute P(X|Y)·P(Y) for each class (proportional to the posterior P(Y|X)):

X      Fail    Second class   First class
25%    0.35    0.12           0.02
50%    0.10    0.09           0.02
75%    0.05    0.06           0.06
100%   0       0.03           0.14

Step 2: weight each entry by the loss λ(ŷ, y) and pick the class with the highest value:

X      Fail    Second class   First class   Highest value   Prediction
25%    0.35    0.24           0.08          0.35            Fail
50%    0.10    0.18           0.08          0.18            Second class
75%    0.05    0.12           0.24          0.24            First class
100%   0       0.06           0.56          0.56            First class
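For reference, a minimal Python sketch of this loss-weighted classification (the priors, likelihoods, and cost weights are copied from the question; the variable names are illustrative):

    # Loss-weighted Naive Bayes: pick argmax of lambda(y) * P(y) * P(x|y)
    priors = {"Fail": 0.5, "Second class": 0.3, "First class": 0.2}
    likelihood = {  # P(X | Y), indexed by effort level
        25:  {"Fail": 0.7, "Second class": 0.4, "First class": 0.1},
        50:  {"Fail": 0.2, "Second class": 0.3, "First class": 0.1},
        75:  {"Fail": 0.1, "Second class": 0.2, "First class": 0.3},
        100: {"Fail": 0.0, "Second class": 0.1, "First class": 0.7},
    }
    cost = {"Fail": 1, "Second class": 2, "First class": 4}  # lambda for a wrong prediction

    for x in (25, 50, 75, 100):
        scores = {y: cost[y] * priors[y] * likelihood[x][y] for y in priors}
        y_hat = max(scores, key=scores.get)
        print(x, scores, "->", y_hat)  # reproduces the table above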

Q.4 If we modify the loss function of the linear regression model as follows:

J(θ) = (1/(2n)) · Σ_{i=1}^{n} w^(i) · (h_θ(x^(i)) − y^(i))²

Where w^(i) is the weight assigned to each training example. Derive the equation to
find the value of θ with this modified loss function. Suppose we estimate the value
of w^(i) to be inversely proportional to the variance of the residuals; comment in no more
than 20 words on when you would prefer this kind of modified loss function.
[3+2=5 Marks]

Solution:
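(A sketch of the expected derivation, using the standard weighted least squares result.) Setting the gradient to zero:
∂J(θ)/∂θⱼ = (1/n) · Σᵢ w^(i) · (h_θ(x^(i)) − y^(i)) · xⱼ^(i) = 0 for all j
In matrix form, with X the design matrix and W = diag(w^(1), …, w^(n)):
Xᵀ·W·(X·θ − y) = 0  ⇒  θ = (Xᵀ·W·X)^(−1) · Xᵀ·W·y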

Comment: Robust against outliers. Outliers have a higher residual variance, resulting in a lower weight.
Q.5 Fit a logistic regression model. Find the updated weights after 3 iterations of a modified Gradient Descent algorithm, where a gradient update happens after every training example, using a learning rate of 0.5 and initial weights (W0, W1, W2) = (1, 1, 1), for the following data, with the logistic regression output given by

h(X) = 1 / (1 + e^(−(W0 + W1·X1 + W2·X2)))

Assume the weights obtained after 3 iterations are the final weights. Using these, construct the confusion matrix for the training data given below. [4+2=6 Marks]

Input X1   Input X2   Output Label
2          0          0
0          2          0
0          -2         0
-2         0          0
0          1          1
0          -1         1

Solution:
Per-example update rule: Wᵢ ← Wᵢ − LR · (ŷ − y) · Xᵢ, where ŷ = h(X) and X0 = 1.

X1   X2   y   h(X)   W0      W1     W2
2    0    0   0.05   0.975   0.95   1
0    2    0   0.95   0.5     0.95   0.05
0    -2   0   0.27   0.365   0.95   0.32
-2   0    0   0.05

Final weights after 3 iterations: w0 = 0.365, w1 = 0.95, w2 = 0.32

Predictions with the final weights (threshold 0.5):

X1   X2   y   h(X)   Ŷ
2    0    0   0.03   0
0    2    0   0.73   1
0    -2   0   0.43   0
-2   0    0   0.03   0
0    1    1   0.66   1
0    -1   1   0.51   1

Confusion matrix:
                     True class
                     Y=0    Y=1
Predicted   Y=0      3      0
class       Y=1      1      2
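For reference, a minimal Python sketch of the per-example gradient descent procedure, assuming the standard sigmoid form given in the question and the update rule above (a sketch of the method, not of the hand-worked arithmetic):

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    # Training data: (X1, X2, label)
    data = [(2, 0, 0), (0, 2, 0), (0, -2, 0), (-2, 0, 0), (0, 1, 1), (0, -1, 1)]
    w = [1.0, 1.0, 1.0]  # (W0, W1, W2); W0 is the bias
    lr = 0.5

    # Three iterations: one weight update per training example.
    for x1, x2, y in data[:3]:
        y_pred = sigmoid(w[0] + w[1] * x1 + w[2] * x2)
        for i, xi in enumerate((1, x1, x2)):  # X0 = 1 for the bias term
            w[i] -= lr * (y_pred - y) * xi

    # Confusion matrix at threshold 0.5 using the final weights.
    cm = [[0, 0], [0, 0]]  # cm[predicted][true]
    for x1, x2, y in data:
        y_hat = int(sigmoid(w[0] + w[1] * x1 + w[2] * x2) >= 0.5)
        cm[y_hat][y] += 1
    print("weights:", w)
    print("confusion matrix:", cm)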
Q.6 Consider the following set of training examples:

Instance Classification A1 A2
1 + T T
2 + T T
3 - T F
4 + F F
5 - F T
6 - F T

What is the information gain of A2 relative to these training examples? Provide the equation for calculating the information gain as well as intermediate results. [3 Marks]
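Solution (a worked sketch, using the standard definition IG(S, A) = Entropy(S) − Σ_v (|S_v|/|S|)·Entropy(S_v)):

Entropy(S) = −(3/6)·log2(3/6) − (3/6)·log2(3/6) = 1 (3 positive, 3 negative examples)
A2 = T: instances {1, 2, 5, 6} → 2+, 2− → Entropy = 1
A2 = F: instances {3, 4} → 1+, 1− → Entropy = 1
IG(S, A2) = 1 − (4/6)·1 − (2/6)·1 = 0

A2 therefore provides zero information gain on this sample.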
