ML Midsem 2022
Total Points: 25
Instructions
• There is no partial marking for multiple choice questions, and there could be multiple correct
answers. Choose the right option(s) and justify your choice.
1. (1 point) Given training and test sets
X_{train} = (x^{(1)}, x^{(2)}, \ldots, x^{(m_{train})}), \quad Y_{train} = (y^{(1)}, y^{(2)}, \ldots, y^{(m_{train})})
X_{test} = (x^{(1)}, x^{(2)}, \ldots, x^{(m_{test})}), \quad Y_{test} = (y^{(1)}, y^{(2)}, \ldots, y^{(m_{test})})
you want to normalize your data before training your model. Which of the following propositions are true?
1. The normalizing mean and variance computed on the training set, and used to train the
model, should be used to normalize test data.
2. Test data should be normalized with its own mean and variance before being fed to
the network at test time because the test distribution might be different from the train
distribution.
3. Normalizing the input impacts the landscape of the loss function.
4. In imaging, just like for structured data, normalization consists in subtracting the mean
from the input and multiplying the result by the standard deviation.
Solution: Options (1) and (3). We should compute the normalizing mean and variance on the training
data, and then normalize the test instances using those same training statistics. This way we evaluate
whether the model generalizes to new, unseen data points under the same preprocessing. Normalizing the
input also changes the scaling of the features and therefore the landscape (conditioning) of the loss
function. 0.5 marks for each correct answer, 1 mark total.
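A minimal NumPy sketch of the intended procedure (the arrays and their shapes are illustrative, not taken from the question):

    import numpy as np

    # Illustrative stand-ins for X_train / X_test; rows are samples, columns are features.
    rng = np.random.default_rng(0)
    X_train = rng.normal(loc=2.0, scale=5.0, size=(100, 3))
    X_test = rng.normal(loc=2.0, scale=5.0, size=(20, 3))

    # Normalizing statistics are computed on the training set only ...
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0) + 1e-8      # small epsilon guards against zero variance

    X_train_norm = (X_train - mu) / sigma
    # ... and the SAME mu and sigma are reused for the test set at evaluation time.
    X_test_norm = (X_test - mu) / sigma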
2. (1 point) Having multiple perceptrons can actually solve the XOR problem satisfactorily: this
is because each perceptron can partition off a linear part of the space itself, and they can then
combine their results? State your reasoning clearly for choosing the correct option.
1. True – this works always, and these multiple perceptrons learn to classify even complex
problems.
2. False – perceptrons are mathematically incapable of solving linearly inseparable functions,
no matter what you do.
3. True – perceptrons can do this but are unable to learn to do it – they have to be explicitly
hand-coded.
4. False – just having a single perceptron is enough.
Solution: (3) True – perceptrons can do this but are unable to learn to do it; the multi-layer
combination has to be explicitly hand-coded, since the perceptron learning rule only trains a single
layer. 1 mark for correct answer.
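To illustrate option (3), here is a hand-wired (not learned) arrangement of threshold perceptrons that computes XOR; the OR/NAND/AND decomposition and all weights are chosen by hand for this sketch, not produced by the perceptron learning rule:

    import numpy as np

    def perceptron(x, w, b):
        # Step-activation perceptron: outputs 1 if w.x + b > 0, else 0.
        return int(np.dot(w, x) + b > 0)

    def xor_net(x1, x2):
        x = np.array([x1, x2])
        h_or = perceptron(x, np.array([1.0, 1.0]), -0.5)      # OR of the inputs
        h_nand = perceptron(x, np.array([-1.0, -1.0]), 1.5)   # NAND of the inputs
        return perceptron(np.array([h_or, h_nand]), np.array([1.0, 1.0]), -1.5)  # AND of the two

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_net(a, b))   # last column prints 0, 1, 1, 0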
3. (1 point) Suppose you are given an EM algorithm that finds maximum likelihood estimates
for a model with latent variables. You are asked to modify the algorithm so that it finds MAP
estimates instead. Which step or steps do you need to modify:
1. Expectation
2. Maximization
3. No modification necessary
4. Both
Solution: (2) Maximization. For MAP estimation the M-step maximizes the expected complete-data
log-likelihood plus the log-prior log p(θ); the E-step, which computes the posterior over the latent
variables given the current parameters, is unchanged. [For a detailed solution refer to
https://www.jmlr.org/papers/volume1/meila00a/html/node16.html]
1 mark for correct answer and justification.
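For reference, a sketch of the two steps in generic EM notation (θ^{(t)} is the current estimate; this is standard material, not taken from the linked reference):

    \text{E-step:}\quad Q(\theta \mid \theta^{(t)}) = \mathbb{E}_{Z \sim p(Z \mid X, \theta^{(t)})}\big[\log p(X, Z \mid \theta)\big]
    \text{M-step (ML):}\quad \theta^{(t+1)} = \arg\max_{\theta}\; Q(\theta \mid \theta^{(t)})
    \text{M-step (MAP):}\quad \theta^{(t+1)} = \arg\max_{\theta}\; \big\{ Q(\theta \mid \theta^{(t)}) + \log p(\theta) \big\}

Only the maximization objective changes; the posterior over the latent variables computed in the E-step does not involve the prior on θ.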
4. (1 point) Given
n_{\max} = \frac{\beta\,\|w_0\|^2}{\alpha^2}
where nmax is the maximum number of iterations for a perceptron to converge and w0 is the
optimal weights at convergence, which of the following is true:
1. There is no unique solution for w0 but unique solution for nmax
2. There is no unique solution for nmax but unique solution for w0 exists
3. There is no unique solution for nmax and w0
4. Unique solution for nmax and w0 exists
Solution: (3) There is no unique solution for nmax and w0 . The data may be separated by many different
hyperplanes, so w0 is not unique, and since the bound nmax is computed from w0 (together with α and β),
it is not unique either.
1 mark for correct answer.
5. (1 point) Which of the following is/are true:
1. Logistic loss is better than L2 loss in classification tasks.
2. In terms of feature selection, L2 regularization is preferred since it comes up with sparse
solutions
3. A classifier that attains 100% accuracy on the training set and 70% accuracy on test set
is better than a classifier that attains 70% accuracy on the training set and 75% accuracy
on test set.
4. MSE is the preferred loss function for logistic regression
Solution: (1) Logistic loss is better than L2 loss in classification tasks. Squared-error loss applied to
a sigmoid output yields a non-convex objective and vanishing gradients for confidently wrong predictions,
whereas the logistic (cross-entropy) loss is convex in the parameters and penalizes confident
misclassifications strongly; this also rules out option (4). Option (2) is false because it is L1, not
L2, regularization that produces sparse solutions. 1 mark for correct answer and justification.
Section B: Short Answers
6. (2 points) Given below are two versions of the perceptron learning algorithm. Identify the correct
implementation and justify.
Solution: There is no difference in the perceptron update rule itself; the difference is in the test for
convergence. In Option (A) the test for convergence is done after all training samples have been seen (a
full pass), while in Option (B) it is done after each sample. Option (A) is the right implementation:
convergence should be tested only once the entire training set has been seen. Otherwise, if the very
first sample happens to be classified correctly, the per-sample convergence criterion is immediately
satisfied and the PLA stops, resulting in improper training.
1 mark for identifying the right implementation. 1 mark for justifying how it helps in the convergence
of the PLA.
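A minimal sketch of the correct variant, with the convergence test placed as in Option (A); the data format and the max_epochs cap are illustrative assumptions:

    import numpy as np

    def pla_train(X, y, max_epochs=1000):
        """Perceptron learning. X has a leading bias column of 1s; y takes values in {-1, +1}."""
        w = np.zeros(X.shape[1])
        for _ in range(max_epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                if yi * np.dot(w, xi) <= 0:   # misclassified (or on the boundary)
                    w += yi * xi              # perceptron update
                    mistakes += 1
            # Option (A): convergence is tested only after the full pass over the data,
            # not after each individual sample as in Option (B).
            if mistakes == 0:
                break
        return w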
7. (2 points) The given plot shows the loss functions of logistic regression and the perceptron. Compare
and contrast the nature of the loss functions given in the figure (any 2 observations).
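The figure itself is not reproduced here. Assuming the standard forms, logistic loss log(1 + exp(−z)) and perceptron loss max(0, −z), plotted against the margin z = y(w·x), a short sketch to regenerate such a plot:

    import numpy as np
    import matplotlib.pyplot as plt

    z = np.linspace(-4, 4, 400)                  # margin z = y * (w . x)
    logistic_loss = np.log(1.0 + np.exp(-z))     # smooth, strictly positive for every z
    perceptron_loss = np.maximum(0.0, -z)        # piecewise linear, exactly 0 for z >= 0

    plt.plot(z, logistic_loss, label="logistic loss")
    plt.plot(z, perceptron_loss, label="perceptron loss")
    plt.xlabel("margin z = y(w.x)")
    plt.ylabel("loss")
    plt.legend()
    plt.show()

Under these assumed forms, typical observations are that the logistic loss is smooth (differentiable everywhere) and remains positive even for correctly classified points, continuing to push the margin, while the perceptron loss is piecewise linear, non-differentiable at z = 0, and exactly zero for every correctly classified point.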
8. (2 points) Calculate the bias and variance. Evaluate the performance of the model in terms of bias
and variance and comment.
Solution:
0.5 mark each for the bias and variance calculation and for the right observations.
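The data table for this question is not reproduced here. For reference, a sketch of the standard quantities to compute, with f(x) the true value and \hat{f}(x) the model's predictions over the given samples:

    \mathrm{Bias} = \mathbb{E}\big[\hat{f}(x)\big] - f(x), \qquad
    \mathrm{Variance} = \mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^{2}\Big]

High bias indicates underfitting; high variance indicates sensitivity to the particular training sample, i.e. overfitting.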
9. (2 points) You would like to train a dog/cat image classifier using mini-batch gradient descent. You
have already split your dataset into train, validation and test sets. The classes are balanced. You
realize that within the training set, the images are ordered in such a way that all the dog images come
first and all the cat images come after. Your test set (Xtest, Ytest) is such that the first m1 images
are of dogs and the remaining images are of cats. After shuffling Xtest and Ytest, you evaluate your
model on it to obtain a classification accuracy a1%. You also evaluate your model on Xtest and Ytest
without shuffling to obtain accuracy a2%. What is the relationship between a1 and a2 (>, <, =, ≤, ≥)?
Explain.
Solution: a1 = a2. When evaluating on the test set, the only computation performed is a single metric
(e.g. accuracy) over the entire test set; no parameters are updated, so the value of the metric does not
depend on the ordering of the examples.
0.5 mark for the right observation. 1.5 marks for justification.
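A quick sanity check of this invariance (the labels and predictions below are synthetic, not from the question):

    import numpy as np

    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 2, size=1000)                          # synthetic test labels
    y_pred = np.where(rng.random(1000) < 0.8, y_true, 1 - y_true)   # roughly 80%-accurate predictions

    perm = rng.permutation(1000)                                    # shuffle the test set
    acc_unshuffled = np.mean(y_pred == y_true)
    acc_shuffled = np.mean(y_pred[perm] == y_true[perm])
    assert acc_unshuffled == acc_shuffled                           # accuracy is order-independent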
Section C: Descriptive
10. (4 points) To exploit the desirable properties of decision tree classifiers and perceptrons,
Adam came up with a new algorithm called the “perceptron tree” that combines features from
both. Perceptron trees are similar to decision trees, but each leaf node contains a perceptron
rather than a majority vote. To create a perceptron tree, the first step is to follow a regular
decision tree learning algorithm (such as ID3) and perform splitting on attributes until the
specified maximum depth is reached. Once maximum depth has been reached, at each leaf
node, a perceptron is trained on the remaining attributes which have not yet been used in that
branch. Classification of a new example is done via a similar procedure. The example is first
passed through the decision tree based on its attribute values. When it reaches a leaf node, the
final prediction is made by running the corresponding perceptron at that node. Assume that
you have a dataset with 6 binary attributes {A, B, C, D, E, F } and two output labels {−1, 1}.
A perceptron tree of depth 2 on this dataset is given below. Weights of the perceptron are
given in the leaf nodes. Assume bias b = 1 for each perceptron.
1. What would the given perceptron tree predict as the output label for the sample x =
[1, 1, 0, 1, 0, 1]? (2 marks)
2. True or False? “The decision boundary of a perceptron tree will always be linear.”
Justify. (1 mark)
3. “For small values of max depth, decision trees are more likely to underfit the data than
perceptron trees.” True or False? Justify. (1 mark)
Solution:
(a) A=1 and D=1 so the point is sent to the right-most leaf node, where the perceptron output
is (1*1)+(0*0)+((-1)*0)+(1*1)+1 = 1 + 0 + 0 +1 +1 = 3. Prediction = sign(3) = 1.
(b) False, since decision tree boundaries need not be linear.
(c) True. For small values of max depth, decision trees essentially degenerate into majority-vote
classifiers at the leaves. Perceptron trees, on the other hand, can still make use of the “unused”
attributes at the leaves to predict the correct class. (Desirable properties combined: decision trees
contribute non-linear decision boundaries; the leaf perceptrons contribute the ability to gracefully
handle attribute values unseen in training and better generalization at the leaf nodes.)
2 marks for part (a). 1 mark for part (b), to be awarded only for a correct answer with correct
justification (binary marking). 1 mark for part (c), to be awarded only for a correct answer with
correct justification (binary marking).
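A sketch of the leaf computation used in part (a); the routing on A and D, the leaf weights [1, 0, −1, 1] and the remaining attributes (B, C, E, F) are read off from the figure as described in the solution above:

    import numpy as np

    def leaf_perceptron(features, w, b=1.0):
        # Leaf node of the perceptron tree: sign(w . x + b) over the attributes
        # not used for splitting on this branch.
        return 1 if np.dot(w, features) + b > 0 else -1

    # Sample x = [A, B, C, D, E, F] = [1, 1, 0, 1, 0, 1].
    # A = 1 and D = 1 route it to the right-most leaf; its perceptron acts on [B, C, E, F].
    remaining = np.array([1, 0, 0, 1])     # values of B, C, E, F
    w_leaf = np.array([1, 0, -1, 1])       # leaf weights from the figure
    print(leaf_perceptron(remaining, w_leaf))   # 1 + 0 + 0 + 1 + 1 = 3 -> sign = +1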
11. (4 points) We are building a random forest for a 2-class classification problem with n decision
trees RF = {T1, T2, ..., Tn} and bagging. Each tree generated in bagging is identically distributed
(i.d.) but not necessarily independent, and the expectation of an average of n such trees is the same as
the expectation of any one of them. Find the variance of the average of the n trees, given that the
trees with odd indices 2i + 1, where i ∈ {0, 1, 2, ..., (n−1)/2}, are independent of each other and the
positive pairwise correlation between the rest of them is ρ.
Solution:
\mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \rho\,\sigma^2 + \frac{1-\rho}{n}\,\sigma^2
Given that the trees with odd indices are independent of each other, their correlation, and hence their
covariance, is 0. Considering the rest of the trees, each correlated pair contributes a covariance of ρσ².
Equation for the variance of the average of trees with variance σ² (1 mark). Deriving the final equation
for the variance of the average of trees (1 mark). Identifying the constraints and finding that the
covariance is 0 due to the independence constraint (1 mark). Applying it in the variance formula and the
final formula (1 mark).
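A sketch of the general decomposition behind the marking scheme (standard variance algebra, with σ² the common variance of each tree):

    \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
    = \frac{1}{n^2}\left[\sum_{i=1}^{n}\mathrm{Var}(X_i) + \sum_{i \neq j}\mathrm{Cov}(X_i, X_j)\right]
    = \frac{\sigma^2}{n} + \frac{1}{n^2}\sum_{i \neq j}\mathrm{Cov}(X_i, X_j)

The covariance terms for the independent (odd-indexed) pairs are 0, while each remaining correlated pair contributes ρσ²; if every pair were correlated, the sum would collapse to the expression ρσ² + (1 − ρ)σ²/n quoted above.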