
CS 771A: Introduction to Machine Learning Endsem Exam (18 Nov 2019)

Name SAMPLE SOLUTIONS 80 marks


Roll No Dept. Page 1 of 4

Instructions:
1. This question paper contains 2 pages (4 sides of paper). Please verify.
2. Write your name, roll number, department in block letters neatly with ink on each page of this question paper.
3. If you don’t write your name and roll number on all pages, pages may get lost when we unstaple them for scanning.
4. Write your final answers neatly with a blue/black pen. Pencil marks may get smudged.
5. Don’t overwrite/scratch answers especially in MCQ and T/F. We will entertain no requests for leniency.

Q1. Write T or F for True/False (write only in the box on the right hand side) (16x2=32 marks)

1. If 𝑓, 𝑔: ℝ² → ℝ are two convex fns, then the fn ℎ defined as ℎ(𝐱) = 𝑓(𝐱) ⋅ 𝑔(𝐱) can never be a convex fn no matter which two convex functions 𝑓, 𝑔 we choose.  [F]
2. It is possible to derive a Lagrangian dual problem for the 𝐿₂-regularized logistic regression problem even though there are no constraints in the primal formulation.  [T]
3. If 𝑋, 𝑌 are two real-valued r.v.s (not necessarily independent) such that at least one of them has zero variance, i.e. 𝕍[𝑋] = 0 or 𝕍[𝑌] = 0, then Cov(𝑋, 𝑌) = 0.  [T]
4. The LwP algorithm, when used on a binary classification problem, results in a linear decision boundary no matter how many prototypes we use per class.  [F]
5. The time it takes to make a prediction for a test data point with a decision tree with 𝑛 leaf nodes is always 𝒪(log 𝑛) no matter what the structure of the tree.  [F]
6. If we have 10000 red and 20 green points, then the best option to deal with imbalance is to find the 20 red points closest to the green points and throw the rest 9980 away.  [F]
7. Reinf. learning is a good technique to build a RecSys if we suspect that the tastes of users are changing (possibly due to our own recommendations to them).  [T]
8. Bandit algorithms are named so since they operate in settings where a malicious adversary can sometimes corrupt the feedback/response given to the algorithm.  [F]
9. The binary relevance method in recommendation systems is best suited (in terms of prediction time/model size) when the number of items/labels is extremely large.  [F]
10. A NN with three hidden layers and a single output node, with all nodes except input layer nodes using sigmoid activation, will always learn a continuous function.  [T]
11. If our goal in RecSys is to quickly find out the most liked item(s) by a certain user, then we should adopt the UCB method rather than the pure exploration method.  [T]
12. The EM algorithm is a special case of Q-learning (recall Q-learning is used in reinf. learning) since the EM algorithm also optimizes a function known as the Q function.  [F]
13. If we are training an ensemble of 𝑘 classifiers, then it is very simple to train all of them in parallel when using bagging but not that simple when using boosting.  [T]
14. If we have 𝑛 data points with 𝑑-dimensional feature vectors, then kernel PCA with the Gaussian kernel can learn only at most 𝑑 components from this data if 𝑑 < 𝑛.  [F]
15. If 𝐴 ∈ ℝⁿˣⁿ is an orthonormal matrix, i.e. 𝐴⊤𝐴 = 𝐼ₙ = 𝐴𝐴⊤, then it can never be the case that 𝐴 is symmetric, i.e. we must have 𝐴⊤ ≠ 𝐴.  [F]
16. Let 𝑋 be a real-valued r.v. that always takes values in the interval [−1, 1]. Then we must have 𝕍[𝔼[𝑋]] = 0, i.e. if we define 𝑌 = 𝔼[𝑋] then we must have 𝕍[𝑌] = 0.  [T]
Q2 Consider the NN with 2 hidden layers – all nodes use the identity activation function. This NN
is clearly equivalent to a network with no hidden layers since all activation functions are linear.
Find the weights of this new network and write them down in the space provided. (4 marks)
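Although the weights of the Q2 network appear only in the figure (not reproduced here), the collapse itself is easy to verify: with identity activations every layer is a purely linear map, so the equivalent no-hidden-layer network simply has the product of the layer weight matrices as its weight. A minimal NumPy sketch with placeholder matrices:

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # input layer (3 units) -> hidden layer 1 (4 units), placeholder weights
W2 = rng.standard_normal((4, 4))   # hidden layer 1 -> hidden layer 2
W3 = rng.standard_normal((1, 4))   # hidden layer 2 -> single output node

def two_hidden_layer_net(x):
    # identity activation at every node, so each layer is just a matrix product
    return W3 @ (W2 @ (W1 @ x))

W_eq = W3 @ W2 @ W1                # weight matrix of the equivalent no-hidden-layer network

x = rng.standard_normal(3)
assert np.allclose(two_hidden_layer_net(x), W_eq @ x)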

Q3 Define 𝑓: ℝ² × ℝ²ˣ³ × ℝ³ → ℝ as 𝑓(𝐱, 𝑊, 𝐲) = 𝐱⊤𝑊𝐲 where 𝐱 ∈ ℝ², 𝐲 ∈ ℝ³, 𝑊 ∈ ℝ²ˣ³. Let
𝐱⁰ = [1, 2]⊤, 𝐲⁰ = [3, 4, 5]⊤, 𝑊⁰ = [[1, 2, 1], [2, 1, 2]]. Define 𝑝: ℝ² → ℝ as 𝑝(𝐱) = 𝐱⊤𝑊⁰𝐲⁰, 𝑞: ℝ²ˣ³ → ℝ as
𝑞(𝑊) = (𝐱⁰)⊤𝑊𝐲⁰ and 𝑟: ℝ³ → ℝ as 𝑟(𝐲) = (𝐱⁰)⊤𝑊⁰𝐲. Write the Jacobians of 𝑝, 𝑞, 𝑟 below.
Note that to avoid clutter, we are asking you to write 𝐽𝑞 as a 2 × 3 matrix. (2+3+3=8 marks)
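The boxed answers for Q3 are not reproduced in this extract; purely as an illustration, the standard computation gives 𝐽𝑝 = (𝑊⁰𝐲⁰)⊤, 𝐽𝑞 = 𝐱⁰(𝐲⁰)⊤ (written as a 2 × 3 matrix, as asked) and 𝐽𝑟 = (𝐱⁰)⊤𝑊⁰, which the NumPy sketch below evaluates and sanity-checks:

import numpy as np

x0 = np.array([1.0, 2.0])
y0 = np.array([3.0, 4.0, 5.0])
W0 = np.array([[1.0, 2.0, 1.0],
               [2.0, 1.0, 2.0]])

J_p = W0 @ y0           # gradient of p(x) = x^T W0 y0 w.r.t. x; here [16, 20]
J_q = np.outer(x0, y0)  # gradient of q(W) = x0^T W y0 w.r.t. W; here [[3, 4, 5], [6, 8, 10]]
J_r = x0 @ W0           # gradient of r(y) = x0^T W0 y w.r.t. y; here [5, 4, 5]

# finite-difference sanity check of J_p (p is linear, so the match is exact up to rounding)
eps = 1e-6
p = lambda x: x @ W0 @ y0
fd = np.array([(p(x0 + eps * e) - p(x0 - eps * e)) / (2 * eps) for e in np.eye(2)])
assert np.allclose(J_p, fd)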

Q4 We wish to use 𝐶-SVM to learn a binary classifier. We have 100000 train points half of which
are red and the other half green. Briefly outline a way to tune the 𝐶 parameter and justify your
reasons for the same. You may use the 100000 training points in any way you wish. (4 marks)
Since the dataset is balanced, we need not resort to class-weighted classification
tactics. We may set aside a fair number of randomly chosen points (say 30000) as
a held-out validation set, then perform a grid search over a reasonable range of
values of 𝐶, say 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, and choose the value for which the
SVM trained using that value of 𝐶 gives us maximum classification accuracy on the
validation dataset. Note that other methods like k-fold cross-validation are also
admissible. Also, we are able to use classification accuracy as a performance
measure on the validation dataset only because the dataset is balanced. Had the
dataset been unbalanced, we should have used the F-measure or a similar measure instead.
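A sketch of this tuning procedure in scikit-learn, under the assumption that LinearSVC is an acceptable stand-in for the C-SVM and with synthetic data standing in for the 100000 training points (which are not provided here):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

# synthetic balanced dataset standing in for the 100000 red/green points
X, y = make_classification(n_samples=100000, n_features=20, random_state=0)

# set aside 30000 randomly chosen points as a held-out validation set
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=30000, random_state=0)

best_C, best_acc = None, -1.0
for C in [0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50]:
    clf = LinearSVC(C=C, max_iter=10000).fit(X_tr, y_tr)
    acc = accuracy_score(y_val, clf.predict(X_val))   # accuracy is fine here: classes are balanced
    if acc > best_acc:
        best_C, best_acc = C, acc

final_model = LinearSVC(C=best_C, max_iter=10000).fit(X, y)   # retrain on all points with the chosen C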

Q5 Let 𝐾1 : 𝒳 × 𝒳 → ℝ be a Mercer kernel with feature map 𝜙1 : 𝒳 → ℝ𝐷 for some finite 𝐷 > 0.
Define a new kernel 𝐾2 = 𝐾12 i.e. 𝐾2 (𝐱, 𝐲) = 𝐾1 (𝐱, 𝐲)2 for all 𝐱, 𝐲 ∈ 𝒳. Design a feature map for
𝐾2 i.e. 𝜙2 : 𝒳 → ℝ𝐿 for some 𝐿 > 0 s.t. 𝐾2 (𝐱, 𝐲) = 〈𝜙2 (𝐱), 𝜙2 (𝐲)〉 for all 𝐱, 𝐲 ∈ 𝒳. (6 marks)

The properties of trace tell us that 𝜙₁(𝐱)⊤𝜙₁(𝐲) = trace(𝜙₁(𝐱)⊤𝜙₁(𝐲)) = trace(𝜙₁(𝐱)𝜙₁(𝐲)⊤). Also, 𝑐 ⋅ trace(𝑋) = trace(𝑐 ⋅ 𝑋) for all 𝑐 ∈ ℝ. Thus we write
𝐾₂(𝐱, 𝐲) = 𝐾₁(𝐱, 𝐲)² = (𝜙₁(𝐱)⊤𝜙₁(𝐲))² = trace(𝜙₁(𝐱)𝜙₁(𝐱)⊤𝜙₁(𝐲)𝜙₁(𝐲)⊤).
If we use 𝜙₂(𝐱) = 𝜙₁(𝐱)𝜙₁(𝐱)⊤ ∈ ℝᴰˣᴰ, then we have 𝐾₂(𝐱, 𝐲) = ⟨𝜙₂(𝐱), 𝜙₂(𝐲)⟩ for all 𝐱, 𝐲 ∈ 𝒳, where ⟨⋅, ⋅⟩ is the trace (Frobenius) inner product on matrices.
Instead of a matrix-valued feature map, we may use a vector feature map as well, 𝜙₂(𝐱) ∈ ℝᴰ², i.e. 𝐿 = 𝐷², by creating coordinates of the form 𝐯ᵢ𝐯ⱼ, 𝑖, 𝑗 ∈ [𝐷], where we denote 𝐯 = 𝜙₁(𝐱) (note that this essentially stretches out the 𝐷 × 𝐷 matrix we created earlier as a long vector).
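A quick numerical check of this construction, using an arbitrary placeholder feature map 𝜙₁ (any finite-dimensional map works): vectorising the outer product 𝜙₁(𝐱)𝜙₁(𝐱)⊤ gives a 𝐷²-dimensional 𝜙₂ whose inner product reproduces 𝐾₁(𝐱, 𝐲)²:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))        # fixes a placeholder phi_1 : R^3 -> R^5, so D = 5

def phi1(x):
    return np.tanh(A @ x)

def phi2(x):
    v = phi1(x)
    return np.outer(v, v).ravel()      # stretch the D x D matrix phi1(x) phi1(x)^T into a D^2 vector

x, y = rng.standard_normal(3), rng.standard_normal(3)
K1 = phi1(x) @ phi1(y)                           # K_1(x, y) = <phi_1(x), phi_1(y)>
assert np.isclose(K1 ** 2, phi2(x) @ phi2(y))    # K_2(x, y) = K_1(x, y)^2 = <phi_2(x), phi_2(y)>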

Q6 Derive the Lagrangian dual for the following weighted C-SVM problem (for use in AdaBoost):
min_{𝐰∈ℝᵈ} ½‖𝐰‖₂² + ∑ᵢ₌₁ⁿ 𝑐ᵢ ⋅ [1 − 𝑦ⁱ ⋅ 𝐰⊤𝐱ⁱ]₊

Write down the problem as a constrained opt. problem, write down the Lagrangian, and show
main steps in the derivation of the dual. Assume 𝑦ⁱ ∈ {−1, 1}, 𝐱ⁱ ∈ ℝᵈ, 𝑐ᵢ > 0. (3+1+2=6 marks)
Constrained prob: min_{𝐰,𝛏} ½‖𝐰‖₂² + ∑ᵢ₌₁ⁿ 𝑐ᵢ ⋅ 𝜉ᵢ  s.t.  𝑦ⁱ ⋅ 𝐰⊤𝐱ⁱ ≥ 1 − 𝜉ᵢ and 𝜉ᵢ ≥ 0 ∀𝑖
Lagrangian: ℒ(𝐰, 𝛏, 𝛂, 𝛃) = ½‖𝐰‖₂² + ∑ᵢ₌₁ⁿ (𝑐ᵢ ⋅ 𝜉ᵢ + 𝛼ᵢ(1 − 𝜉ᵢ − 𝑦ⁱ ⋅ 𝐰⊤𝐱ⁱ) − 𝛽ᵢ𝜉ᵢ)
Setting ∂ℒ/∂𝜉ᵢ = 0 gives us 𝛼ᵢ + 𝛽ᵢ = 𝑐ᵢ, whereas ∂ℒ/∂𝐰 = 𝟎 gives us 𝐰 = ∑ᵢ₌₁ⁿ 𝛼ᵢ 𝑦ⁱ 𝐱ⁱ.
Simplifying gives: max_𝛂 ∑ᵢ₌₁ⁿ 𝛼ᵢ − ½ ∑ᵢ,ⱼ₌₁ⁿ 𝛼ᵢ𝛼ⱼ 𝑦ⁱ𝑦ʲ ⟨𝐱ⁱ, 𝐱ʲ⟩  s.t.  𝛼ᵢ ∈ [0, 𝑐ᵢ] ∀𝑖 ∈ [𝑛]
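A minimal NumPy sketch, on toy placeholder data, of solving this dual by projected gradient ascent: since there is no bias term, the only constraint is the box 𝛼ᵢ ∈ [0, 𝑐ᵢ], and the primal solution is recovered as 𝐰 = ∑ᵢ 𝛼ᵢ𝑦ⁱ𝐱ⁱ:

import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 2
X = rng.standard_normal((n, d))                                         # toy feature vectors x^i
y = np.where(X[:, 0] + 0.3 * rng.standard_normal(n) > 0, 1.0, -1.0)     # toy labels in {-1, +1}
c = np.full(n, 1.0)                                                     # per-point weights c_i > 0 (placeholders)

Yx = y[:, None] * X
Q = Yx @ Yx.T                                     # Q_ij = y^i y^j <x^i, x^j>

alpha = np.zeros(n)
eta = 1.0 / np.linalg.norm(Q, 2)                  # step size from the spectral norm of Q
for _ in range(2000):
    grad = 1.0 - Q @ alpha                        # gradient of sum_i alpha_i - (1/2) alpha^T Q alpha
    alpha = np.clip(alpha + eta * grad, 0.0, c)   # ascent step, then project onto the box [0, c_i]

w = (alpha * y) @ X                               # primal solution w = sum_i alpha_i y^i x^i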
Q7 Consider the valley distribution with three parameters 𝒱(𝑎, 𝑏, 𝑐) where 𝑎 < 𝑏 and 𝑎 ≤ 𝑐 ≤ 𝑏
(no other restrictions on 𝑎, 𝑏, 𝑐). The PDF of this distribution, with ℎ = 4/(3(𝑏 − 𝑎)), is
ℙ[𝑥 | 𝑎, 𝑏, 𝑐] = 𝒱(𝑥; 𝑎, 𝑏, 𝑐) ≜
  0                                if 𝑥 < 𝑎
  ℎ − ℎ(𝑥 − 𝑎)/(2(𝑐 − 𝑎))          if 𝑎 ≤ 𝑥 < 𝑐
  ℎ − ℎ(𝑏 − 𝑥)/(2(𝑏 − 𝑐))          if 𝑐 ≤ 𝑥 ≤ 𝑏
  0                                if 𝑥 > 𝑏
Given 𝑛 indep. samples 𝑥¹, …, 𝑥ⁿ ∈ ℝ (not all samples are the same) we wish to learn a valley
distribution as a generative distribution using MLE, i.e. find arg max_{𝑎<𝑏, 𝑎≤𝑐≤𝑏} ℙ[𝑥¹, …, 𝑥ⁿ | 𝑎, 𝑏, 𝑐]. Give a
brief description + derivation of an algorithm to find 𝑎̂MLE, 𝑏̂MLE, 𝑐̂MLE. (5+5+10=20 marks)
Observation 1: let 𝑚 ≜ minᵢ 𝑥ⁱ and 𝑀 ≜ maxᵢ 𝑥ⁱ. Then if 𝑎 > 𝑚 or 𝑏 < 𝑀, the
likelihood would vanish and thus we must have 𝑎 ≤ 𝑚, 𝑏 ≥ 𝑀.
Observation 2: if 𝑐 ∈ [𝑚, 𝑀] and 𝑎 < 𝑚 or 𝑏 > 𝑀 (or both), then we can
increase the likelihood by keeping 𝑐 the same and setting 𝑎 = 𝑚, 𝑏 = 𝑀. This is
because doing so causes (𝑏 − 𝑎) ↓ so ℎ ↑, and the factors (1 − (𝑥 − 𝑎)/(2(𝑐 − 𝑎))) and
(1 − (𝑏 − 𝑥)/(2(𝑏 − 𝑐))) also do not decrease when the interval shrinks, so the PDF goes up
on the entire interval [𝑎, 𝑏] = [𝑚, 𝑀], i.e. the likelihood of every data point goes up.
Observation 3: if 𝑐 < 𝑚, then we can similarly see that setting 𝑎 = 𝑐 = 𝑚 will
strictly increase likelihood. Similarly if 𝑐 > 𝑀, we may set 𝑏 = 𝑐 = 𝑀.
The above observations tell us that 𝑎̂MLE = 𝑚, 𝑏̂MLE = 𝑀 and 𝑐̂MLE ∈ [𝑚, 𝑀]. In
general there need not be a closed form solution for 𝑐̂MLE . A sensible workaround
is to perform a search in the interval [𝑚, 𝑀]. W.l.o.g. assume that 𝑥¹ ≤ 𝑥² ≤ ⋯ ≤ 𝑥ⁿ.
Then for all values of 𝑐 ∈ [𝑥ⁱ, 𝑥ⁱ⁺¹], we have the NLL expression as
ℓⁱ(𝑐) = − ∑ⱼ₌₁ⁱ ln(1 − (𝑥ʲ − 𝑎)/(2(𝑐 − 𝑎))) − ∑ⱼ₌ᵢ₊₁ⁿ ln(1 − (𝑏 − 𝑥ʲ)/(2(𝑏 − 𝑐)))
Note that we removed terms involving ℎ above as they do not affect the optimum.
The above function may be (approximately) minimized in the range 𝑐 ∈ [𝑥 𝑖 , 𝑥 𝑖+1 ]
using GD. The same process needs to be repeated for all 𝑖 ∈ [𝑛 − 1] to obtain an
(approximation) of the globally optimal value of 𝑐.
Pseudo Algo for estimating 𝑐̂MLE:
For 𝑖 = 1, …, 𝑛 − 1, let 𝑐̂ⁱ = arg min_{𝑐∈[𝑥ⁱ, 𝑥ⁱ⁺¹]} ℓⁱ(𝑐), approximated using GD.
Output 𝑐̂ᵏ where 𝑘 = arg min_{𝑗∈[𝑛−1]} ℓʲ(𝑐̂ʲ).
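A minimal NumPy sketch of this estimator on toy data: 𝑎̂MLE and 𝑏̂MLE come directly from the sample minimum and maximum, while a dense grid search over (𝑚, 𝑀) stands in for the per-interval gradient descent of the pseudo-algorithm:

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)        # toy samples standing in for x^1, ..., x^n

a_hat, b_hat = x.min(), x.max()             # a_MLE = m, b_MLE = M (Observations 1-3)
h = 4.0 / (3.0 * (b_hat - a_hat))

def nll(c):
    # negative log-likelihood of V(a_hat, b_hat, c); for m < c < M the selected branch is positive at every sample
    pdf = np.where(x <= c,
                   h - h * (x - a_hat) / (2.0 * (c - a_hat)),
                   h - h * (b_hat - x) / (2.0 * (b_hat - c)))
    return -np.sum(np.log(pdf))

# search strictly inside (m, M) so that c - a and b - c never vanish
candidates = np.linspace(a_hat, b_hat, 2001)[1:-1]
c_hat = min(candidates, key=nll)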
