HW 05
INT3405
October 2, 2024
Collaboration: List the names of all people you have collaborated with and for which
question(s). You may discuss problems with other students, but you must not view or copy
another student’s written answers. Each student is responsible for independently understanding, writing, and submitting their own work.
Acknowledgments: If you reference solutions from published sources, the web, or other
textbooks, you must acknowledge these sources. It’s acceptable to use existing proofs for
guidance, but this must be noted, and you should first attempt to solve the problem on your
own. All students are expected to fully comprehend the steps they include in their written
work.
Exercise 1. [Reject option in classifiers] In many classification problems one has the
option either of assigning x to class j or, if one is too uncertain, of choosing the reject
option. If the cost of rejecting is less than the cost of falsely classifying the object,
rejecting may be the optimal action. Let αi mean you choose action i, for i = 1 : C + 1,
where C is the number of classes and C + 1 is the reject action. Let Y = j be the true
(but unknown) state of nature. Define the loss function as follows:
\[
\lambda(\alpha_i \mid Y = j) =
\begin{cases}
0 & \text{if } i = j \text{ and } i, j \in \{1, \dots, C\} \\
\lambda_r & \text{if } i = C + 1 \\
\lambda_s & \text{otherwise}
\end{cases}
\tag{1}
\]
In other words, you incur 0 loss if you correctly classify, you incur λr loss (cost) if you choose the reject option, and you incur λs loss (cost) if you make a substitution error (misclassification).
1. Show that the minimum risk is obtained if we decide Y = j if p(Y = j|x) ≥ p(Y = k|x)
for all k (i.e., j is the most probable class) and if p(Y = j|x) ≥ 1 − λr/λs; otherwise we
decide to reject (see the sketch after part 2).
2. Describe qualitatively what happens as λr /λs is increased from 0 to 1 (i.e., the relative
cost of rejection increases).
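To make the decision rule in part 1 concrete, here is a minimal Python sketch of a reject-option classifier. It is only an illustration: the names decide, lam_r, and lam_s are made up for this sketch, classes are 0-indexed, and index C stands in for the reject action αC+1.
\begin{verbatim}
import numpy as np

# Minimal sketch of the reject-option rule (illustrative names; classes are
# 0-indexed here, whereas the problem's action indices run over 1 : C + 1).
def decide(posterior, lam_r, lam_s):
    C = len(posterior)
    j = int(np.argmax(posterior))            # most probable class
    if posterior[j] >= 1.0 - lam_r / lam_s:
        return j                             # classify as class j
    return C                                 # index C encodes the reject action

# With lam_r / lam_s = 0.3, we reject whenever the max posterior is < 0.7.
print(decide(np.array([0.5, 0.3, 0.2]), lam_r=0.3, lam_s=1.0))  # reject (3)
print(decide(np.array([0.8, 0.1, 0.1]), lam_r=0.3, lam_s=1.0))  # class 0
\end{verbatim}
Note that the threshold 1 − λr/λs appearing in the sketch is exactly the quantity whose behavior part 2 asks you to describe as λr/λs varies.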
Exercise 2. Suppose we have a sample of N pairs (xi, yi) drawn i.i.d. from the distribution
characterized as follows:
\[
\begin{aligned}
x_i &\sim h(x), && \text{the design density} \\
y_i &= f(x_i) + \epsilon_i, && f \text{ is the regression function} \\
\epsilon_i &\sim (0, \sigma^2) && \text{(mean zero, variance } \sigma^2\text{)}
\end{aligned}
\tag{2}
\]
We construct an estimator for f linear in the yi ,
\[
\hat{f}(x_0) = \sum_{i=1}^{N} \ell_i(x_0; \mathcal{X})\, y_i, \tag{3}
\]
where the weights ℓi(x0; X) do not depend on the yi, but do depend on the entire training
sequence of xi, denoted here by X.
1. Show that linear regression and k-nearest-neighbor regression are members of this class
of estimators. Describe explicitly the weights ℓi(x0; X) in each of these cases (see the sketch below).
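As a point of reference for part 1, the following minimal Python sketch (with illustrative function names, assuming a one-dimensional design and a fit with an intercept) computes candidate weights ℓi(x0; X) for both estimators: the least-squares hat vector for linear regression, and uniform weights 1/k on the k nearest neighbors for k-NN regression. In both cases the prediction is a fixed linear combination of the yi, as equation (3) requires.
\begin{verbatim}
import numpy as np

# Sketch of the weights l_i(x0; X) for the two estimators in part 1.
# Function names are illustrative; the design is assumed one-dimensional.

def linear_regression_weights(X, x0):
    # With an intercept, the design matrix B has rows (1, x_i), and
    # f_hat(x0) = (1, x0) (B^T B)^{-1} B^T y, so the weight vector is:
    B = np.column_stack([np.ones(len(X)), X])
    return np.array([1.0, x0]) @ np.linalg.solve(B.T @ B, B.T)

def knn_weights(X, x0, k):
    # Weight 1/k on the k nearest x_i to x0, and 0 elsewhere.
    w = np.zeros(len(X))
    w[np.argsort(np.abs(X - x0))[:k]] = 1.0 / k
    return w

# In both cases f_hat(x0) = weights @ y: the estimator is linear in y.
X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 1.2, 1.9, 3.2])
print(linear_regression_weights(X, 2.0) @ y)   # least-squares fit at x0 = 2
print(knn_weights(X, 2.0, k=2) @ y)            # average of the 2 nearest y_i
\end{verbatim}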