Shrinkage

December 15, 2020

Shrinkage is a useful method for feature selection in regression models. By shrinking the regression coefficients towards 0, it leaves the relatively unimportant features with little influence on the response variable. Thus, shrinkage may produce a model with lower variability in prediction error than direct filtering methods, such as screening features by their t-statistics (Hastie et al., 2009).
Two commonly used shrinkage methods are ridge regression (Hoerl and Kennard, 1970) and the lasso (Tibshirani, 1996). For ridge regression, the regression coefficients are estimated as

$$
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \right\},
$$

where p is the number of features, n is the sample size, and λ ≥ 0 is a complexity parameter controlling the amount of shrinkage (the larger λ, the greater the shrinkage). Thus, ridge regression estimates the regression coefficients by minimising the usual sum of squares together with an L2 penalty term, $\lambda \sum_{j=1}^{p} \beta_j^2$.
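Since the ridge criterion is differentiable, its minimiser has the well-known closed form $\hat{\beta}^{\text{ridge}} = (X^{\top}X + \lambda I)^{-1}X^{\top}y$ once the features are centred so that the intercept is left unpenalised (Hastie et al., 2009). The following is a minimal NumPy sketch of that closed form; the function name ridge_fit and all variable names are illustrative, not part of the original note.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solves (Xc'Xc + lam*I) beta = Xc'y,
    where Xc is the column-centred design matrix, so that the intercept
    is estimated separately and left unpenalised."""
    X_mean = X.mean(axis=0)
    Xc = X - X_mean                           # centre the features
    p = Xc.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ y)
    beta0 = y.mean() - X_mean @ beta          # intercept on the original scale
    return beta0, beta
```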
For the lasso (Tibshirani, 1996), an L1 penalty term $\lambda \sum_{j=1}^{p} |\beta_j|$ is used instead:

$$
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}.
$$

Equivalently, inclusion of the penalty term implies minimisation subject to the constraint $\sum_{j=1}^{p} \beta_j^2 \le t$ for ridge regression, and $\sum_{j=1}^{p} |\beta_j| \le t$ for the lasso, where t is some constant. For the lasso, making t small enough results in some of the regression coefficients becoming exactly zero. Thus, the lasso can also be considered as performing subset selection of the features.
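As a quick illustration of this zeroing behaviour, the sketch below fits both estimators to synthetic data using scikit-learn (an assumed dependency, not used in the original note). One caveat: scikit-learn's Lasso scales the sum of squares by 1/(2n), so its alpha corresponds to λ above only up to that factor.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Synthetic data: p = 10 features, but only the first 3 affect the response.
rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + rng.normal(scale=0.5, size=n)

# alpha plays the role of the complexity parameter lambda.
ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

print("ridge:", np.round(ridge.coef_, 3))      # shrunk towards 0, none exactly 0
print("lasso:", np.round(lasso.coef_, 3))      # several coefficients exactly 0
print("selected by lasso:", np.flatnonzero(lasso.coef_))
```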
References

Hastie, T., Tibshirani, R., and Friedman, J. (2009). The elements of statistical learning: data mining, inference, and prediction (2nd ed.). New York: Springer.

Hoerl, A. E., and Kennard, R. W. (1970). Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 12(1), 55–67.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1), 267–288.
