Spline Regression
Karim Seghouane
School of Mathematics & Statistics
The University of Melbourne
Outline
§3.1 Introduction
§3.2 Motivation
§3.3 Spline
§3.4 Penalized Spline Regression
§3.5 Linear Smoothers
§3.6 Other basis
Introduction
Our interest is the discovery of the underlying trend in the observed data, which are treated as a collection of points on the plane. The trend is described by the regression function
f(x) = E(y | x)
▶ This can also be written as
y_i = f(x_i) + ε_i,   E(ε_i) = 0
▶ and the problem is referred to as nonparametric regression
Motivation
y_i = β_0 + β_1 x_i + ε_i
▶ The design matrix is
X = [ 1 x_1
      ⋮  ⋮
      1 x_n ]
▶ The vector of fitted values is
ŷ = X (X^T X)^{-1} X^T y = Hy
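This computation can be sketched in a few lines of numpy (the simulated data here are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 50))
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.2, 50)   # y_i = beta0 + beta1*x_i + eps_i

X = np.column_stack([np.ones_like(x), x])      # design matrix with rows [1, x_i]
H = X @ np.linalg.solve(X.T @ X, X.T)          # hat matrix H = X (X^T X)^{-1} X^T
y_hat = H @ y                                  # fitted values y_hat = H y
```

Note that H is a projection matrix: symmetric, idempotent, with trace equal to the number of regression coefficients.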
y_i = β_0 + β_1 x_i + β_2 x_i² + ε_i
X = [ 1 x_1 x_1²
      ⋮  ⋮   ⋮
      1 x_n x_n² ]
▶ The vector of fitted values is
ŷ = X (X^T X)^{-1} X^T y = Hy
For a linear spline with knots at 0.5, 0.55, …, 0.95 the design matrix becomes
X = [ 1 x_1 (x_1 − 0.5)_+ (x_1 − 0.55)_+ … (x_1 − 0.95)_+
      ⋮  ⋮       ⋮              ⋮                ⋮
      1 x_n (x_n − 0.5)_+ (x_n − 0.55)_+ … (x_n − 0.95)_+ ]
corresponding to the model
f(x) = β_0 + β_1 x + Σ_{i=1}^K β_{1i} (x − k_i)_+
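Such a design matrix is straightforward to build; a numpy sketch (the helper name and evaluation grid are illustrative):

```python
import numpy as np

def truncated_linear_basis(x, knots):
    """Columns 1, x, (x - k_1)_+, ..., (x - k_K)_+ of the design matrix."""
    return np.column_stack([np.ones_like(x), x] +
                           [np.maximum(x - k, 0.0) for k in knots])

x = np.linspace(0.0, 1.0, 101)
knots = [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]  # as on the slide
X = truncated_linear_basis(x, knots)
```

Each knot column is zero to the left of its knot and grows linearly to the right, which is what gives the fitted function a change of slope at each knot.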
Illustration
▶ The selection of a good basis is usually challenging
▶ Start by trying to choose knots by trial (at range 575 and 600)
▶ The fit is lacking in quality for low values of range
▶ An obvious remedy is to use more knots (at range 500, 550, 600 and 650)
▶ With a larger set of knots (at every 12.5), the fitting procedure has much more flexibility
▶ The resulting plot is heavily overfitted
▶ Pruning the knots (at 612.5, 650, 662.5 and 687.5) overcomes the overfitting issue
▶ This fits the data well without overfitting
▶ It was obtained, however, only after a lot of time-consuming trial and error
Knot selection
Constraints on the knot coefficients β_{1i} that might help avoid this situation lead to the penalized least squares criterion
‖y − Xβ‖² + λ β^T D β
▶ for λ ≥ 0, which has solution
β̂_λ = (X^T X + λD)^{-1} X^T y
▶ Linear penalized spline regression fits vary with the smoothing parameter λ
For a degree-p spline the model becomes
f(x) = β_0 + β_1 x + … + β_p x^p + Σ_{i=1}^K β_{pi} (x − k_i)_+^p
▶ The expression for the fitted values is given by
ŷ = X (X^T X + λD)^{-1} X^T y
where
D = diag(0_{p+1}, 1_K)
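A minimal numpy sketch of this penalized fit (the function name, data and knot locations are illustrative, not from the slides):

```python
import numpy as np

def penalized_spline_fit(x, y, knots, p=1, lam=1.0):
    """Degree-p penalized spline with truncated power basis.

    The penalty matrix D = diag(0_{p+1}, 1_K) penalizes only the
    knot coefficients, leaving the polynomial part unpenalized."""
    X = np.column_stack([x ** j for j in range(p + 1)] +
                        [np.maximum(x - k, 0.0) ** p for k in knots])
    D = np.diag([0.0] * (p + 1) + [1.0] * len(knots))
    beta_hat = np.linalg.solve(X.T @ X + lam * D, X.T @ y)
    return X @ beta_hat                       # fitted values y_hat

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, 40)
knots = np.linspace(0.1, 0.9, 8)
y_rough = penalized_spline_fit(x, y, knots, p=1, lam=1e-6)   # nearly unpenalized
y_smooth = penalized_spline_fit(x, y, knots, p=1, lam=1e6)   # heavily penalized
```

As λ grows, the knot coefficients shrink toward zero and the fit approaches an ordinary degree-p polynomial fit, so the residual sum of squares is nondecreasing in λ.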
B-Spline bases
[Figure: B-spline bases of degree 1, 2 and 3 for the case of seven irregularly spaced knots]
Mathematically, this equivalence is quantified as follows:
▶ Let X_T be the design matrix with columns obtained from a truncated power basis, and
▶ let X_B be the design matrix corresponding to the B-spline basis of the same degree and the same knot locations; then X_B = X_T M for some invertible square matrix M, so the two bases span the same column space.
Once these choices have been made, there follow two secondary choices:
▶ The basis functions used to represent the fit: truncated power functions or B-splines
▶ The basis functions used in the computations (the two need not coincide)
Linear smoothers
In general,
ŷ = Ly
where L is an n × n matrix that does not depend on y (although it typically depends on the x_i and on the smoothing parameter λ). An estimator of this form is called a linear smoother.
y_i = f(x_i) + ε_i
An important quantity of interest is the error incurred by an estimator with respect to a given target. The most common measure of error is the mean squared error (MSE)
MSE{f̂(x)} = E{f̂(x) − f(x)}²
and, summing over the design points, the mean summed squared error
MSSE{f̂(·)} = Σ_{i=1}^n E{f̂(x_i) − f(x_i)}²
which decomposes into squared bias and variance terms:
MSSE{f̂} = Σ_{i=1}^n [ {E f̂(x_i) − f(x_i)}² + var{f̂(x_i)} ]
MSSE{f̂} = ‖(L − I)f‖² + σ² tr(LL^T)
▶ At λ = 0,
tr(S_0) = p + 1 + K
▶ At the other extreme,
tr(S_λ) → p + 1 as λ → ∞
▶ So for λ > 0, the effective degrees of freedom df = tr(S_λ) satisfy
p + 1 < df < p + 1 + K
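The two extremes are easy to check numerically; a sketch assuming the truncated-power design matrix and penalty D from the previous slides (knot locations here are illustrative):

```python
import numpy as np

def effective_df(X, D, lam):
    """Effective degrees of freedom df = tr(S_lambda)."""
    S = X @ np.linalg.solve(X.T @ X + lam * D, X.T)   # smoother matrix S_lambda
    return np.trace(S)

x = np.linspace(0.0, 1.0, 60)
knots = np.linspace(0.2, 0.8, 5)                      # K = 5 knots
X = np.column_stack([np.ones_like(x), x] +            # linear spline, p = 1
                    [np.maximum(x - k, 0.0) for k in knots])
D = np.diag([0.0] * 2 + [1.0] * 5)                    # D = diag(0_{p+1}, 1_K)
```

Evaluating `effective_df` at λ = 0 gives p + 1 + K, and pushing λ toward infinity drives it down to p + 1, matching the bounds above.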
Cross validation
CV(λ) = (1/n) Σ_{i=1}^n { y_i − f̂_{−i}(x_i; λ) }²
      = (1/n) Σ_{i=1}^n [ (y_i − f̂_λ(x_i)) / (1 − S_λ(i, i)) ]²
where f̂_{−i} denotes the fit computed with the i-th observation left out.
GCV(λ) = (1/n) Σ_{i=1}^n [ ((I − S_λ) y)_i / (1 − n⁻¹ tr(S_λ)) ]²
       = RSS(λ) / ( n (1 − n⁻¹ tr(S_λ))² )
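Both criteria can be computed from a single fit via the smoother matrix; the leave-one-out shortcut in the CV formula is exact for penalized least squares. A numpy sketch (design matrix, knots and data below are illustrative):

```python
import numpy as np

def cv_gcv(X, D, y, lam):
    """Leave-one-out CV via the hat-diagonal shortcut, and GCV."""
    n = len(y)
    S = X @ np.linalg.solve(X.T @ X + lam * D, X.T)        # smoother matrix S_lambda
    resid = y - S @ y                                      # (I - S_lambda) y
    cv = np.mean((resid / (1.0 - np.diag(S))) ** 2)        # CV(lambda)
    gcv = np.mean((resid / (1.0 - np.trace(S) / n)) ** 2)  # GCV(lambda)
    return cv, gcv

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, 30)
X = np.column_stack([np.ones_like(x), x] +
                    [np.maximum(x - k, 0.0) for k in (0.25, 0.5, 0.75)])
D = np.diag([0.0, 0.0, 1.0, 1.0, 1.0])                     # penalize knot coefficients only
cv, gcv = cv_gcv(X, D, y, lam=0.1)
```

In practice both criteria are evaluated on a grid of λ values and the minimizer is selected.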
Our interest is in
MSSE{f̂} = ‖(L − I)f‖² + σ² tr(LL^T)
where L = S_λ; however,
E(RSS) = E‖f̂ − y‖² = MSSE{f̂} + σ² (n − 2 df_fit)
An estimate of the error variance is
σ̂² = RSS(λ) / df_res(λ)
The smoothing parameter can then be chosen as the minimizer of
GCV(λ) = RSS(λ) / ( n (1 − n⁻¹ tr(S_λ))² )
Other basis
Consider instead the basis
1, x, …, x^p, |x − k_1|^p, …, |x − k_K|^p
where each knot function |x − k_i|^p is of the form
r(‖x − k_i‖)
▶ where ‖v‖ = √(v^T v) is the vector length
▶ These functions are radially symmetric
▶ They are called radial basis functions
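A sketch of such a radial basis in numpy (degree and knot locations chosen for illustration):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 101)
knots = [0.25, 0.5, 0.75]
p = 3
# Columns 1, x, ..., x^p followed by |x - k_i|^p: each knot column is a
# function of the distance |x - k_i| only, hence "radial".
X = np.column_stack([x ** j for j in range(p + 1)] +
                    [np.abs(x - k) ** p for k in knots])
```

Unlike the truncated power functions (x − k)_+^p, each column |x − k_i|^p is symmetric about its knot.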
Cubic approximation
Minimize
‖y − X_0 β_0 − X_1 β_1‖² + λ β_1^T K β_1
subject to
X_0^T β_1 = 0
where β_0 = [β_0, β_1]^T, β_1 = [β_{11}, …, β_{1n}]^T, X_0 = [1, x_i]_{1≤i≤n} and X_1 = K = [ |x_i − x_j|³ ]_{1≤i,j≤n}