Chapter 3 - Curve Fitting
SoICT
Hanoi University of Science and Technology
Introduction
1 Problem
2 Interpolation
Lagrange interpolation
Spline interpolation
3 Regression
Linear regression
High-order curve fitting
General curve fitting
Problem
Suppose the values of a function f are known at N + 1 points:
$$f_i = f(x_i), \quad i = 0, 1, \ldots, N$$
Given the dataset $\{(x_i, f_i),\ i = 0, 1, \ldots, N\}$, we need to approximate the function f(x) by a function p(x) that can be evaluated with an analytic formula.
To solve the problem
We will study two approaches to curve fitting:
Interpolation: the function p(x) must pass through every point in the dataset.
Regression: given a form of p(x) with unknown parameters, we determine the parameters that minimize an error criterion (usually the least squares criterion).
Interpolation
Introduction
Common choices for the interpolating function p(x) are:
Polynomial function
Rational function
Fourier series function
Definition: We say that the function p(x) is an interpolating function for the dataset $\{(x_i, f_i),\ i = 0, 1, \ldots, N\}$ if the following conditions are satisfied:
$$p(x_i) = f_i, \quad i = 0, 1, \ldots, N$$
This system of N + 1 equations is called the interpolation condition. Polynomials are the usual choice because their values, derivatives, and differentials are easy to compute (the analytic properties of the function are obvious).
A polynomial that interpolates the dataset is called the interpolating polynomial.
Definition: A polynomial of degree at most K is a function of the form
$$p_K(x) = a_0 + a_1 x + \cdots + a_K x^K$$
Linear interpolation
We have two data points $(x_0, f_0)$ and $(x_1, f_1)$, which are to be interpolated by the polynomial
$$p_1(x) \equiv a_0 + a_1 x$$
$p_1(x)$ satisfies the system of equations
$$p_1(x_0) \equiv a_0 + a_1 x_0 = f_0$$
$$p_1(x_1) \equiv a_0 + a_1 x_1 = f_1$$
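The two interpolation conditions form a 2-by-2 linear system in a0 and a1. The sketch below solves it for the data points (4, 3) and (9, 6); the variable names (A, a, p1) are chosen here for illustration only.

```matlab
% Two data points (x0, f0) = (4, 3) and (x1, f1) = (9, 6)
x = [4; 9];
f = [3; 6];

% Interpolation conditions written as A * [a0; a1] = f
A = [ones(2,1) x];
a = A \ f;                 % a(1) = a0, a(2) = a1

% Resulting interpolating line p1(x) = a0 + a1*x
p1 = @(t) a(1) + a(2)*t;
disp(a.')                  % a0 = 0.6, a1 = 0.6
```

Evaluating p1(4) and p1(9) returns the original values 3 and 6, which is exactly the interpolation condition.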
[Figure: the data points (4, 3) and (9, 6) and the interpolating line p1(x) = 0.6 + 0.6x.]
Lagrange Interpolation Formula
We often need to construct a curve that passes through three, four, or more points.
[Figure: a curve through the data points (4, 3), (7, 4), and (9, 6).]
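Suppose we look for an interpolating polynomial of degree M, so that the interpolation condition is a system of N + 1 linear equations in the M + 1 unknown coefficients $a_0, \ldots, a_M$. Assuming that $x_0, x_1, \ldots, x_N$ are distinct, the theory of linear systems from the previous chapter gives:
M = N: the system has a unique solution
M < N: the dataset can be chosen so that the system has no solution
M > N: if the system has a solution, it has infinitely many solutions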
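The coefficient matrix of this system is the Vandermonde matrix, whose row i is
$$\begin{pmatrix} 1 & x_i & \cdots & x_i^M \end{pmatrix}, \quad i = 0, 1, \ldots, N$$
When M = N, the determinant of $V_N$ is:
$$\det V_N = \prod_{i > j} (x_i - x_j)$$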
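The determinant formula can be checked numerically on a small case. The sketch below uses three illustrative nodes (an assumption, not data from the slides) and builds the matrix with the built-in vander, flipped so that the columns run from power 0 upward.

```matlab
x = [-1 0 1];                 % distinct nodes x0, x1, x2 (illustrative)

% vander(x) orders columns from highest to lowest power;
% fliplr gives rows of the form [1, x_i, x_i^2]
V = fliplr(vander(x));
d_direct = det(V);

% Product of (x_i - x_j) over all pairs with i > j
d_formula = 1;
for i = 2:numel(x)
    for j = 1:i-1
        d_formula = d_formula * (x(i) - x(j));
    end
end

fprintf('det(V) = %g, product formula = %g\n', d_direct, d_formula);
```

Because the nodes are distinct, every factor is nonzero, so the determinant is nonzero and the system with M = N has a unique solution.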
Example 1:
Determine the coefficients of the interpolating polynomial $p_2(x) = a_0 + a_1 x + a_2 x^2$ for the dataset {(-1, 0), (0, 1), (1, 3)}. The interpolation condition is
$$a_0 + (-1)a_1 + 1a_2 = 0$$
$$a_0 + 0a_1 + 0a_2 = 1$$
$$a_0 + 1a_1 + 1a_2 = 3$$
Solving the system gives $a_0 = 1$, $a_1 = 1.5$, and $a_2 = 0.5$, so the interpolating polynomial is
$$p_2(x) = 1 + 1.5x + 0.5x^2$$
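The system in Example 1 can also be solved numerically. This is a minimal sketch; the names V and a are illustrative and not part of the course code.

```matlab
% Dataset of Example 1
x = [-1; 0; 1];
f = [ 0; 1; 3];

% Each row of V is [1, x_i, x_i^2], so V * [a0; a1; a2] = f
V = [ones(3,1) x x.^2];
a = V \ f;

disp(a.')                  % a0 = 1, a1 = 1.5, a2 = 0.5
```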
The functions $V_i(x)$ clearly satisfy
$$V_i(x_j) = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}$$
So we can summarize the above formula as follows:
$$V_i(x) = \frac{\prod_{j \neq i} (x - x_j)}{\prod_{j \neq i} (x_i - x_j)}$$
$$
\begin{aligned}
p_N(x) ={}& f_0 \frac{(x - x_1)(x - x_2) \cdots (x - x_N)}{(x_0 - x_1) \cdots (x_0 - x_N)} \\
& + \cdots \\
& + f_i \frac{(x - x_0)(x - x_1) \cdots (x - x_{i-1})(x - x_{i+1}) \cdots (x - x_N)}{(x_i - x_0)(x_i - x_1) \cdots (x_i - x_{i-1})(x_i - x_{i+1}) \cdots (x_i - x_N)} \\
& + \cdots \\
& + f_N \frac{(x - x_0)(x - x_1) \cdots (x - x_{N-1})}{(x_N - x_0) \cdots (x_N - x_{N-1})}
\end{aligned}
\tag{1}
$$
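As a check, the Lagrange formula can be applied to the dataset {(-1, 0), (0, 1), (1, 3)} of Example 1. The worked expansion below should reproduce the polynomial found there by solving the linear system.

```latex
\begin{aligned}
p_2(x) &= 0 \cdot \frac{(x - 0)(x - 1)}{(-1 - 0)(-1 - 1)}
        + 1 \cdot \frac{(x + 1)(x - 1)}{(0 + 1)(0 - 1)}
        + 3 \cdot \frac{(x + 1)(x - 0)}{(1 + 1)(1 - 0)} \\
       &= -(x^2 - 1) + \tfrac{3}{2}(x^2 + x) \\
       &= 1 + 1.5x + 0.5x^2
\end{aligned}
```

Both approaches give the same polynomial, as expected, since the interpolating polynomial of degree at most N through N + 1 distinct points is unique.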
Implementing Vi (x) in MATLAB
v = polyinterp(x, y, u) computes v(j) = P(u(j)), where P is the polynomial of degree d = length(x) − 1 such that P(x(i)) = y(i), using the Lagrange interpolation formula.
The two vectors x, y hold the coordinates of the data points.
u is a vector containing the points at which the interpolating polynomial is evaluated.
function v = polyinterp(x,y,u)
% Evaluate the Lagrange interpolating polynomial through (x,y) at the points u
n = length(x);
v = zeros(size(u));
for k = 1:n
    % w accumulates the k-th Lagrange basis polynomial evaluated at u
    w = ones(size(u));
    for j = [1:k-1 k+1:n]
        w = (u-x(j))./(x(k)-x(j)).*w;
    end
    % add the contribution of the k-th data point
    v = v + w*y(k);
end
Example 2:
x = [0 1 2 3];
y = [-5 -6 -1 16];
u = 0.25:0.01:3.25;
v = polyinterp(x,y,u);
plot(x,y,'o',u,v,'-')
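Since four points determine a unique polynomial of degree at most 3, the result of Example 2 can be cross-checked with the built-in polyfit. This sketch uses polyfit as an independent check; it is not the Lagrange routine itself.

```matlab
x = [0 1 2 3];
y = [-5 -6 -1 16];

% With 4 points and degree 3 the least squares fit interpolates the data,
% so polyfit returns the coefficients of the interpolating polynomial
c = polyfit(x, y, 3);      % highest power first

disp(c)                    % approximately [1 0 -2 -5], i.e. P(x) = x^3 - 2x - 5
```

The values polyval(c, u) should agree with polyinterp(x, y, u) up to rounding error.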
Interpolation error
Definition: The error of the interpolating polynomial g(x) is given by the formula:
Spline interpolation
Problem
In this section we interpolate with a set of low-degree polynomials instead of a single higher-degree polynomial:
Interpolation by linear spline
Interpolation by cubic spline
Recall the error estimate for Lagrange interpolation
Assuming every data point xi ∈ [a, b] then we have
This estimate shows that, to reduce the error while keeping the number of data points N fixed, we need to reduce the size of |b − a|.
Our approach is piecewise polynomial approximation: the interval [a, b] is divided into many small non-overlapping subintervals, and a different polynomial is used as the approximation on each subinterval.
A linear spline $S_{1,N}(x)$ is a continuous function that interpolates the given data and is built from linear functions, each of which is the interpolating polynomial through two adjacent data points:
It is easy to see that
$$L_i(x) = \frac{f_i - f_{i-1}}{x_i - x_{i-1}} (x - x_i) + f_i$$
is the equation of the line through the two points $(x_i, f_i)$ and $(x_{i-1}, f_{i-1})$.
$$
\begin{aligned}
\max_{z \in [x_{i-1}, x_i]} |f(z) - S_{1,N}(z)|
  &\le \frac{|x_i - x_{i-1}|^2}{8} \times \max_{x \in [x_{i-1}, x_i]} |f^{(2)}(x)| \\
  &\le \frac{h^2}{8} \times \max_{x \in [x_{i-1}, x_i]} |f^{(2)}(x)|
\end{aligned}
$$
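The bound can be illustrated numerically. The sketch below assumes the test function f(x) = sin(x) on [0, pi] with 11 equally spaced nodes (an illustrative choice, not from the slides) and uses the built-in interp1, whose default method is piecewise linear interpolation.

```matlab
f  = @(x) sin(x);                  % test function, |f''(x)| <= 1 on [0, pi]
xi = linspace(0, pi, 11);          % interpolation nodes
h  = max(diff(xi));                % largest subinterval length

z = linspace(0, pi, 2001);         % fine grid for measuring the error
S = interp1(xi, f(xi), z);         % linear spline evaluated on the grid

err   = max(abs(f(z) - S));        % observed maximum error
bound = h^2 / 8 * 1;               % h^2/8 * max|f''|, with max|f''| = 1

fprintf('max error = %.4e, bound = %.4e\n', err, bound);
```

Halving h should reduce the bound by a factor of four, which is the practical reason for working on small subintervals.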
Example 3:
Given the dataset {(-1, 0), (0, 1), (1, 3)}, we build the linear spline as follows
$$
S_{1,2}(x) =
\begin{cases}
\dfrac{f_1 - f_0}{x_1 - x_0}(x - x_1) + f_1 = \dfrac{f_1 - f_0}{x_1 - x_0}(x - x_0) + f_0, & \text{if } x \in [x_0, x_1] \\[2ex]
\dfrac{f_2 - f_1}{x_2 - x_1}(x - x_2) + f_2 = \dfrac{f_2 - f_1}{x_2 - x_1}(x - x_1) + f_1, & \text{if } x \in [x_1, x_2]
\end{cases}
$$
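Substituting the data of Example 3 gives the two pieces x + 1 on [-1, 0] and 2x + 1 on [0, 1]. A minimal sketch that evaluates the same spline with the built-in interp1 (linear by default); the handle name S is illustrative.

```matlab
% Dataset of Example 3
x = [-1 0 1];
f = [ 0 1 3];

% Linear spline through the data points
S = @(t) interp1(x, f, t);

vals = S([-0.5 0.5]);
disp(vals)                 % 0.5 on the first piece, 2 on the second piece
```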
Example 4:
>> clear;
>> x = 1:6;
>> y = [16 18 21 17 15 12];
>> plot(x,y,'o')
>> hold on; grid on;
>> axis([-1 7 10 22])
>> xx = 1:0.01:6;
>> yy = piecelin(x,y,xx);
>> plot(xx,yy,'r')
>> hold off
>> xlabel('x-axis'); ylabel('y-axis');
>> title('Linear Spline Interpolation');
Third-order splines
Instead of line segments, third-order (cubic) splines use a third-degree polynomial on each segment. We again have the dataset
$$\{(x_i, f_i) : i = 0, 1, \ldots, N\}$$
where
$$a = x_0 < x_1 < \cdots < x_N = b, \quad h \equiv \max_i |x_i - x_{i-1}|$$
$S_{3,N}(x)$ is a continuous function with continuous first and second derivatives on the interval $[x_0, x_N]$ (including the endpoints).
$$p_1''(x_0) = 0; \quad p_N''(x_N) = 0$$
Not-a-knot Condition
Example 4:
>> x = 0:10; y = sin(10*x);
>> p = spline(x,y);
>> xx = linspace(0,10,200);
>> yy = ppval(p,xx);
>> plot(x,y,'o',xx,yy)
Example 5:
>> x = 0:10; y = sin(10*x);
>> xx = linspace(0,10,200);
>> coef = polyfit(x,y,10);
Warning: Polynomial is badly conditioned. Remove repeated data points or try
centering and scaling as described in HELP POLYFIT. (Type "warning off
MATLAB:polyfit:RepeatedPointsOrRescale" to suppress this warning.)
>> yy = polyval(coef,xx);
>> plot(x,y,'o',xx,yy,'r')
So when we interpolate with spline of order 3, it is better than polynomial interpolation with
Example 6:
The power consumption of a device over 12 hours is as follows
Hour   1     2     3     4     5     6     7     8     9     10    11    12
Power  54.4  54.6  67.1  78.3  85.3  88.7  96.9  97.6  84.1  80.1  68.8  61.1
Use a third-order spline to interpolate the data points above.
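One possible way to carry out Example 6 is with the built-in spline and ppval, as in the earlier spline example. This is a sketch: the variable names and the half-hour query point 6.5 are illustrative, and spline applies not-a-knot end conditions by default.

```matlab
% Hourly measurements from Example 6
t = 1:12;
P = [54.4 54.6 67.1 78.3 85.3 88.7 96.9 97.6 84.1 80.1 68.8 61.1];

% Piecewise cubic (third-order spline) representation of the data
pp = spline(t, P);

% Evaluate on a fine grid, e.g. for plotting, and at one in-between time
tt = linspace(1, 12, 200);
PP = ppval(pp, tt);
P_mid = ppval(pp, 6.5);

fprintf('estimated consumption at t = 6.5: %.2f\n', P_mid);
```

Calling plot(t, P, 'o', tt, PP) then shows the data points together with the spline.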
Regression
Problem
Given the dataset $\{(x_i, f_i),\ i = 1, \ldots, N\}$, we need to approximate the function f(x) by a function p(x). We need a way to measure the total error over the entire interval.
Maximum error: $E_\infty(p) = \max_{1 \le k \le N} |p(x_k) - f_k|$
Average error: $E_1(p) = \frac{1}{N} \sum_{k=1}^{N} |p(x_k) - f_k|$
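Both error measures are simple to compute once the residuals are known. The sketch below uses a small illustrative dataset and a hypothetical candidate line p(x) = 3 − 0.2x; both are assumptions chosen only to show the computation.

```matlab
% Illustrative data and a hypothetical candidate approximation
x = [1 3 4 5];
f = [2 4 3 1];
p = @(t) 3 - 0.2*t;

r = abs(p(x) - f);         % absolute residuals at the data points

E_inf = max(r);            % maximum error
E_1   = mean(r);           % average error

fprintf('E_inf = %.4f, E_1 = %.4f\n', E_inf, E_1);   % 1.6000 and 1.0500
```

A single large residual dominates the maximum error, while the average error spreads the weight over all points.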
Linear regression
Linear regression finds the line that best fits the data points in the least squares sense.
Given a set of N data points $\{(x_i, f_i) : i = 1, \ldots, N\}$, find the slope m and the intercept b of the line
$$y(x) = mx + b$$
such that the line fits the data according to the least squares criterion
$$L(m, b) = \sum_{i=1}^{N} (f_i - (m x_i + b))^2$$
That is, we need to find m and b that minimize the function
$$L(m, b) = \sum_{i=1}^{N} (f_i - (m x_i + b))^2$$
To find this minimum, we solve the following system of equations for the stationary point
$$\frac{\partial L}{\partial m} = \sum_{i=1}^{N} 2 (f_i - (m x_i + b))(-x_i) = 0$$
$$\frac{\partial L}{\partial b} = \sum_{i=1}^{N} 2 (f_i - (m x_i + b))(-1) = 0$$
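Dividing each stationarity condition by −2 and collecting the terms in m and b gives two linear equations in the two unknowns (the normal equations). The intermediate step is:

```latex
\begin{aligned}
\sum_{i=1}^{N} (f_i - m x_i - b)\, x_i = 0
  &\;\Longrightarrow\; m \sum_{i=1}^{N} x_i^2 + b \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} x_i f_i \\
\sum_{i=1}^{N} (f_i - m x_i - b) = 0
  &\;\Longrightarrow\; m \sum_{i=1}^{N} x_i + b N = \sum_{i=1}^{N} f_i
\end{aligned}
```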
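or, in matrix form,
$$
\begin{pmatrix}
\sum_{i=1}^{N} x_i^2 & \sum_{i=1}^{N} x_i \\
\sum_{i=1}^{N} x_i & N
\end{pmatrix}
\begin{pmatrix} m \\ b \end{pmatrix}
=
\begin{pmatrix}
\sum_{i=1}^{N} x_i f_i \\
\sum_{i=1}^{N} f_i
\end{pmatrix}
$$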
Example 6:
Construct the fitting line for the following dataset
i    1  2  3  4
x_i  1  3  4  5
f_i  2  4  3  1
The normal equations are
$$51m + 13b = 31$$
$$13m + 4b = 10$$
Solving gives m = −6/35 and b = 107/35, so the line is y = (−6/35)x + 107/35.
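The same result can be reproduced in MATLAB by assembling and solving the 2-by-2 normal equations. This is a minimal sketch; the names A, r, and mb are illustrative, and polyfit is used only as a cross-check.

```matlab
% Dataset of the example
x = [1 3 4 5];
f = [2 4 3 1];
N = numel(x);

% Normal equations A * [m; b] = r
A = [sum(x.^2) sum(x); sum(x) N];
r = [sum(x.*f); sum(f)];
mb = A \ r;
m = mb(1);                 % slope, -6/35
b = mb(2);                 % intercept, 107/35

% Cross-check with the built-in least squares line fit
c = polyfit(x, f, 1);      % c(1) is the slope, c(2) the intercept

fprintf('m = %.6f, b = %.6f\n', m, b);
```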
The special feature of the above matrix is that it is symmetric positive definite, so the system of normal equations can be solved by Gaussian elimination without row exchanges.
Example 7:
Determine the parameters a, b, c of the curve
$$y = ax^2 + bx + c$$
that fits the following data
x  1  2  3   4   5    6    7    8
y  1  8  27  64  125  216  350  560
x = [1 2 3 4 5 6 7 8];
y = [1 8 27 64 125 216 350 560];
% Power sums of x needed for the normal equations
s0 = length(x); s1 = sum(x); s2 = sum(x.^2);
s3 = sum(x.^3); s4 = sum(x.^4);
% Normal equations A*[a; b; c] = b for y = a*x^2 + b*x + c
A = [s4 s3 s2; s3 s2 s1; s2 s1 s0];
b = [sum(x.^2.*y); sum(x.*y); sum(y)];
c0 = A\b;
c = polyfit(x,y,2); % Find a regression polynomial that matches the data
c0 % Display the coefficients computed from the normal equations
c % Display the coefficients found by polyfit
xx = 1:0.1:8;
yy = polyval(c0,xx);
plot(x,y,'o',xx,yy)
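An alternative formulation, shown here as a sketch rather than as the method of the slides, builds the design matrix whose columns are x^2, x, and 1. MATLAB's backslash operator then solves the overdetermined system in the least squares sense without forming the normal equations explicitly. The names A, c_ls, and c_ne are illustrative.

```matlab
x = [1 2 3 4 5 6 7 8].';
y = [1 8 27 64 125 216 350 560].';

% Design matrix: row i is [x_i^2, x_i, 1]
A = [x.^2 x ones(size(x))];

% Least squares solution of A*c = y (coefficients a, b, c of the parabola)
c_ls = A \ y;

% Same coefficients from the normal equations (A'*A)*c = A'*y
c_ne = (A.'*A) \ (A.'*y);

% Cross-check against polyfit (returns a row vector, highest power first)
c_pf = polyfit(x, y, 2);

disp([c_ls c_ne c_pf(:)])  % three columns that should agree up to rounding
```

The matrix A.'*A here equals the matrix of power sums in the example above, so both routes lead to the same coefficients.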
Summary
Curve fitting problem
Definition of interpolation
▶ Lagrange Interpolation
▶ Spline Interpolation
Definition of regression
▶ Linear Regression
▶ Higher order polynomial regression