
An Introduction to Least Squares Techniques

Shoudong Huang
March 21, 2017

Abstract
These notes describe least squares optimisation techniques using simple examples. They start from one dimensional linear least squares (LLS), followed by multi-dimensional LLS, and then nonlinear least squares (NLS). The relation between LLS and the Kalman filter (KF) is also discussed.

Contents
1 One dimensional linear least squares
  1.1 Least squares and weighted least squares
  1.2 1D Gaussian distribution and its information
  1.3 Weighted least squares with Gaussian measurement noises
  1.4 The uncertainty of the weighted least squares estimate

2 Two dimensional linear least squares
  2.1 An example of a two dimensional least squares problem
  2.2 Two dimensional weighted least squares problem
  2.3 The uncertainty of the weighted least squares estimate

3 General linear least squares formula
  3.1 Linear least squares problem
  3.2 Weighted linear least squares problem
  3.3 Relation to Kalman filter

4 Nonlinear least squares
  4.1 Nonlinear least squares and Gauss-Newton iteration
  4.2 Weighted nonlinear least squares

5 Applications

1 One dimensional linear least squares


Least squares is a method to estimate a constant parameter (vector) from (redundant) measurements. We first consider an example in the one dimensional (1D) case.

1.1 Least squares and weighted least squares
Suppose x is the scalar parameter to be estimated (e.g. the height of a building).
If we have two measurements of x, say z1 = 100m and z2 = 110m, then we can use the least squares method to obtain an estimate of x.
Intuitively, we want z1 − x to be close to 0, and we also want z2 − x to be
close to 0. Thus we formulate the problem as:
Least Squares Problem: Find x to minimise

f(x) = (z1 − x)² + (z2 − x)².    (1)

To solve this problem, we compute the stationary point (setting the derivative equal to 0)

f'(x) = −2(z1 − x) − 2(z2 − x) = 0

and the unique solution is

x∗ = (1/2) z1 + (1/2) z2.    (2)
This solution is simply the average of z1 = 100m and z2 = 110m, that is, x∗ = 105m. This answer matches our intuition of averaging the two measurements.
In the case when one measurement is more accurate than the other, we need
to use different weights w1 > 0 and w2 > 0 for the two terms in the least squares
formulation.
Weighted Least Squares Problem: Find x to minimise

f(x) = w1 (z1 − x)² + w2 (z2 − x)².    (3)

To solve this problem, we compute the stationary point by solving

f'(x) = −2w1 (z1 − x) − 2w2 (z2 − x) = 0

and the unique solution is

x∗ = [w1/(w1 + w2)] z1 + [w2/(w1 + w2)] z2.    (4)
This solution is the weighted average of the two measurements. When w1 = w2, the solution is the same as that in (2). When w1 ≠ w2, the solution will be different from (2). For example, if w1 = 4, w2 = 1, then

x∗ = (4/5) z1 + (1/5) z2 = (4/5) × 100 + (1/5) × 110 = 102.    (5)

This solution is again between z1 and z2 but is closer to z1 because the weight of z1, namely w1, is larger.
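As a quick numerical check of (4) and (5), the following short Python snippet (an illustrative sketch, not part of the original notes) computes the weighted least squares estimate for the two measurements above.

    def weighted_average(z1, z2, w1, w2):
        # Weighted least squares estimate of a scalar from two measurements, see (4).
        return (w1 * z1 + w2 * z2) / (w1 + w2)

    print(weighted_average(100.0, 110.0, 1.0, 1.0))  # equal weights: 105.0, as in (2)
    print(weighted_average(100.0, 110.0, 4.0, 1.0))  # w1 = 4, w2 = 1: 102.0, as in (5)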
Clearly the weights w1 and w2 are very important in the least squares optimisation. How should the weights be chosen? Intuitively, the weights depend on the amount of information contained in each of the measurements. For Gaussian distributions, there is a clear definition of the information.

Figure 1: One dimensional Gaussian distributions. (a) N(100, 25): smaller variance σ² = 25 (information = 1/25). (b) N(100, 400): larger variance σ² = 400 (information = 1/400).

1.2 1D Gaussian distribution and its information


If a random variable x follows a Gaussian distribution, it is denoted as

x ∼ N(m, σ²)    (6)

where m is the mean and σ² is the variance.


The meaning is that x is likely to be around the mean m, and the level of uncertainty depends on the variance σ². The larger the variance, the larger the uncertainty. See Figure 1.
The (Fisher) information of a Gaussian distribution N(m, σ²) is the inverse of the variance,

I = 1/σ².    (7)
The larger the uncertainty, the smaller the information. Also see Figure 1.

1.3 Weighted least squares with Gaussian measurement noises
When the measurement noises are assumed to be zero mean Gaussian, that is

z1 = x + n1 ,  n1 ∼ N(0, σ1²)    (8)

z2 = x + n2 ,  n2 ∼ N(0, σ2²)    (9)

(where n1 and n2 represent the noises in the measurements), the weights in the weighted least squares are chosen as the information of each Gaussian distribution, that is,

w1 = I1 = 1/σ1² ,  w2 = I2 = 1/σ2² .

Weighted Least Squares Problem with Gaussian Measurement Noises:
Find x to minimise
f(x) = (1/σ1²)(z1 − x)² + (1/σ2²)(z2 − x)².    (10)

From (4), the solution in this case is


x∗ = [w1/(w1 + w2)] z1 + [w2/(w1 + w2)] z2
   = [(1/σ1²)/((1/σ1²) + (1/σ2²))] z1 + [(1/σ2²)/((1/σ1²) + (1/σ2²))] z2    (11)
   = [σ2²/(σ1² + σ2²)] z1 + [σ1²/(σ1² + σ2²)] z2 .

When σ1² = 1, σ2² = 4, we have w1 = 1, w2 = 1/4, and we get the same solution as in (5). Note that multiplying both w1 and w2 by a (positive) constant will not change the least squares estimate x∗.
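To illustrate this invariance numerically, here is a small Python check (an illustrative sketch, not from the original notes): scaling both weights by the same positive constant leaves the estimate unchanged.

    def weighted_average(z1, z2, w1, w2):
        return (w1 * z1 + w2 * z2) / (w1 + w2)

    # Information weights for sigma1^2 = 1, sigma2^2 = 4.
    w1, w2 = 1.0, 0.25
    print(weighted_average(100.0, 110.0, w1, w2))          # 102.0
    # Multiplying both weights by 4 gives w1 = 4, w2 = 1 as in (5).
    print(weighted_average(100.0, 110.0, 4 * w1, 4 * w2))  # still 102.0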

1.4 The uncertainty of the weighted least squares estimate


It is clear that the least squares estimate x∗ is just “an intelligent guess” of the
true value of x. What can we say about the accuracy of x∗ ?
Since the measurements z1 and z2 contain noises, if we take the measurements again, we will get different values for z1 and z2, and we will obtain a different value of x∗ (since it is a function of z1 and z2). Imagine that we take the measurements z1 and z2 many times; then we will have many different values of x∗. The distribution of these values tells us something about the uncertainty of the least squares estimate.
When the measurement noises are Gaussian, the distribution of these x∗ values is also Gaussian. Its variance can be used to describe the uncertainty of the least squares estimate.
For the least squares problem (10), the variance of the least squares estimate
can be obtained by

I = I1 + I2 = 1/σ1² + 1/σ2² = (σ1² + σ2²)/(σ1² σ2²).    (12)

σ² = I^{-1} = (σ1² σ2²)/(σ1² + σ2²).    (13)
Equation (12) means the total information is the sum of the information
contained in z1 and the information contained in z2 . Equation (13) means the
variance is the inverse of the information.
Figure 2 illustrates both the least squares estimate and its variance. This is actually the same as the update step of the 1D Kalman filter [1].

Figure 2: One dimensional least squares estimate (the dotted line represents z1 and its uncertainty, the dashed line represents z2 and its uncertainty, and the solid line represents the least squares estimate and its uncertainty). Fusing N(100, 1) (z1) with N(110, 4) (z2) gives N(102, 0.8): I1 = 1, I2 = 1/4, I = I1 + I2 = 5/4; variance = 1/I = 0.8; least squares estimate = (I1/I) × 100 + (I2/I) × 110 = 102.
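The fusion summarised in Figure 2 is easy to reproduce numerically. The following Python sketch (illustrative only, assuming the measurement model of Section 1.3) computes the information-weighted estimate and its variance.

    # 1D fusion of two Gaussian measurements (same as the 1D KF update step).
    z1, var1 = 100.0, 1.0   # z1 ~ N(x, 1)
    z2, var2 = 110.0, 4.0   # z2 ~ N(x, 4)

    I1, I2 = 1.0 / var1, 1.0 / var2   # information of each measurement
    I = I1 + I2                       # total information, equation (12)
    var_est = 1.0 / I                 # variance of the estimate, equation (13)
    x_est = (I1 * z1 + I2 * z2) / I   # weighted least squares estimate, equation (11)

    print(x_est, var_est)             # 102.0 0.8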

2 Two dimensional linear least squares
This section considers the least squares problems when there are two parameters
to be estimated, x1 and x2 .

2.1 An example of a two dimensional least squares problem


One example of a two dimensional linear least squares (LLS) problem is to compute the "intersection" of three straight lines. We all know that three straight lines normally have three intersection points. The problem is to find one single point that is "closest" to all three lines.
Suppose the three lines are given by the following three equations:
b1 = a11 x1 + a12 x2 ;
b2 = a21 x1 + a22 x2 ; (14)
b3 = a31 x1 + a32 x2 ;
Now the problem can be formulated as:
Problem: Find x1 , x2 to minimise
f(x1, x2) = [b1 − (a11 x1 + a12 x2)]² + [b2 − (a21 x1 + a22 x2)]² + [b3 − (a31 x1 + a32 x2)]².    (15)
This problem can be solved by setting the partial derivatives equal to zero (the first order condition):

∂f/∂x1 = −2a11 [b1 − (a11 x1 + a12 x2)] − 2a21 [b2 − (a21 x1 + a22 x2)] − 2a31 [b3 − (a31 x1 + a32 x2)] = 0;
∂f/∂x2 = −2a12 [b1 − (a11 x1 + a12 x2)] − 2a22 [b2 − (a21 x1 + a22 x2)] − 2a32 [b3 − (a31 x1 + a32 x2)] = 0;    (16)

which is

a11 (a11 x1 + a12 x2) + a21 (a21 x1 + a22 x2) + a31 (a31 x1 + a32 x2) = a11 b1 + a21 b2 + a31 b3;
a12 (a11 x1 + a12 x2) + a22 (a21 x1 + a22 x2) + a32 (a31 x1 + a32 x2) = a12 b1 + a22 b2 + a32 b3;    (17)

or

[ a11 a11 + a21 a21 + a31 a31    a11 a12 + a21 a22 + a31 a32 ] [ x1 ]   [ a11 b1 + a21 b2 + a31 b3 ]
[ a12 a11 + a22 a21 + a32 a31    a12 a12 + a22 a22 + a32 a32 ] [ x2 ] = [ a12 b1 + a22 b2 + a32 b3 ] .    (18)

Solving this linear system gives x1∗ and x2∗, the solution to the least squares problem (15).
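As an illustration (with made-up line coefficients, not taken from the notes), the normal equations (18) can be assembled and solved with NumPy as follows.

    import numpy as np

    # Three lines b_i = a_i1 * x1 + a_i2 * x2 (coefficients chosen arbitrarily for illustration).
    A = np.array([[1.0, 1.0],
                  [1.0, -1.0],
                  [2.0, 1.0]])
    b = np.array([3.0, 1.0, 4.5])

    # Normal equations (18): (A^T A) x = A^T b.
    x_star = np.linalg.solve(A.T @ A, A.T @ b)
    print(x_star)                              # point "closest" to all three lines

    # The same answer via NumPy's least squares routine.
    print(np.linalg.lstsq(A, b, rcond=None)[0])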

2.2 Two dimensional weighted least squares problem


When there are three measurements z1, z2, z3 about three linear combinations of x1 and x2, and the measurement noises n1, n2, n3 are zero-mean Gaussian, we have

z1 = a11 x1 + a12 x2 + n1 ,  n1 ∼ N(0, σ1²)
z2 = a21 x1 + a22 x2 + n2 ,  n2 ∼ N(0, σ2²)    (19)
z3 = a31 x1 + a32 x2 + n3 ,  n3 ∼ N(0, σ3²)

In this case, we can formulate the weighted LLS problem as:
Weighted LLS Problem: Find x1 , x2 to minimise
f(x1, x2) = (1/σ1²)[z1 − (a11 x1 + a12 x2)]² + (1/σ2²)[z2 − (a21 x1 + a22 x2)]² + (1/σ3²)[z3 − (a31 x1 + a32 x2)]².    (20)

This problem can also be solved by setting the partial derivatives equal to zero.

2.3 The uncertainty of the weighted least squares estimate

For the uncertainty of the weighted least squares estimate, note that each measurement contains some information about the parameters x1, x2; this information can be computed as (see footnote 1)

I1 = (1/σ1²) [ a11 ] [ a11  a12 ] = (1/σ1²) [ a11²      a11 a12 ]
             [ a12 ]                        [ a12 a11   a12²    ]

I2 = (1/σ2²) [ a21 ] [ a21  a22 ] = (1/σ2²) [ a21²      a21 a22 ]    (21)
             [ a22 ]                        [ a22 a21   a22²    ]

I3 = (1/σ3²) [ a31 ] [ a31  a32 ] = (1/σ3²) [ a31²      a31 a32 ]
             [ a32 ]                        [ a32 a31   a32²    ]

Now the covariance matrix of the weighted least squares estimate can be obtained by

I = I1 + I2 + I3 = [ a11  a21  a31 ] [ 1/σ1²    0       0    ] [ a11  a12 ]
                   [ a12  a22  a32 ] [   0    1/σ2²     0    ] [ a21  a22 ] ,    (22)
                                     [   0      0     1/σ3²  ] [ a31  a32 ]

P_LS = I^{-1}.    (23)
Again, the first equation means the total information is the sum of the information contained in each of the measurements, and the second equation means the covariance matrix is the inverse of the information matrix.
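The following Python sketch (with arbitrary illustrative numbers) checks that summing the per-measurement information matrices in (21) gives the same total information as the compact product in (22), and that the covariance is its inverse.

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, -1.0],
                  [2.0, 1.0]])            # rows [a_i1, a_i2], arbitrary for illustration
    sigmas2 = np.array([1.0, 4.0, 0.25])  # measurement variances sigma_i^2

    # Sum of the rank-one information matrices, equation (21).
    I_sum = sum((1.0 / s2) * np.outer(a, a) for a, s2 in zip(A, sigmas2))

    # Compact form A^T P_Z^{-1} A, equation (22).
    P_Z = np.diag(sigmas2)
    I_mat = A.T @ np.linalg.inv(P_Z) @ A

    print(np.allclose(I_sum, I_mat))      # True
    print(np.linalg.inv(I_mat))           # covariance P_LS, equation (23)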
Note that (19) can be written as

[ z1 ]   [ a11  a12 ]            [ n1 ]
[ z2 ] = [ a21  a22 ] [ x1 ]  +  [ n2 ]    (24)
[ z3 ]   [ a31  a32 ] [ x2 ]     [ n3 ]

Footnote 1: For a measurement given by z = AX + n_z, n_z ∼ N(0, σ_z²), the information about the vector X contained in z is given by (1/σ_z²) A^T A. For a measurement vector given by Z = AX + N_Z, N_Z ∼ N(0, P_Z), the information about X contained in Z is given by A^T P_Z^{-1} A. See equation (3.4.2-6) and the sentence below it in [2].

Also, (18) can be written as

[ a11  a21  a31 ] [ a11  a12 ]          [ a11  a21  a31 ] [ b1 ]
[ a12  a22  a32 ] [ a21  a22 ] [ x1 ] = [ a12  a22  a32 ] [ b2 ] .    (25)
                  [ a31  a32 ] [ x2 ]                     [ b3 ]

When writing everything in the matrix/vector format, we can get the general
linear least squares formulation and solution in a more concise manner. The next
section will provide the details.

3 General linear least squares formula


3.1 Linear least squares problem
A general n-dimensional least squares problem is given by:
Linear Least Squares Problem: Find X to minimise

f(X) = ‖b − AX‖² = (b − AX)^T (b − AX)

where

    [ b1 ]        [ a11  a12  ...  a1n ]        [ x1 ]
b = [ b2 ]  , A = [ a21  a22  ...  a2n ]  , X = [ x2 ]  ,  m ≥ n.    (26)
    [ .. ]        [ ...  ...  ...  ... ]        [ .. ]
    [ bm ]        [ am1  am2  ...  amn ]        [ xn ]

When A is full rank (A^T A is invertible), the solution is

X∗ = (A^T A)^{-1} A^T b.    (27)

Why? Because

f(X) = (b − AX)^T (b − AX) = b^T b − 2 b^T AX + X^T A^T A X.

The minimum can be found by setting

df/dX = 2 A^T A X − 2 A^T b = 0,

which leads to the unique solution

X = (A^T A)^{-1} A^T b.
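In code, the closed-form solution (27) can be written directly, although a routine such as np.linalg.lstsq (which avoids forming A^T A explicitly) is usually preferred in practice for numerical reasons. A minimal sketch with randomly generated data, not tied to any example in the notes:

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 20, 3
    A = rng.normal(size=(m, n))              # random full-rank A with m >= n
    b = rng.normal(size=m)

    # Normal-equation solution (27): X* = (A^T A)^{-1} A^T b.
    X_normal = np.linalg.solve(A.T @ A, A.T @ b)

    # Library least squares solution for comparison.
    X_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

    print(np.allclose(X_normal, X_lstsq))    # True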

3.2 Weighted linear least squares problem


If an m-dimensional (m ≥ n) measurement vector Z is given by

Z = AX + N_Z ,  N_Z ∼ N(0, P_Z),    (28)

where A and X are defined in (26), then we can formulate the following weighted
LLS problem.
Weighted LLS Problem: Find X to minimise

(Z − AX)^T P_Z^{-1} (Z − AX).    (29)

When A is full rank, the unique solution is

X∗ = (A^T P_Z^{-1} A)^{-1} A^T P_Z^{-1} Z.    (30)

The covariance matrix of the estimate error can be computed by

P_LS = (A^T P_Z^{-1} A)^{-1}.    (31)
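A direct NumPy transcription of (30) and (31), again with made-up values for A, P_Z and Z, could look as follows.

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])                   # illustrative 3x2 model matrix
    P_Z = np.diag([0.5, 1.0, 2.0])               # measurement noise covariance
    Z = np.array([1.1, 2.0, 2.9])                # illustrative measurement vector

    W = np.linalg.inv(P_Z)                       # weight matrix P_Z^{-1}
    Info = A.T @ W @ A                           # information matrix A^T P_Z^{-1} A
    X_star = np.linalg.solve(Info, A.T @ W @ Z)  # weighted LLS estimate, equation (30)
    P_LS = np.linalg.inv(Info)                   # covariance of the estimate, equation (31)

    print(X_star)
    print(P_LS)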

3.3 Relation to Kalman filter


The Kalman filter (KF) can be used for state estimation of a linear dynamic system. In a KF, there are two steps involved: a prediction step and an update step (see [1] [3] for more details). In fact, the KF formulas for both the prediction step and the update step can be derived using the weighted least squares formula.
We have shown in Figure 2 and Section 1.4 that the formula in the update step of the 1D KF is the same as the LLS formula. Now we show that for the 1D KF, the prediction step formula can also be obtained through LLS.
At time k, the estimate of xk follows a Gaussian distribution xk ∼ N(x̂k, σk²).
Assume the dynamic equation is given by

xk+1 = xk + uk + wk (32)

where uk ∈ R is the control input at time k and wk ∈ R is the process noise, which is assumed to be zero mean Gaussian with variance σu².
The prediction step aims to obtain an estimate of xk+1 from the above information. Now we do this using the least squares method.
The Gaussian distribution xk ∼ N(x̂k, σk²) can be regarded as a measurement

x̂k = xk + nk ,  nk ∼ N(0, σk²).    (33)

The dynamic equation can be written as

−uk = xk − xk+1 + wk ,  wk ∼ N(0, σu²).    (34)

This is in the format of (28) where

X = [ xk   ]  ,  Z = [ x̂k  ]  ,  A = [ 1    0 ]  ,  P_Z = [ σk²    0  ]  .    (35)
    [ xk+1 ]         [ −uk ]         [ 1   −1 ]           [  0    σu² ]

Now we can formulate the weighted LLS problem as in (29). Since A is full rank (in fact square and invertible, so that (A^T P_Z^{-1} A)^{-1} = A^{-1} P_Z A^{-T}), the covariance matrix can be computed by (31) as

P_LS = (A^T P_Z^{-1} A)^{-1} = A^{-1} P_Z A^{-T}
     = [ 1    0 ] [ σk²    0  ] [ 1    1 ]
       [ 1   −1 ] [  0    σu² ] [ 0   −1 ]
     = [ σk²       σk²     ]  ,    (36)
       [ σk²   σk² + σu²   ]

and the solution can be obtained through (30) as

X∗ = (A^T P_Z^{-1} A)^{-1} A^T P_Z^{-1} Z
   = [ σk²       σk²     ] [ 1    1 ] [ 1/σk²    0   ] [ x̂k  ]
     [ σk²   σk² + σu²   ] [ 0   −1 ] [   0    1/σu² ] [ −uk ]
   = [ 1    0 ] [ x̂k  ]    (37)
     [ 1   −1 ] [ −uk ]
   = [  x̂k      ]  .
     [ x̂k + uk  ]

Note that in least squares, we estimate both xk and xk+1. If we want to keep only the estimate of xk+1, we can marginalise out xk and get

x̂k+1 = x̂k + uk ,  σ_{k+1}² = σk² + σu² ,    (38)

which is the same as the 1D KF prediction formula.
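This derivation is easy to check numerically. The Python sketch below (illustrative values only) builds Z, A and P_Z as in (35), applies the weighted LLS formulas (30) and (31), and compares the marginal for x_{k+1} with the standard KF prediction.

    import numpy as np

    x_hat_k, var_k = 2.0, 0.5      # current estimate x_k ~ N(2.0, 0.5)
    u_k, var_u = 1.0, 0.1          # control input and process noise variance

    Z = np.array([x_hat_k, -u_k])
    A = np.array([[1.0, 0.0],
                  [1.0, -1.0]])
    P_Z = np.diag([var_k, var_u])

    W = np.linalg.inv(P_Z)
    Info = A.T @ W @ A
    X_star = np.linalg.solve(Info, A.T @ W @ Z)   # estimates of [x_k, x_{k+1}], equation (30)
    P_LS = np.linalg.inv(Info)                    # joint covariance, equation (31)

    # Marginal for x_{k+1}: bottom entries of the joint solution.
    print(X_star[1], P_LS[1, 1])                  # 3.0 and 0.6
    print(x_hat_k + u_k, var_k + var_u)           # KF prediction: 3.0 and 0.6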


The proof of the equivalence between the KF and LLS for the general multi-dimensional case is similar to the above, and the details are left to the reader as an exercise.

4 Nonlinear least squares


When the measurement functions are nonlinear, we have a nonlinear least squares (NLS) problem. Although the KF formula can be extended to nonlinear dynamic and observation models, known as the extended Kalman filter (EKF) [4], the solution to NLS is no longer equivalent to the EKF solution (although the EKF formula is very similar to one iteration of the Gauss-Newton algorithm described below).

4.1 Nonlinear least squares and Gauss-Newton iteration


When the functions involved are nonlinear, we can formulate a nonlinear least squares problem.
Nonlinear Least Squares Problem: Find X to minimise

‖Z − F(X)‖² = [Z − F(X)]^T [Z − F(X)]



where

    [ z1 ]           [ f1(X) ]   [ f1(x1, x2, ..., xn) ]
Z = [ z2 ]  , F(X) = [ f2(X) ] = [ f2(x1, x2, ..., xn) ]  ,  m ≥ n.    (39)
    [ .. ]           [  ..   ]   [         ..          ]
    [ zm ]           [ fm(X) ]   [ fm(x1, x2, ..., xn) ]
Suppose X is close to X0. By Taylor expansion (linearisation),

F(X) ≈ F(X0) + J_F(X0)(X − X0)


where J_F(X0) is the Jacobian matrix

         [ ∂f1/∂x1   ∂f1/∂x2   ...   ∂f1/∂xn ]
J_F(X) = [ ∂f2/∂x1   ∂f2/∂x2   ...   ∂f2/∂xn ]
         [    ...       ...    ...      ...  ]
         [ ∂fm/∂x1   ∂fm/∂x2   ...   ∂fm/∂xn ]

evaluated at X0.
Thus

Z − F(X) ≈ Z − F(X0) + J_F(X0) X0 − J_F(X0) X.

Solution: Let

A = J_F(X0) ,  b = Z − F(X0) + J_F(X0) X0 .

Using the linear least squares solution X = (A^T A)^{-1} A^T b, we get

X1 = [J_F^T(X0) J_F(X0)]^{-1} J_F^T(X0) [Z − F(X0) + J_F(X0) X0].

In general,

X_{k+1} = [J_F^T(Xk) J_F(Xk)]^{-1} J_F^T(Xk) [Z − F(Xk) + J_F(Xk) Xk].

Iterating this until Xk converges, we obtain the solution. This is called Gauss-Newton iteration.
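As a concrete (made-up) example of Gauss-Newton iteration, the sketch below estimates a 2D point from noisy range measurements to three known beacons; here F(X) stacks the distances from X to the beacons, and the beacon positions, noise values and initial guess are all chosen purely for illustration.

    import numpy as np

    beacons = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])   # known beacon positions
    x_true = np.array([1.0, 1.0])
    Z = np.linalg.norm(beacons - x_true, axis=1) + np.array([0.05, -0.03, 0.02])  # noisy ranges

    def F(X):
        return np.linalg.norm(beacons - X, axis=1)             # predicted ranges

    def J_F(X):
        d = F(X)
        return (X - beacons) / d[:, None]                      # Jacobian of the range function

    X = np.array([3.0, 2.0])                                   # initial guess X0
    for _ in range(10):                                        # Gauss-Newton iterations
        A, r = J_F(X), Z - F(X)
        # Increment form of the update above: X + (A^T A)^{-1} A^T (Z - F(X)).
        X = X + np.linalg.solve(A.T @ A, A.T @ r)

    print(X)   # close to x_true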

4.2 Weighted nonlinear least squares


If an m-dimensional (m ≥ n) measurement vector Z is given by

Z = F(X) + N_Z ,  N_Z ∼ N(0, P_Z),    (40)

where F(X) and X are defined in (39), then we can formulate the following weighted least squares problem.
Weighted Nonlinear Least Squares Problem: Find X to minimise

[Z − F(X)]^T P_Z^{-1} [Z − F(X)].

Solution: first obtain an initial value X0, then iterate:

X_{k+1} = [J_F^T(Xk) P_Z^{-1} J_F(Xk)]^{-1} J_F^T(Xk) P_Z^{-1} [Z − F(Xk) + J_F(Xk) Xk].

This is the Gauss-Newton iteration for weighted NLS.
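Weighting changes only the normal equations. As a small illustrative sketch (the function below is hypothetical, not from the notes; F, J_F and P_Z are passed in by the caller), one weighted Gauss-Newton update can be written as:

    import numpy as np

    def weighted_gauss_newton_step(X, Z, F, J_F, P_Z):
        # One update for minimising (Z - F(X))^T P_Z^{-1} (Z - F(X)).
        A = J_F(X)
        W = np.linalg.inv(P_Z)        # weight matrix P_Z^{-1}
        r = Z - F(X)                  # residual at the current estimate
        return X + np.linalg.solve(A.T @ W @ A, A.T @ W @ r)

    # Example use with the beacon model above and a diagonal range covariance:
    # X = weighted_gauss_newton_step(X, Z, F, J_F, np.diag([0.01, 0.01, 0.04]))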

11
5 Applications
LLS and NLS have many applications. Simultaneous Localisation and Mapping (SLAM) is one of them. For the 1D SLAM problem, LLS can be applied and the solution is the same as that obtained by the KF [5]. For 2D or 3D SLAM, NLS can be used [6] [7] [8], and the solution is better than that obtained using the EKF [9] [10].

References
[1] S. Huang, Understanding Extended Kalman Filter – Part I: One Dimensional Kalman Filter. Faculty of Engineering and IT, University of Technology Sydney, 2010.
[2] Y. Bar-Shalom, X. R. Li, and T. Kirubarajan, Estimation with Applications to Tracking and Navigation: Theory Algorithms and Software. John Wiley & Sons, 2001. (Electronic Version)
[3] S. Huang, Understanding Extended Kalman Filter – Part II: Multi-dimensional Kalman Filter. Faculty of Engineering and IT, University of Technology Sydney, 2010.
[4] S. Huang, Understanding Extended Kalman Filter – Part III: Extended Kalman Filter. Faculty of Engineering and IT, University of Technology Sydney, 2010.
[5] S. Huang, Notes on 1D SLAM. Faculty of Engineering and IT, University of Technology Sydney, 2016.
[6] F. Dellaert and M. Kaess, Square root SAM: Simultaneous localization and mapping via square root information smoothing. International Journal of Robotics Research, 25(12):1181-1203, 2006.
[7] R. Kümmerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, g2o: A general framework for graph optimization. IEEE International Conference on Robotics and Automation, pp. 3607-3613, 2011.
[8] H. Wang, S. Huang, U. Frese, and G. Dissanayake, The nonlinearity structure of point feature SLAM problems with spherical covariance matrices. Automatica, 49(10):3112-3119, 2013.
[9] G. Dissanayake, P. Newman, S. Clark, H. Durrant-Whyte, and M. Csorba, A solution to the simultaneous localization and map building (SLAM) problem. IEEE Transactions on Robotics and Automation, 17(3):229-241, 2001.
[10] S. Huang and G. Dissanayake, Convergence and consistency analysis for Extended Kalman Filter based SLAM. IEEE Transactions on Robotics, 23(5):1036-1049, 2007.
