Point Cloud Densification

Mona Forsman
Umeå University, Department of Physics, SE-901 87 Umeå, Sweden
Abstract
Several automatic methods exist for creating 3D point clouds from 2D photos. In many
cases, the result is a sparse point cloud, unevenly distributed over the scene.
After determining the coordinates of the same point in two images of an object, the 3D position
of that point can be calculated using knowledge of camera data and relative orientation.
A model created from an unevenly distributed point cloud may lose detail and precision in the
sparse areas. The aim of this thesis is to study methods for densification of point clouds.
This thesis contains a literature study of different methods for extracting matched point pairs,
and an implementation of Least Squares Template Matching (LSTM) with a set of improvement
techniques. The implementation is evaluated on a set of scenes of varying difficulty.
LSTM is implemented by working on a dense grid of points in an image and Wallis filtering is
used to enhance contrast. The matched point correspondences are evaluated with parameters from
the optimization in order to keep good matches and discard bad ones. The purpose is to find details
close to a plane in the images, or on plane-like surfaces.
A set of extensions to LSTM is implemented with the aim of improving the quality of the matched
points. The seed points are improved by Transformed Normalized Cross Correlation (TNCC), and
Multiple Seed Points (MSP) are used for the same template and tested to see whether they converge
to the same result. Wallis filtering is used to increase the contrast in the images. The quality of the
extracted points is evaluated with respect to correlation with other optimization parameters and by
comparison of the standard deviations in the x- and y-directions. If a point is rejected, the option
exists to try again with a larger template size, called Adaptive Template Size (ATS).
Contents
1 Introduction
    1.1 Background
    1.2 Aims
    1.3 Related Work
    1.4 Organization of Thesis

2 Theory
    2.1 The 3D modeling process
    2.2 Projective geometry
        2.2.1 Homogenous coordinates
        2.2.2 Transformations of P2
    2.3 The pinhole camera model
    2.4 Stereo view geometry
        2.4.1 Epipolar geometry
        2.4.2 The Fundamental Matrix, F
        2.4.3 Triangulation
        2.4.4 Image rectification
    2.5 Estimation
        2.5.1 Statistics
        2.5.2 Optimization
        2.5.3 Rank N-1 approximation
    2.6 Least Squares Template Matching

3 Overview of methods for densification
    3.1 Introduction
    3.3 Matching
        3.3.1 SIFT, Scale-Invariant Feature Transform
        3.3.2 Maximally Stable Extremal Regions
        3.3.3 Distinctive Similarity Measure
        3.3.4 Multi-View Stereo reconstruction algorithms
    3.4 Quality of matches

4 Implementation
    4.1 Method
    4.2 Implementation details
        4.2.1 Algorithm overview
        4.2.2 Adaptive Template Size (ATS)
        4.2.3 Wallis filtering
        4.2.4 Transformed Normalized Cross Correlation (TNCC)
        4.2.5 Multiple Seed Points (MSP)
        4.2.6 Acceptance criteria
        4.2.7 Error codes
    4.3 Choice of template size
        4.3.1 Calculation of z-coordinate from perturbed input data

5 Experiments
    5.1 Image sets
        5.1.1 Image pair A, the loading dock
        5.1.2 Image pair B, “Sliperiet”
        5.1.3 Image pair C, “Elgiganten”
    5.2 Experiments
        5.2.1 Experiment 1, Asphalt
        5.2.2 Experiment 2, Brick walls
        5.2.3 Experiment 3, Door
        5.2.4 Experiment 4, Lawn
        5.2.5 Experiment 5, Corrugated plate

6 Results
    6.1 Experiments
        6.1.1 Asphalt
        6.1.2 Brick walls
        6.1.3 Door
        6.1.4 Lawn
        6.1.5 Corrugated plate

7 Discussion
    7.1 Evaluation of aims
    7.2 Additional analysis
        7.2.1 Point cloud density
        7.2.2 Runtime
        7.2.3 Error codes
        7.2.4 Homographies

8 Conclusions

9 Future work

10 Acknowledgements

References

A Homographies
    A.1 Loading dock
    A.2 Building Sliperiet
    A.3 Building Elgiganten

B Abbreviations
Chapter 1

Introduction
1.1 Background
Several automatic methods exist for creating 3D point clouds extracted from sets of images. In many
cases, they create sparse point clouds which are unevenly distributed over the objects. The task of
this thesis is to evaluate, compare and develop routines and theory for densification of 3D point
clouds obtained from images.
Point clouds are used in 3D modeling for generating accurate models of real-world items or
scenes. If the point cloud is sparse, the detail of the model will suffer, as will the precision of
approximated geometric primitives; therefore densification methods are of interest to study.
1.2 Aims
The aims of this thesis are to evaluate some methods for generation of point clouds and to find
possible refinements that should result in more detailed 3D reconstructions of, for example, buildings
and ground. Important aspects are speed, robustness, and quality of the output.
– A sparse 3D point cloud of the ground has been automatically reconstructed. The ground
topography is represented as a 2.5D mesh. The goal is to extract more points to obtain a
topography of higher resolution.
Chapter 2

Theory
Photogrammetry deals with finding the geometric properties of objects, starting with a set of images
of the object. As mentioned in McGlone et al. [2004], the subject of photogrammetry was born
in the 1850s, when the ability to take aerial photographs from hot-air balloons inspired techniques
for making measurements in aerial photographs, with the aim of mapping forests and terrain.
Today the technique is used in applications such as computer and robot vision (see for
example Hartley and Zisserman [2003]), in creating models of objects and landscapes, and in creating
models of buildings for simulators and virtual reality.
7. Point cloud densification is used to find more details and retrieve more points for better
estimation of planes and geometry.
2.2 Projective geometry

2.2.1 Homogenous coordinates

A line in the plane,

ax + by + c = 0,

can be represented as

l = [a, b, c]^T,

which means the line consists of all points x = [x, y]^T that satisfy the equation ax + by + c = 0. In
homogenous coordinates the point is written [x, y, 1]^T. Two lines l and l′ intersect in the point x
given by the cross product of the lines,

x = l × l′.
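As a small illustration, a minimal MATLAB sketch of the cross-product construction (illustrative values, not part of the thesis implementation):

    % Two lines on the form ax + by + c = 0, as homogenous vectors.
    l1 = [1; -1; 0];          % the line y = x
    l2 = [0; 1; -2];          % the line y = 2
    x  = cross(l1, l2);       % homogenous intersection point
    x  = x / x(3)             % normalize: [2; 2; 1], i.e. the point (2, 2)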
Similarly, in P3 a point is represented by

p = [x, y, z, 1]^T

and a plane by

l = [a, b, c, d].
2.2.2 Transformations of P2
Transformations of the projective plane P2 are classified into four classes: isometries, similarity
transformations, affine transformations, and projective transformations. A transformation is performed
by multiplying a transformation matrix H with the points x to transform,

x′ = Hx.
Isometries
Isometries are the simplest kind of transformations. They consist of a translation and a rotation of
the plane, which means that distances and angles are preserved. The transformation is represented
by
x' = \begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix} x.
Figure 2.1: Similarity, affine and projective transform of the same pattern.
Similarity transformations
Combining the rotation of an isometry with a scaling factor s gives a similarity transform
x' = \begin{bmatrix} sR & t \\ 0^T & 1 \end{bmatrix} x.
A similarity transform preserves angles between lines, the shape of an object and the ratios between
distances and areas. This transform has four degrees of freedom.
Affine transformations
An affine transformation combines the similarity transform with a deformation of the plane; in
block matrix form,

x' = \begin{bmatrix} A & t \\ 0^T & 1 \end{bmatrix} x,
where A is a composition of a rotation matrix and a deformation matrix, which is diagonal and
contains scaling factors for x and y:

D = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}.
Projective transformations

Projective transformations give perspective views where objects far away are smaller than close ones.
The transformation is represented by

x' = \begin{bmatrix} A & t \\ v^T & v \end{bmatrix} x.
The vector v^T determines the transformation of the ideal points, where parallel lines intersect. The
projective transform has 8 degrees of freedom; only the ratios between the elements of the matrix are
fixed, i.e. the matrix is defined up to scale. This makes it possible to determine the transform between
two planes from four pairs of points.
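The four-point construction can be illustrated with a minimal direct linear transform (DLT) sketch in MATLAB; each correspondence contributes two linear equations in the nine elements of H, and the SVD gives the null vector of the resulting 8 × 9 system. This is an illustration of the principle, not the thesis code:

    function H = homography4pt(x, xp)
    % Estimate a homography from four point correspondences x -> xp
    % (both 2x4 matrices), using the direct linear transform.
    A = zeros(8, 9);
    for i = 1:4
        X = [x(:,i)', 1];                   % source point as homogenous row
        u = xp(1,i);  v = xp(2,i);
        A(2*i-1, :) = [X, zeros(1,3), -u*X];
        A(2*i,   :) = [zeros(1,3), X, -v*X];
    end
    [~, ~, V] = svd(A);                     % null space of the 8x9 system
    H = reshape(V(:,9), 3, 3)';             % homography, defined up to scale
    H = H / H(3,3);
    end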
Figure 2.2: Schematic view of a pinhole camera. The image plane is shown in front of the camera
centre to simplify the image, in real cameras the image plane, image sensor, is behind the centre of
the camera.
A simple camera model is the pinhole camera. A 3D point X in world coordinates maps onto
the 2D point x on the image plane Z of the camera, where the ray between X and the camera centre
C intersects the plane. The focal distance f is the distance between the image plane and the camera
centre, which is the focal point of the lens. Orthogonal to the image plane, the principal ray passes
through the camera centre along the principal axis, originating in the principal point of the image
plane. The principal plane is the plane parallel with the image plane through the camera centre.
Figure 2.2 shows a schematic view of the pinhole camera model.
The projection x of a 3D point X on the image plane of the camera is given by
x = P X,
where the camera matrix P is the 3 × 4 matrix

P = K R [I | −C].
The camera matrix describes a camera setup composed of internal and external camera pa-
rameters. The internal parameters are the focal length f of the camera, the principal point P, the
resolution m_x, m_y and an optional skew s. The focal length and the principal point are converted to
pixels using the resolution parameters: α_x = f m_x, α_y = f m_y give the focal length in pixels and
x_0 = m_x P_x, y_0 = m_y P_y give the principal point. The internal parameters are stored in the camera
calibration matrix

K = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}. \qquad (2.1)

The external parameters determine the camera position relative to the world. These are the posi-
tion of the camera centre C and the rotation of the camera, given by a rotation matrix R.
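A minimal MATLAB sketch of assembling a camera matrix and projecting a point; all numeric values here are illustrative assumptions:

    % Illustrative internal and external parameters (assumed values).
    f  = 0.05;  mx = 20000;  my = 20000;     % focal length [m], pixels/m
    K  = [f*mx, 0, 1024;  0, f*my, 768;  0, 0, 1];
    R  = eye(3);                             % camera aligned with the world
    C  = [0; 0; -10];                        % camera centre
    P  = K * R * [eye(3), -C];               % 3x4 camera matrix
    X  = [1; 0.5; 5; 1];                     % 3D point in homogenous form
    x  = P * X;
    x  = x(1:2) / x(3)                       % projected pixel coordinates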
Figure 2.3: The epipolar line connects the cameras’ focal points.
The fundamental matrix can be computed as

F = [e′]_× P′ P^+,

where [e′]_× is the skew-symmetric matrix representing the cross product with e′ as a matrix-vector
multiplication, and P^+ is the pseudoinverse of the matrix P.
2.4.3 Triangulation
When the camera matrices are calculated, and the coordinates for a point correspondence are known,
the 3D point can be calculated by solving the equation system
x = PX
x′ = P′ X
for X.
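A common linear way to solve this overdetermined system is DLT triangulation; a minimal MATLAB sketch, not necessarily the thesis implementation:

    function X = triangulate(P1, P2, x1, x2)
    % Linear (DLT) triangulation of one point correspondence.
    % P1, P2: 3x4 camera matrices; x1, x2: 2x1 image points.
    A = [ x1(1)*P1(3,:) - P1(1,:)
          x1(2)*P1(3,:) - P1(2,:)
          x2(1)*P2(3,:) - P2(1,:)
          x2(2)*P2(3,:) - P2(2,:) ];
    [~, ~, V] = svd(A);
    X = V(:,4);                  % homogenous 3D point
    X = X(1:3) / X(4);           % Euclidean coordinates
    end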
Figure 2.4: The grid to the left is curved like a lens-distorted image; the right image shows the rectified
grid.
2.5 Estimation
This section mostly follows the notation of Montgomery et al. [2004].
2.5.1 Statistics
Origins of errors
Where measurements are made, errors usually occur. The quality of measurements is
affected by systematic errors (bias) and unstructured errors (variance).

In photogrammetry, common sources of error are the camera calibration, the quality of the point
extraction, the quality of the model function, and numeric errors in triangulation and optimization.
The Normal distribution

The normal distribution describes the way many random errors affect the results. The most prob-
able value is close to the expected value µ, and few values are far away. The standard deviation σ (and
the variance) describes the dispersion of values. The distribution of a normal random variable is defined
by the probability density function N(µ, σ²),

f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty,

where f(x) is the probability density function, µ is the expected value, and σ, the square root of the
variance, is the standard deviation.
Figure 2.5: Normal distribution with expectation value µ = 5 and variance σ 2 = 10.
The χ2 distribution

A χ2-distributed random variable y with n degrees of freedom has the probability density function

p_y(y, n) = \frac{y^{n/2-1}\, e^{-y/2}}{2^{n/2}\, \Gamma(n/2)}, \qquad n \in \mathbb{N},\ y > 0,

where Γ(·) is the Gamma function. A particular case is the sum of squared independent random
variables z_i ∼ N(0, 1),

y = \sum_{i=1}^{n} z_i^2.
Covariance
Covariance is a measure of how two variables interact with each other. A covariance of zero implies
that the variables are uncorrelated. For two variables x and y with expected values E(x) and E(y)
and mean values µ_x and µ_y, the covariance is

Cov(x, y) = E[(x − µ_x)(y − µ_y)].
Correlation coefficients
The correlation coefficient is the normalized covariance and determines the strength of the linear
relationship between the variables. The correlation coefficient is determined by
\rho_{xy} = \frac{\mathrm{Cov}(x, y)}{\sqrt{\sigma_x^2\, \sigma_y^2}}.
Figure 2.6: Values in the left image are correlated with a correlation coefficient close to 1. Values in
the right image are not correlated, and hence the correlation coefficient is close to 0.
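A small MATLAB illustration of the correlation coefficient (toy data, added for illustration):

    x = randn(1000, 1);
    y = 0.8 * x + 0.2 * randn(1000, 1);    % y strongly correlated with x
    c = cov(x, y);                         % 2x2 covariance matrix
    rho = c(1,2) / sqrt(c(1,1) * c(2,2))   % close to 1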
2.5.2 Optimization
Optimization has a set of different applications in photogrammetry. One is in template matching,
which is the interesting area for this work, and one is in Bundle adjustment where a model and a
point cloud are adjusted to each other. The task of least squares optimization is to minimize the
norm ||r(x)|| of the residual r(x) between a model function f (x) and the observations b. If the
model is linear, the residual becomes
r(x) = Ax − b.
Weighted optimization
If the covariances of the variables are known and not zero, the problem is considered weighted and
the optimization problem can be formulated as
\min_x ||r(x)||_W^2 = \min_x ||Ax - b||_W^2.
The index W indicates that the norm ||.||2W is weighted and defined as
||x||2W = xT Wx.
The weight matrix W is defined by
W = C_{bb}^{-1},
where C_{bb} is the covariance matrix of the observations. Knowing only the structure of the covariances
may be enough; the covariance matrix can then be decomposed as

C_{bb} = \sigma_0^2 Q_{bb},

where \sigma_0^2 is a scaling parameter and Q_{bb} is a structure matrix.
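A minimal MATLAB sketch of the weighted problem via the normal equations (toy data; the thesis does not necessarily solve it this way):

    % Toy design matrix, observations and their covariance.
    A   = [1 1; 1 2; 1 3];
    b   = [1.1; 1.9; 3.2];
    Cbb = diag([0.01, 0.01, 0.04]);      % last observation less precise
    W   = inv(Cbb);                      % weight matrix W = Cbb^(-1)
    x   = (A' * W * A) \ (A' * W * b)    % weighted least squares estimate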
The resulting least squares problem can be solved using the SVD of A,

A = U D V^T.
2.6 Least Squares Template Matching

The aim of the optimization is to reduce the noise function e(x, y). The template function f(x, y)
is transformed by an affine transformation, approximating the projective difference between the im-
ages, to match the search patch function g(x, y). Optionally, the transformation is combined with
radiometric parameters to compensate for lighting differences. The optimization parameters are
combined in a vector x_o based on the homography matrix H. Gauss-Newton optimization is
then applied to the elements of x_o to minimize the resulting noise.
The primary output from LSTM is the position of the template in both images. Optionally, the
implementation also returns the values of the other optimization parameters, the step lengths used
by Gauss-Newton, and statistics.
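A minimal MATLAB sketch of the LSTM idea, reduced to a pure translation model; the thesis implementation optimizes a full affine transform with optional radiometric parameters, and all names and tolerances here are assumptions:

    function [p, niter] = lstm_translation(f, g, p0)
    % Least squares template matching with a translation-only model.
    % f: template; g: search image; p0: initial centre [x; y] in g.
    % Assumes the template stays inside g during the iterations.
    f = double(f); g = double(g);
    [h, w] = size(f);
    [gx, gy] = gradient(g);                 % gradients of the search image
    p = p0(:);
    for niter = 1:50
        % Sample g on the template grid centred at the current position p.
        [X, Y] = meshgrid((1:w) - (w+1)/2 + p(1), (1:h) - (h+1)/2 + p(2));
        gp = interp2(g,  X, Y, 'linear');   % current search patch
        Jx = interp2(gx, X, Y, 'linear');   % Jacobian columns: dg/dx, dg/dy
        Jy = interp2(gy, X, Y, 'linear');
        J  = [Jx(:), Jy(:)];
        r  = gp(:) - f(:);                  % residual patch minus template
        dp = -(J' * J) \ (J' * r);          % Gauss-Newton step
        p  = p + dp;
        if norm(dp) < 1e-3, break; end      % converged
    end
    end

A call such as p = lstm_translation(tmpl, img, [120; 85]) would refine the initial position [120; 85] of the template tmpl in img.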
Chapter 3

Overview of methods for densification
3.1 Introduction
There are many different methods used with the aim of improving the density of point clouds, with
different pros and cons. For image acquisition, video data or still images can be used, sometimes
in combination with laser scanner data of the object. Both feature-based methods and area-based
methods are used in various implementations. Much of the work on densification is done on small
objects with a symmetric camera network all around the object.

Finding general methods for densification is still an open problem.
3.3 Matching
There are two main classes of methods for matching of images: feature-based methods and area-
based methods, in some cases combined with each other to improve results. Some different
approaches are briefly presented here.
Chapter 4

Implementation
The objective of this thesis is template matching for the purpose of point cloud densification. In this
section, an implementation of least squares template matching is described, combined with a set of
different methods for refinement in preprocessing of the images, analysis of resulting parameters,
and successive improvement.

Four corners in one image and a homography between the particular plane to match in the pair
of images are given to the method. The images have known epipolar geometry and camera positions.
The implemented methods shall be evaluated with respect to
The implemented methods shall be evaluated with respect to
– Point cloud quality:
• Completeness — what is the density of the generated point cloud?
• Robustness — how many matches are correct?
• Precision — what is the reconstruction error for the correct matches?
– Method sensitivity to:
• Object geometry — how large deviations from the basic shape can be reconstructed?
• Camera network geometry — how well do the methods work with a long baseline, i.e.
when the images are taken far apart?
4.1 Method
The method of choice is Least Squares Template Matching (LSTM) as described in section 2.6. A dense
grid of seed points is constructed for the area of interest in the left image. The grid is transformed to
the right image using the homography to generate initial guesses for the points. This gives a more
evenly distributed set of seed points than feature-based methods usually generate. The density of
points can be adjusted by changing the grid.

A number of extensions to LSTM have been implemented. These are: Wallis filtering for con-
trast enhancement, Transformed Normalized Cross Correlation (TNCC) to improve the initial pairs
of seed points, Multiple Seed Points (MSP) to ensure stability of the match, and Adaptive Template
Size (ATS) to detect matches where the scale of the gradients in the image is too large for the first
template size used. The matches found are evaluated with respect to the matching parameters to
detect and reject possible false matches.
4.2 Implementation details

4.2.1 Algorithm overview

2. Place a rectangular grid over the area in the left image for point generation, as in figure 4.1.

3. Create initial seed points by transforming the grid points with the homography to the right
image, as in figure 4.2 (a sketch of these two steps follows this list).
(a) Optional: Improve the seed point by Transformed Normalized Cross Correlation, see
section 4.2.4
– If the maximum value of the cross correlation is too low, set an error code.
(b) Cut out a template centered in the grid point of the left image and a search patch cen-
tered in the improved seed point in the right image which is three times larger than the
template, see figure 4.3.
(c) Optional: If Multiple Seed Points is used, put eight extra seed points around the grid
point in the left image.

– Perform Least Squares Template Matching for all nine points.

– Choose the coordinates that most start points converge to as the found point. If fewer than
three converge to the same point, restart from the normalized cross correlation with
a larger template.

– If fewer than four seed points converge to the same point, set an error code.

(d) If Multiple Seed Points is not used, perform least squares template matching on the point.

(e) Calculate the covariance matrix for the found point. If it does not pass the analysis, set an
error code.
6. Calculate 3D points for the accepted point pairs using calibration data for the cameras.
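A minimal MATLAB sketch of steps 2-3 above (illustrative values; not the thesis code):

    x0 = 100; x1 = 400; y0 = 200; y1 = 500;          % area of interest (left)
    [gx, gy] = meshgrid(linspace(x0, x1, 10), linspace(y0, y1, 10));
    pts  = [gx(:)'; gy(:)'; ones(1, numel(gx))];     % homogenous grid points
    H    = [1 0 5; 0 1 -3; 0 0 1];                   % example homography
    seed = H * pts;                                  % into the right image
    seed = seed(1:2, :) ./ [seed(3, :); seed(3, :)]; % normalized seed points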
4.2.3 Wallis filtering

The Wallis filter transforms each pixel of the image according to

J_{i,j} = \alpha\mu + \frac{(1 - \alpha)(I_{i,j} - \mu)}{\sigma},

where J_{i,j} is the new value of the processed pixel, I_{i,j} is the old value, α is a blending
parameter, µ is the mean intensity of the pixels in the filter window and σ is their standard deviation.
If α is close to one, the Wallis filter is an averaging filter, and if α is close to zero it normalizes the
intensity of the pixels.
This process is time consuming, so it is wise to only filter the interesting parts of an image. In
figures 4.4 and 4.5, a grass area is displayed in gray scale together with its Wallis-filtered version.
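A minimal MATLAB sketch of the filter defined above, using a square averaging window; an illustration, not the thesis implementation:

    function J = wallis(I, win, alpha)
    % Wallis filter with a square averaging window of side win.
    I  = double(I);
    k  = fspecial('average', win);
    mu = imfilter(I, k, 'replicate');                      % local mean
    sd = sqrt(max(imfilter(I.^2, k, 'replicate') - mu.^2, 0));
    sd = max(sd, eps);                                     % avoid division by zero
    J  = alpha * mu + (1 - alpha) * (I - mu) ./ sd;
    end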
4.2.4 Transformed Normalized Cross Correlation (TNCC)

– Take a template from the left image centered in the grid point.
– Find the corners of an area three times as large as the template centered in the same point in
the left image.
– Transform the corner points using the homography to the right image.
– Pick the rectangular area including the corner points and transform the sub image with imtransform
to reshape it as the left image.
– Find the best starting point with normxcorr2 for the template in the transformed sub image.
– Transform the coordinates for the best point to absolute coordinates in the right image. This
is the new seed point.
– If the maximum cross correlation value is too low, try with a larger template size (and a larger
search patch).
Image 4.6 shows the normalized cross correlation. This improvement is expensive in terms of exe-
cution time, especially if the template sizes are large, and is not recommended in combination
with Adaptive Template Size.
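A simplified MATLAB sketch of these steps; the variable names, the crop/warp bookkeeping, and the maketform sign conventions are assumptions, not the thesis code:

    % tmpl: template around the grid point in the left image;
    % Iright: right image; H: left-to-right homography;
    % box = [xmin ymin w h]: search area in the right image.
    sub = imcrop(Iright, box);                    % cut out the search area
    T   = maketform('projective', inv(H)');      % warp right -> left shape
    wrp = imtransform(sub, T, 'bicubic');        % reshaped search patch
    c   = normxcorr2(double(tmpl), double(wrp)); % normalized cross correlation
    [cmax, imax] = max(c(:));
    [py, px] = ind2sub(size(c), imax);           % correlation peak
    if cmax < 0.8
        % Correlation too low: set error code 1, or retry with a larger
        % template and search patch.
    end
    % Mapping (px, py) back to absolute right-image coordinates through the
    % warp and the crop offset is omitted in this sketch.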
4.2.7 Error codes

1. The maximum value from the normalized cross correlation was lower than 0.8.

2. Fewer than 4 of the multiple seed points converged to the same result.

3. The maximum adaptive template size was reached without any accepted point found.

An accepted point has the code 0. Codes 1-3 appear only when their respective method is applied.
Code 3, for maximum adaptive template size, overrides the other codes.
Figure 4.2: Seed points generated by transforming a 10 × 10 grid of points with the homography H.
Figure 4.3: The left image is an example of a 15 × 15 pixel template; the right image is the corresponding
search patch.
Figure 4.6: Normalized cross correlation of a template and the corresponding search patch in area
A.5. The left image is the template of 21 × 21 pixels, the middle image is the search patch, and the
right image is the resulting cross correlation, where light pixels imply high correlation.
Figure 4.7: Example of a search patch for multiple seed points in area A.5. Green circles are seed
points, red stars are the found points.
4.3 Choice of template size

4.3.1 Calculation of z-coordinate from perturbed input data
As an example, consider a 3D point in the plane that projects to the 2D point

x = [2203.3, 1101.6]^T

using the camera equation (eq. 2.1). A shift of the z-coordinate of 0.2 meters gives the 2D point

x′ = [2204.5, 1100.5]^T,

a displacement of ||x′ − x|| ≈ 1.6 pixels.

This tells us that a 0.2 meter detail is projected less than two pixels from its corresponding point
in the plane. Least Squares Template Matching requires that the true point is within half a template
size from the seed point, which is fulfilled by a template size of seven pixels.
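The displacement can be verified directly from the two projections:

    x  = [2203.3; 1101.6];     % projection of the point in the plane
    xp = [2204.5; 1100.5];     % projection after the 0.2 m shift in z
    d  = norm(xp - x)          % about 1.63 pixels, i.e. less than two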
Chapter 5
Experiments
A number of experiments were designed to investigate the properties of LSTM and the extensions
described in chapter 4. In this chapter the examples are described, followed by their results in chapter
6.
The tests were run under MATLAB 7.9.0 (R2009b) on an Intel Core2 Quad Q9300 at 2.50 GHz
with 4096 MB of memory and on an Intel Core2 Quad Q9400 at 2.66 GHz with 4096 MB of memory.
The second type has about 5% higher performance.
A.3 A painted door with a window and some signs, very dark area of the image.
B.2 Plastered wall with windows, signs and obstacles at the bottom.
B.3 Plastered wall with some fine cracks in the plaster and windows (not used in final version).
B.4 Asphalt at a skew angle with a lot of random structure.
5.2 Experiments
5.2.1 Experiment 1, Asphalt
Two asphalt areas were chosen to study the effects of different lighting in the same kind of area,
the effect of a gradient from a shadow, and how multiple seed points work in an area with a lot
of small, irregular structure. The chosen areas are area A.5 and area B.4. For both areas, TNCC
and Wallis filtering are used, in the second run also combined with MSP. For reference, a run
without extensions is done.
Chapter 6

Results
6.1 Experiments
In this section the results from the experiments are presented in detail.
6.1.1 Asphalt
The asphalt area contains lots of small irregular gradients, as seen in image 6.2. In the Wallis-filtered
image the gradient from the shadow is distinct, as are parts of the others, see figure 6.3, but
some smaller gradients are suppressed. In this experiment, an outlier is defined as a point whose 3D
coordinate is more than 5 cm away from a manually determined plane in the point cloud. As seen in
image 6.1, the combination of TNCC and Wallis only detects points along the gradient created by
the shadow. The effect of Wallis filtering of asphalt is shown in image 4.6.
Table 6.1: Example 1, area A.5. No improvements vs. TNCC + Wallis vs. TNCC + Wallis + MSP
vs. Wallis vs. Wallis + MSP.

Area + method              A.5 + 0   A.5 + cw   A.5 + cwm   A.5 + w   A.5 + wm
Number of accepted points  5601      79         79          6363      6423
Runtime (s)                184.3     7646       9412        335.7     474.4
Median residuals           0.01      0.009      0.008       0.006     0.006
Std. deviation             0.015     0.01       0.012       0.012     0.012
No. outliers               30        0          0           0         0
Figure 6.1: Red points are points detected in area A.5 with TNCC and Wallis filtering, green points
are seed points.
Figure 6.4: 3D point cloud of matched points using LSTM with MSP in areas A.1, A.2 and A.4.
Table 6.2: Example 2, areas A.1, A.2 and A.4. No improvements versus MSP.

Area + method              A.1 + 0   A.1 + m   A.2 + 0   A.2 + m   A.4 + 0   A.4 + m
Number of accepted points  3674      3683      1517      1484      2926      2902
Runtime (s)                105.2     845.4     138.9     1327      175.3     1428
Figure 6.5: Result of MSP in area A.1 in the left image, template in the right image. Blue circles are
the multiple seed points, red dots are the found points.
6.1.3 Door
On the dark door, improvements with Wallis filtering respectively Wallis with ATS give over 100
times as many hits as improvement with Wallis filtering and TNCC, see table 6.3 and figure 6.6, 6.7,
and 6.8 for a comparison. A seen in the histogram in figure 6.9, larger template sizes only make a
minor increase to the amount of accepted points.
Table 6.3: Example 3, area A.3. TNCC + Wallis vs. Wallis + ATS vs. Wallis.

Area + method              A.3 + cw   A.3 + wa   A.3 + w
Number of accepted points  41         9545       6182
Runtime (s)                7375       374.68     117.4
Figure 6.6: Seed points giving accepted results in part of Area A.3 using Wallis and TNCC.
Figure 6.7: Seed points giving accepted results in part of Area A.3 using Wallis.
Figure 6.8: Seed points giving accepted results in part of Area A.3 using Wallis and ATS.
Figure 6.9: Histogram over used template sizes in Experiment 3, A.3, with Wallis filtering and ATS.
6.1.4 Lawn
On the irregular gradients of a lawn, the extension of Wallis filtering found about 75% more points
than standard LSTM, fulfilling the purpose of densification, as seen in table 6.4. Figure 6.11 com-
pared to 6.10 shows, the Wallis filtering helped for accepting points in areas where the baseline
LSTM failed.
Figure 6.10: Seed points giving accepted results in part of area C.1 using baseline LSTM.
Table 6.4: Results from example 4, area C.1. Baseline LSTM vs. Wallis filtering.

Area + method              C.1 + 0   C.1 + w
Number of accepted points  2283      3844
Runtime (s)                389       459
Figure 6.11: Seed points giving accepted results in part of area C.1 with Wallis filtering.
Figure 6.12: Seed points giving accepted results in part of area C.2 with baseline LSTM.
Table 6.5: Results from example 5, areas C.2 and C.3. No improvements vs. Wallis vs. Wallis + ATS.

Area + method              C.2 + 0   C.2 + w   C.2 + wa   C.3 + 0   C.3 + w   C.3 + wa
Number of accepted points  172       148       1116       1139      2771      6037
Runtime (s)                278.5     150.6     1392       408.9     129.9     642.4
Figure 6.13: Seed points giving accepted results in part of area C.2 with Wallis and ATS.
Figure 6.14: Histogram over required template sizes for acceptance of points in area C.2.
Chapter 7
Discussion
7.2.1 Point cloud density

– TNCC reduces the density of the point cloud, sometimes by more than a factor of ten.

– Wallis filtering usually improves the density of the point cloud significantly; there are however
some exceptions, areas C.2, B.2 and B.4, where the densities decreased somewhat.

– ATS increases the density; how much depends on the structure of the surface.

– MSP neither increases nor decreases the density by more than a few percent.
7.2.2 Runtime
The runtime varies a lot between the different methods, and also depends on the type of area.
Some things to notice:

– The time for Wallis filtering depends directly on the size of the filtered area. When matching
large numbers of points, the overhead for Wallis filtering is not a problem.

– TNCC works fast on small areas, but in combination with ATS it may lead to very long
runtimes. A couple of runs in this configuration are not presented because they exceeded
a week and were therefore aborted. With a smaller maximum template size, for example 21 × 21
pixels, this would not have been a problem.

– MSP gives the predicted nine times longer runtimes than baseline LSTM.

– The ATS runtime varies depending on how often large template sizes are used. It has acceptable
speed if many points are accepted with small templates, but if the templates grow it slows down
significantly.
7.2.3 Error codes

MSP rejects very few points through its error code 2. It may be possible to set a higher threshold
on the number of points converging to the same point here.

Error code 6 can be reset by code 7 in the error code logic, and may in reality apply to more
points.

ATS is required to set error code 3, which then overrides all other codes, so that code is not used in
this evaluation.
7.2.4 Homographies
The quality of the homography is very important for the result, since a badly fitting homography gives
bad seed points for the optimization. The code implemented for assisted creation of homographies
gives a warning if the quality is likely to be low, but it is recommended to always generate a
couple of homography matrices and compare them, as well as test them on the images.
Chapter 8
Conclusions
It is possible to densify a point cloud using extensions to LSTM. Aspects of point quality and time
consumption need to be weighed against the need for densification of the point cloud.
The task of densification of point clouds suffers from the same difficulty as most image analysis tasks:
no generally applicable methods exist, and different kinds of objects benefit from different
methods.
Chapter 9
Future work
During the work on this thesis, many questions worth evaluating have arisen. A selection of them
are:
– Template sizes. It would be possible to write an algorithm for finding the optimal patch size
in different parts of an image. This would probably include frequency analysis of the image
using the Fourier transform or wavelets, or using the SIFT detector's radius information.

– TNCC. The threshold for accepting a seed point from TNCC needs a deeper evaluation.

– Precision of point clouds. The point clouds generated by LSTM with the different ex-
tensions could be compared to ground truth of the object, determining the precision of the
points.

– Benchmark. Constructing a set of images including camera calibration data, ground truth,
homographies and sets of points to match and detect would give the opportunity to compare
different methods regarding precision, robustness and time consumed in a re-usable way.

– MSP. A deeper study of the distribution of the multiple seed points, the threshold for the definition
of the same convergence point, and the number of agreeing points would be interesting.
Chapter 10
Acknowledgements
I want to thank my supervisor, Niclas Börlin, for taking the time to have me as a student. Your talent
for inspiring and for making the subject fun to work with has been of great value to me in accomplishing
this work. I also want to thank David Grundberg and Håkan Fors Nilsson for helping me out a couple
of times.
References
Steven D. Blostein and Thomas S. Huang. Error analysis in stereo determination of 3-d point posi-
tions. IEEE T Pattern Anal, 9(6):752–765, November 1987. doi: 10.1109/TPAMI.1987.4767982.
Niclas Börlin and Christina Igasto. 3d measurements of buildings and environment for harbor sim-
ulators. Technical Report UMINF 09.19, Department of Computing Science, Umeå University,
SE-901 87 Umeå, Sweden, October 2009.
Nicolas D’Apuzzo. Surface measurement and tracking of human body parts from multi station
video sequences. PhD thesis, Institute of Geodesy and Photogrammetry, ETH Zürich, Zürich,
Switzerland, October 2003.
S.F. El-Hakim, J.-A. Beraldin, M. Picard, and G. Godin. Detailed 3d reconstruction of large-scale
heritage sites with integrated techniques. IEEE Comput Graphics Appl, 24(3):21–29, May 2004.
ISSN 0272-1716. doi: 10.1109/MCG.2004.1318815.
Håkan Fors Nilsson and David Grundberg. Plane-based close range photogrammetric reconstruction
of buildings. Master’s thesis, Department of Computing Science, Umeå University, Technical
report UMNAD 784/09, UMINF 09.18 2009.
Wolfgang Förstner and Bernhard Wrobel. Mathematical Concepts in Photogrammetry, chapter 2,
pages 15–180. IAPRS, 5 edition, 2004.
Jan-Michael Frahm, Marc Pollefeys, Brian Clipp, David Gallup, Rahul Raguram, ChangChang Wu,
and Christopher Zach. 3d reconstruction of architectural scenes from uncalibrated video se-
quences. International Archives of Photogrammetry, Remote Sensing, and Spatial Information
Sciences, XXXVIII(5/W1):7 pp, October 2009.
D. Gallup, J.-M. Frahm, P. Mordohai, Q. Yang, and M. Pollefeys. Real-time plane-sweeping stereo
with multiple sweeping directions. In Proc. CVPR, pages 1–8, Minneapolis, Minnesota, USA,
June 2007. IEEE. doi: 10.1109/CVPR.2007.383245.
Rafael C. Gonzalez and Richard E. Woods. Digital Image Processing. Addison-Wesley, 3rd edition,
2008.
A. Gruen. Least squares matching: a fundamental measurement algorithm. In K. B. Atkinson,
editor, Close Range Photogrammetry and Machine Vision, chapter 8, pages 217–255. Whittles,
Caithness, Scotland, 1996.
A. W. Gruen. Adaptive least squares correlation: A powerful image matching technique. S Afr J of
Photogrammetry, 14(3):175–187, 1985.
Armin Grün, Fabio Remondino, and Li Zhang. Photogrammetric reconstruction of the Great Buddha
of Bamiyan, Afghanistan. Photogramm Rec, 19(107):177–199, 2004.
R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University
Press, ISBN: 0521623049, 2000.
R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University
Press, ISBN: 0521540518, 2nd edition, 2003.
David G. Lowe. Object recognition from local scale-invariant features. In Proc Intl Conf on Com-
puter Vision, pages 1150–1157, Corfu, Greece, September 1999.
J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide baseline stereo from maximally stable
extremal regions. In A. David Marshall and Paul L. Rosin, editors, Proc British Machine Vision
Conference, pages 384–393, Cardiff, UK, September 2002. British Machine Vision Association.
Chris McGlone, Edward Mikhail, and Jim Bethel, editors. Manual of Photogrammetry. ASPRS, 5th
edition, July 2004. ISBN 1-57083-071-1.
Douglas C. Montgomery, George C. Runger, and Norma Faris Hubele. Engineering Statistics. Wiley,
2004. ISBN 0-471-45240-8.
Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer-Verlag, 1999. ISBN
0-387-98793-2.
F. Remondino, S. El-Hakim, S. Girardi, A. Rizzi, S. Benedetti, and L. Gonzo. 3d virtual recon-
struction and visualization of complex architectures - the 3d-arch project. International Archives
of Photogrammetry, Remote Sensing, and Spatial Information Sciences, XXXVIII(5/W1):9 pp,
October 2009.
Fabio Remondino. Image-based modeling for object and human reconstruction. PhD thesis, Institute
of Geodesy and Photogrammetry, ETH Zürich, ETH Hoenggerberg, Zürich, Swizerland, 2006.
Fabio Remondino, Sabry F. El-Hakim, Armin Gruen, and Li Zhang. Development and performance
analysis of image matching for detailed surface reconstruction of heritage objects. IEEE Signal
Proc Mag, 25(4):55–64, July 2008.
Steven M. Seitz, Brian Curless, James Diebel, Daniel Scharstein, and Richard Szeliski. A compari-
son and evaluation of multi-view stereo reconstruction algorithms. In CVPR’06, volume 1, pages
519–528, June 2006. doi: 10.1109/CVPR.2006.19.
Appendix A

Homographies

Here are the homography matrices used for the different areas presented.
H_1 = \begin{bmatrix} 0.9591 & -0.1217 & -68.80 \\ 0.0052 & 0.7070 & 158.1 \\ 0 & -0.0002 & 1 \end{bmatrix}

H_2 = \begin{bmatrix} 0.912 & -0.055 & 12.70 \\ -0.0158 & 0.9234 & 75.42 \\ 0 & 0 & 1 \end{bmatrix}

H_3 = \begin{bmatrix} 1.024 & 0.021 & -95.45 \\ 0.008 & 1.083 & -31.09 \\ 0 & 0 & 1 \end{bmatrix}

H_4 = \begin{bmatrix} 0.8794 & -0.251 & 78.94 \\ 0.0011 & 0.772 & 96.15 \\ 0 & -0.0002 & 1 \end{bmatrix}

H_5 = \begin{bmatrix} 1.0 & -0.9 & 1194.1 \\ 0.0 & 1.0 & 36.4 \\ 0 & 0 & 1 \end{bmatrix}
Appendix B

Abbreviations

ATS    Adaptive Template Size
LSTM   Least Squares Template Matching
MSP    Multiple Seed Points
SIFT   Scale-Invariant Feature Transform
TNCC   Transformed Normalized Cross Correlation