SLAM Manuscript
Abbreviation Explanation
AI Artificial Intelligence
ANN Artificial neural network
AUV Autonomous Underwater Vehicle
BA Bundle Adjustment
BRIEF Binary Robust Independent Elementary Features
CML Concurrent Localization and Mapping
DOF Degree Of Freedom
DoG Difference-of-Gaussian
EKF Extended Kalman Filter
FAST Features from Accelerated Segment Test
g2o General Graph Optimization
GNC Guidance, Navigation and Control
GPS Global Positioning System
iLBA Incremental Light Bundle Adjustment
IMU Inertial Measurement Unit
iSAM Incremental Smoothing and Mapping
KF Kalman Filter
LAGO Linear Approximation for pose Graph Optimization
LBA Light Bundle Adjustment
LIDAR Light Detection and Ranging
MAV Micro Aerial Vehicles
MLR Multi-level relaxation
MSCKF Multi-State Constraint Kalman Filter
ORB-SLAM Oriented FAST and Rotated BRIEF
RANSAC Random sample consensus
RGB-D Red Green Blue Depth
RIEKF-VINS Right Invariant Error Extended Kalman Filter - Visual Inertial Navigation Systems
SFM Structure from Motion
SGD Stochastic gradient descent
SIFT Scale Invariant Feature Transform
SLAM Simultaneous Localization and Mapping
SPA Sparse Pose Adjustment
SURF Speeded Up Robust Feature
UAV Unmanned Autonomous Vehicle
UKF Unscented Kalman Filter
VINS Visual Inertial Navigation Systems
VO Visual odometry
vSLAM Visual Simultaneous Localization and Mapping
1 Introduction
Robotics is an emerging field of science that aims to reduce human dependency. Robotic technology provides assistance in various technical areas such as guidance, navigation and control, and collaborative work. Recently, the integration of artificial intelligence (AI) with robotics has given robots the autonomy to make decisions. This helps robots predict the uncertainties of an unknown world. Robot perception develops mainly from single- or multi-sensor data [1]; using these data, the robot perceives its surrounding environment, and the quality of this perception depends largely on how the sensor data are processed. In the field of navigation, sensor data track the position and orientation of the robot body and drive the corresponding control actions. Sensor fusion poses multiple challenges for real-time application, which has led to several modeling algorithms [2]. Based on its sensing methodology, a robot can be classified as autonomous in real time or not. In most cases sensing follows a deterministic algorithm in a known environment (prior information). Autonomy generates plans from sensor data, and an appropriate sequence of actions is adapted to the dynamically changing environment. Non-deterministic algorithms try to bound the probability of error in difficult modeling processes. A machine or software system is sometimes said to be capable of directly replacing a human being: a human can plan a path to follow and solve problems, whereas a machine typically works under strict human supervision. With intelligent capabilities, an unmanned ground vehicle can achieve monitoring and planning without human interference and without undue risk of failure [3].
Autonomous robotics faces major challenges in path planning (moving from point A to point B), mapping (reconstructing the unknown world by integrating position measurements), and localization (estimating the robot's location with respect to the world model from sensor data). Considerable progress has also been made on the SLAM technique, which combines the above challenges. A robotic system works on the principle of sense, plan, and act.
Robot principle | Input | Output
Sense | Sensor data | Sensed information
Plan | Sensed or cognitive information | Directives
Act | Sensed information or directives | Actuator commands
Fig. 1. Interfaces of the robotics system: this table depicts the different sub-systems of autonomous robots.
Primarily, robots are used for multiple purposes such as surveillance, production, agriculture, surgery, and monitoring. They can replace humans in dangerous assignments, assist the aging population in their daily chores, provide entertainment, and allow humans to project themselves from a distance in real time by sensing, planning, and acting accordingly. Military drones guide soldiers to sense (see) and shoot targets from beyond visual range, and automated surgical robots help surgeons work with precision. Robots are also widely applied in disaster management and resource allocation [3]. Gradually, autonomous robots are occupying space in the vast and fast evolution of advanced applications with multi-dimensional operational challenges. Under those circumstances, robots require fail-safe guidance, navigation and control (GNC) enabled localization for monitoring their pose in the unknown environment. Pose estimation generally consists of the integrated information of both the position and the orientation of the moving robot. Further, this localization at each iteration helps in constructing a map of the unknown environment (terrestrial, aerial, or underwater) [4] [5]. This integrated process of localization and mapping is known as simultaneous localization and mapping (SLAM) [6].
The simultaneous localization and mapping technique can also be applied to AI-based mobile robots for self-exploration in different geographical environments. It solves the computational problem of an autonomous mobile robot that is placed at an unknown location and must construct and update a map while simultaneously tracking its own position. Localization defines the accurate current pose of the robot in an unknown environment [7]. The minimum requirement for the SLAM technique is that the robot is equipped with sensors such as cameras, LIDAR, sonar, IMU, and GPS [2]. Internal sensors include the accelerometer (which measures translational motion) and the gyroscope (which measures rotational motion and changes in orientation). External sensors include cameras (which capture real-time structures of landmarks), bump sensors (which detect obstacles for avoidance), force-torque sensors (which measure the 6-DOF forces for movement), and spectrometers. Sensors collect appearance information at particular locations along with specific landmarks, thus building a map. Superimposing the sensor information over the built map can enhance the level of detail of the unknown environment [8].
The main objective is to self-explore and avoid collisions with obstacles by implementing estimation techniques under a probabilistic framework. The autonomous robot, or more precisely the artificial-intelligence robot, can 'think' much as animals or human beings do while making a decision, and then act on the instructions issued after that decision as its response [9]. Intensive research has been done on the automatic navigation and control of autonomous mobile robots. In the early years, 1985-1990, mapping and localization were performed concurrently and hence the term used was concurrent localization and mapping (CML) [10, 11]. A few years later, SLAM was introduced by [6].
SLAM is a technique for achieving autonomous control of robots. The robot gathers information about the unknown environment and continuously updates it [12]. The objective is to self-explore the environment while avoiding the obstacles within it [13]. Several challenges arise during self-exploration: avoiding obstacles and landmarks while navigating, estimating the robot's own location in an unknown environment, generating a map of the environment, and, based on that map, making decisions without human interference [14]. To build the map, the robot depends on inputs from two types of sensors, proprioceptive and exteroceptive.
Proprioceptive sensors such as gyroscopes and accelerometers measure internal values of the system, like velocity, wheel load, position change, and acceleration, and estimate the position of the robot using the dead-reckoning navigation method. Due to inherent noise and cumulative error, however, this position estimate drifts over time. Exteroceptive sensors, for instance sonar, laser, cameras, and GPS, acquire information from the outside world. Sonar and laser provide precise information, but their cost and bulkiness make them unsuitable for highly cluttered environments and restrict their use on airborne robots and humanoids. GPS sensors, in turn, are inefficient indoors and do not work well underwater.
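As a concrete illustration of the dead-reckoning idea mentioned above, a minimal 2D odometry-integration sketch is given below; the wheel-encoder increments and the simple motion model are assumptions for illustration, not taken from the cited works.

```python
# Minimal dead-reckoning sketch: integrate assumed wheel-odometry increments
# (travelled distance d, heading change dtheta) to propagate the pose (x, y, theta).
# The cumulative error of this estimate is why exteroceptive corrections are needed.
import math

def dead_reckon(pose, d, dtheta):
    x, y, theta = pose
    theta += dtheta                   # apply the heading change (simple unicycle model)
    x += d * math.cos(theta)          # advance along the new heading
    y += d * math.sin(theta)
    return (x, y, theta)

pose = (0.0, 0.0, 0.0)
for d, dth in [(1.0, 0.0), (1.0, 0.1), (1.0, 0.1)]:   # hypothetical encoder readings
    pose = dead_reckon(pose, d, dth)
print(pose)
```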
Over the last decades, cameras have become the most widely used exteroceptive sensor, with the ability to retrieve the environment's appearance, color, and texture. They are inexpensive, lightweight, and low-power, and they help the robot detect and recognize landmarks. SLAM using cameras as the only sensor is referred to as visual SLAM (vSLAM). The visual SLAM technique in an autonomous robot uses a camera that captures video of the surrounding environment; from these camera measurements the motion can be estimated. The configuration of the camera as the sensor is simple and is widely applicable in computer vision, in unmanned autonomous vehicles (UAVs), and in augmented reality. The camera pose, which includes position and orientation, is estimated and a 2D-3D structure of the surroundings is constructed simultaneously.
The basic SLAM characteristics [15] to be satisfied for autonomous robots may be summarized as follows.
i) Accuracy: the localization of the robot in the local and global coordinate systems of the map; the error is always kept below a threshold.
ii) Scalability: the capability to work in constant time and with constant memory to load mapped data for easy accessibility.
iii) Availability: an adequately accurate SLAM algorithm for localization can be used with the existing map.
iv) Recovery: the ability of the robot to localize itself in a large map, which also helps it recover from tracking failures.
v) Updatability: changes in the current observation are automatically made in the existing map; these changes are not permanent and may be altered in subsequent steps.
vi) Dynamicity: the ability to handle changes in a dynamic environment. The presence of obstacles causing collisions can affect SLAM approaches, and climatic conditions may also have an adverse impact on the localization process.
vSLAM can be further classified into three categories: feature-based, direct, and RGB-D camera-based. In a feature-based approach, vSLAM uses a monocular camera to extract feature points for tracking and mapping. During initialization, the features or points are defined in the global coordinate system for camera pose estimation and 3D map reconstruction.
The main vSLAM modules are initialization, tracking, and mapping.
A robot solving the SLAM problem requires, at a minimum, a sensor that acquires information about the environment in which it is moving. The SLAM problem can be defined in two ways: online SLAM, equation (1), and full SLAM, equation (2):

P(x_k, m \mid Z_{1:k}, U_{1:k})   (1)

P(x_{1:k}, m \mid Z_{1:k}, U_{1:k})   (2)

The only difference between the two is that in full SLAM the posterior is computed over the entire path x_{1:k}. Here x_k represents the location and orientation of the robot, m is the map (the set of landmark locations), U_k is the control vector, and Z_k is the observation taken by the sensor.
Applying Bayes' theorem, under the assumptions that the landmarks in the environment are time-invariant and that the robot follows a Markov motion model, the SLAM formulation can be written as the recursive expression in equation (3) [38] [42]:

P(x_k, m \mid Z_{1:k}, U_{1:k}) = \eta \, P(Z_k \mid x_k, m) \int P(x_k \mid x_{k-1}, U_k) \, P(x_{k-1}, m \mid Z_{1:k-1}, U_{1:k-1}) \, dx_{k-1}   (3)

where \eta is a normalizing constant, P(x_k \mid x_{k-1}, U_k) is the motion model, and P(Z_k \mid x_k, m) is the observation model of the robot.
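To make the recursion in equation (3) concrete, the following minimal 1D grid (histogram) filter applies the prediction integral and the Bayes correction at each step; the motion kernel, the sensor model, and the readings are hypothetical and only illustrate the structure of (3).

```python
# 1D grid Bayes filter: prediction with a motion model, then correction with an
# observation likelihood, mirroring the two factors of equation (3).
import numpy as np

n = 50
belief = np.full(n, 1.0 / n)                      # uniform prior over grid cells
cells = np.arange(n)
motion_kernel = np.array([0.1, 0.8, 0.1])         # assumed noisy one-cell motion

def predict(belief, kernel):
    moved = np.roll(belief, 1)                    # nominal one-cell forward motion (cyclic world)
    return np.convolve(moved, kernel, mode="same")  # discrete analogue of the integral

def correct(belief, likelihood):
    post = belief * likelihood                    # multiply by P(Z_k | x_k, m)
    return post / post.sum()                      # eta normalization

for z in [10, 11, 12]:                            # hypothetical position readings
    belief = predict(belief, motion_kernel)
    likelihood = np.exp(-0.5 * ((cells - z) / 2.0) ** 2)
    belief = correct(belief, likelihood)
print(int(belief.argmax()))                       # most likely cell after three updates
```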
In general, single-robot SLAM is implemented in three steps, built on the motion model and the observation model

x_k = f(x_{k-1}, U_k) + w_k   (4)

Z_k = h(x_k, m) + \nu_k   (5)

where w_k and \nu_k denote the process and measurement noise. The state x_k, the prior estimate \hat{x}_{k-1}, and the current observation y_k are assumed to be Gaussian random variables.
UKF Algorithm
Initialize the filter with the augmented state estimate and its covariance.

Time update:

\mathcal{X}^x_{k|k-1} = F\big(\mathcal{X}^x_{k-1}, \mathcal{X}^v_{k-1}\big)   (21)

\hat{x}^-_k = \sum_{i=0}^{2L} W_i^{(m)} \mathcal{X}^x_{i,k|k-1}   (22)

P^-_k = \sum_{i=0}^{2L} W_i^{(c)} \big[\mathcal{X}^x_{i,k|k-1} - \hat{x}^-_k\big]\big[\mathcal{X}^x_{i,k|k-1} - \hat{x}^-_k\big]^T   (23)

\mathcal{Y}_{k|k-1} = H\big(\mathcal{X}^x_{k|k-1}, \mathcal{X}^n_{k-1}\big)   (24)

\hat{y}^-_k = \sum_{i=0}^{2L} W_i^{(m)} \mathcal{Y}_{i,k|k-1}   (25)

Measurement update:

P_{\tilde{y}_k \tilde{y}_k} = \sum_{i=0}^{2L} W_i^{(c)} \big[\mathcal{Y}_{i,k|k-1} - \hat{y}^-_k\big]\big[\mathcal{Y}_{i,k|k-1} - \hat{y}^-_k\big]^T   (26)

P_{x_k y_k} = \sum_{i=0}^{2L} W_i^{(c)} \big[\mathcal{X}_{i,k|k-1} - \hat{x}^-_k\big]\big[\mathcal{Y}_{i,k|k-1} - \hat{y}^-_k\big]^T   (27)

K = P_{x_k y_k} P_{\tilde{y}_k \tilde{y}_k}^{-1}   (28)

\hat{x}_k = \hat{x}^-_k + K\big(y_k - \hat{y}^-_k\big)   (29)

P_k = P^-_k - K P_{\tilde{y}_k \tilde{y}_k} K^T   (30)

where x^a = [x^T \; v^T \; n^T]^T, \mathcal{X}^a = [(\mathcal{X}^x)^T \; (\mathcal{X}^v)^T \; (\mathcal{X}^n)^T]^T, \lambda is the composite scaling parameter, L is the dimension of the augmented state, P_v is the process noise covariance, P_n is the measurement noise covariance, and W_i are the weights as calculated.
In comparison with the EKF computation, the UKF does not need the calculation of Jacobians or Hessians in its algorithm. In noisy state-estimation problems the UKF shows superior performance.
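To illustrate why no Jacobians are required, the sketch below propagates a mean and covariance through a nonlinear function with the unscented transform (the core of equations (21)-(30)); the scaling parameters and the example nonlinearity are assumptions chosen for illustration.

```python
# Unscented-transform sketch: propagate mean/covariance through a nonlinear
# function via sigma points, with no Jacobian or Hessian computation.
import numpy as np

def unscented_transform(mean, cov, f, alpha=1e-3, beta=2.0, kappa=0.0):
    L = mean.size
    lam = alpha**2 * (L + kappa) - L                     # composite scaling parameter
    S = np.linalg.cholesky((L + lam) * cov)              # matrix square root
    sigma = [mean] + [mean + S[:, i] for i in range(L)] \
                   + [mean - S[:, i] for i in range(L)]  # 2L+1 sigma points
    Wm = np.full(2 * L + 1, 0.5 / (L + lam))             # mean weights
    Wc = Wm.copy()                                       # covariance weights
    Wm[0] = lam / (L + lam)
    Wc[0] = lam / (L + lam) + (1.0 - alpha**2 + beta)
    Y = np.array([f(s) for s in sigma])                  # propagate each sigma point
    mean_y = Wm @ Y
    diff = Y - mean_y
    return mean_y, diff.T @ (Wc[:, None] * diff)         # transformed mean and covariance

mean = np.array([0.2, -0.1])
cov = np.diag([0.01, 0.02])
f = lambda x: np.array([np.sin(x[0]) + x[1], x[0] * x[1]])   # assumed nonlinear model
m_y, P_y = unscented_transform(mean, cov, f)
print(m_y, P_y)
```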
Fig. 6. Main components of feature detection and matching under the SLAM framework: detector, descriptor, and matching.
The detection technique traces the interest points of an object. At an edge, the boundary direction changes abruptly, while a corner is the intersection of edge points. Mono-SLAM emphasizes edge segments for the construction of maps, and visual SLAM relies on corners for locating landmarks [59]. Klein et al. [28] explain that edges are less affected by a sudden camera movement, which can blur the features; an edge is generally stable under changes in illumination, orientation, brightness, and scale.
Fig. 7. (a) Flat region (b) Edge region (c) Corner region
An interest point together with its descriptor forms a local feature [61]. In matching, descriptors are compared to find similar features across images: the features [(P_i, Q_i)] of image 1 are matched with the features [(P_i', Q_i')] of image 2.
The descriptor represents, as a binary string, the intensity differences of certain pairs of pixels around the interest point [62]. In a feature-descriptor algorithm, an image is taken as input and all distinct keypoints are found. The region around each keypoint is described, extracted, and normalized, and from the normalized region a local descriptor is encoded as a number series that can be differentiated easily. It is then matched against similar features in the other image. The important characteristics of some visual SLAM systems are summarized in Table 2 of this paper.
where

G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right)

\theta(x, y) = \tan^{-1}\!\left(\frac{I(x, y+1) - I(x, y-1)}{I(x+1, y) - I(x-1, y)}\right)
4. Keypoint descriptor: the keypoint is described as a high-dimensional vector that captures the local structure of its neighborhood. Because it is built from local intensity gradients, the descriptor is largely invariant to illumination and other changes in viewing conditions (viewpoint). The keypoint sits at the center of a 16x16 sample window that is divided into 4x4 sub-blocks, each summarized by an 8-bin orientation histogram, giving 4x4x8 = 128 bin values; this 128-dimensional feature vector is the keypoint descriptor.
5. Keypoint matching: keypoints between two images are matched by identifying their nearest neighbors in descriptor space. Due to noise, the second-closest match may in some cases be almost as near as the first, so the distance ratio between the two is used to accept or reject a match; in this way the keypoints of the two images are matched (a brief OpenCV sketch follows).
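A minimal sketch of SIFT detection and nearest-neighbour matching with the ratio test, using OpenCV, is shown below; the image paths and the 0.75 ratio threshold are placeholders, and SIFT_create requires a reasonably recent OpenCV build.

```python
# Hedged sketch: SIFT keypoints, 128-D descriptors, and ratio-test matching.
import cv2

img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)   # placeholder image paths
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)            # keypoints + descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_L2)
knn = matcher.knnMatch(des1, des2, k=2)                   # two nearest neighbours each

good = [m for m, n in knn if m.distance < 0.75 * n.distance]  # Lowe's ratio test
print(len(good), "tentative matches")
```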
Zhou et al. [64] explain object tracking with SIFT features combined with the mean-shift algorithm through similarity searching; an expectation-maximization algorithm evaluates the probability estimate for the similarity check, and the two methods support each other mutually.
The SURF detector builds on the Hessian matrix

H(X, \sigma) = \begin{bmatrix} L_{xx}(X, \sigma) & L_{xy}(X, \sigma) \\ L_{xy}(X, \sigma) & L_{yy}(X, \sigma) \end{bmatrix}   (37)
Panchal et al. [66] made a comparative study of SIFT and SURF considering
all the dependent factors.
1. Feature detection using FAST [56]. FAST considers a center pixel p with intensity Ip and a threshold t. The segment test examines the 16 pixels on a Bresenham circle around p (see Fig. 12): p is declared an interest point (corner) if there exists a contiguous arc of at least 12 of these pixels that are all brighter than Ip + t, or all darker than Ip - t.
Fig. 12. Interest point under test along with the 16 pixels on the circle around central
pixel
For speed, emphasis is first put on the four pixels 1, 5, 9, and 13 (the four compass directions): p can be a corner only if at least three of these four are brighter than Ip + t or darker than Ip - t; otherwise it is rejected immediately. Candidates that pass this quick test are then checked at full strength against all 16 pixels, which gives high output performance (see the illustrative sketch after this list). The remaining limitations are that the fixed order in which the 16 pixels are examined is not optimal, that the quick test does not generalize to arc lengths N < 12, and that many adjacent interest points are detected, which slows the process.
2. Machine learning approach. The limitations of the above method are solved to a great extent by this method, which follows two steps. First, a corner detector is constructed from a set of images chosen from the training set. Second, the FAST algorithm is applied to these images to find feature points using the segment test criterion with the earlier threshold. The 16 surrounding pixels are tested for each feature point to obtain the feature vector p. Each pixel x in {1, 2, ..., 16} on the circle, relative to the feature vector p (denoted p -> x), belongs to one of the following states:
S_{p \to x} = \begin{cases} \text{darker}, & I_{p \to x} \le I_p - t \\ \text{similar}, & I_p - t < I_{p \to x} < I_p + t \\ \text{brighter}, & I_p + t \le I_{p \to x} \end{cases}   (38)
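As the sketch referenced in the FAST discussion above, the following minimal example runs the segment test through OpenCV's FAST detector; the threshold value, the non-maximum-suppression setting, and the image path are assumptions for illustration.

```python
# Hedged sketch: FAST corner detection (segment test) via OpenCV.
import cv2

img = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)   # placeholder image path
fast = cv2.FastFeatureDetector_create(threshold=20, nonmaxSuppression=True)
keypoints = fast.detect(img, None)                      # corners passing the segment test
print(len(keypoints), "FAST corners")
```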
Calonder et al. [76] propose BRIEF, an efficient feature descriptor that uses intensity-difference tests on a few data bits for computation. It represents an image patch as a binary string, and the Hamming distance allows fast evaluation of recognition performance. The construction and matching of the BRIEF descriptor are faster than the SIFT and SURF approaches.
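A small sketch of binary-descriptor matching with the Hamming distance is given below; it uses ORB (whose descriptor is a rotated BRIEF-style binary string) because plain BRIEF lives in OpenCV's contrib module, and the feature count and image paths are assumptions.

```python
# Hedged sketch: binary descriptors matched with the Hamming distance.
import cv2

img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)   # placeholder image paths
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)             # 256-bit binary descriptors
kp2, des2 = orb.detectAndCompute(img2, None)

bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
print(len(matches), "Hamming-distance matches")
```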
Heinly et al. [77] study the various factors that impact the BRIEF descriptor's efficiency and robustness. Since BRIEF is not invariant to scale, orientation, or perspective change, it performs better under non-geometric transforms than under geometric and perspective transforms; such changes directly influence the binary-string descriptor.
Li et al. [78] surveys the evolution, working, application, strengths, and draw-
backs of interest point detectors. It focuses on the selection approaches of feature
extraction methods.
A set of keyframes K1 shares map points with the current frame, and a set K2 contains the neighbors of K1 in the covisibility graph, with the reference keyframe being the keyframe in K1 sharing the most points. New keyframes are inserted to keep the tracking process robust, while candidates that do not fulfill the insertion criteria (a minimum of 50 tracked points and enough frames since the last global relocalization or keyframe insertion) are discarded as redundant. In local mapping, when a keyframe is inserted, the covisibility graph is updated by adding edges to the keyframes with the most common points, and the spanning tree is updated accordingly. For a map point to be retained it must pass several tests over its first three keyframes; map points that fail are removed by culling. In new map-point creation, new map points are triangulated from ORB matches between connected keyframes in the covisibility graph; the checks on each triangulated ORB pair are acceptance of the new point, reprojection error, and positive depth in both cameras, and map points are restricted to the connected keyframes. In local bundle adjustment, the map points connected to the currently processed keyframe and those present in its covisibility graph are optimized by BA. To limit complexity, local mapping also detects and removes redundant keyframes, known as local keyframe culling: unless the scene changes, the visual-feature content is fixed, so keyframes whose map points are about 90% covered by at least three other keyframes in the covisibility graph are discarded.
Fig. 14. Local mapping stages before loop closing
Mur-Artal et al. [85] perform loop detection and closing by considering the currently processed keyframe and the last keyframe processed by local mapping. The consistency of three consecutive loop candidates is checked in the covisibility graph. To satisfy the geometrical validation of a loop and to close it, a similarity transformation is computed, and the duplicated map points or matched keypoints are fused [87]. New edges are inserted into the covisibility graph, which is repeatedly updated to attach the loop closure; this supports loop correction. Pose graph optimization is then conducted over the essential graph for effective loop closure, and each map point obtained is transformed according to one of its observing keyframes.
Advanced light bundle adjustment optimization can exclude the 3D points from the online reconstruction, thereby reducing the computational time and increasing accuracy.
Bundle adjustment optimizes the solution by minimizing the reprojection error between the observed image point in the camera frame and the reprojected 3D point. A BA-based SLAM system faces shortcomings in initialization, estimation, and timely maintenance of the 3D map, which makes the processing complicated [114]; it also struggles during purely rotational and slow camera motion, as discussed by Bustos et al. [115]. To overcome these issues, SLAM using incremental optimization on camera orientations has been proposed, from which the camera positions and 3D model points are then estimated [116]. Let Z_{i,j} be the coordinates of the i-th scene point as seen in the j-th image. SfM aims to estimate the coordinates X = {X_i} of the scene points and the poses (R_j, t_j) of the images. The BA formulation is
\min_{\{X_i\},\,\{(R_j, t_j)\}} \; \sum_{i,j} \big\| Z_{i,j} - f(X_i \mid R_j, t_j) \big\|_2^2   (40)
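The following sketch shows how the objective in (40) can be minimized jointly over camera poses and scene points with a generic nonlinear least-squares solver; the toy intrinsics, the axis-angle pose parameterization, and the synthetic observations are assumptions, so this is a structural illustration rather than a production BA implementation.

```python
# Hedged bundle-adjustment sketch for Eq. (40): refine camera poses (axis-angle +
# translation) and 3D points by minimizing the stacked reprojection error.
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])                       # assumed pinhole intrinsics

def project(X, rvec, tvec):
    """Project 3D points X (N,3) into the camera with pose (rvec, tvec)."""
    Xc = X @ Rotation.from_rotvec(rvec).as_matrix().T + tvec
    uv = Xc @ K.T
    return uv[:, :2] / uv[:, 2:3]

def residuals(params, n_cams, n_pts, cam_idx, pt_idx, obs):
    poses = params[:6 * n_cams].reshape(n_cams, 6)    # [rvec | tvec] per camera j
    pts = params[6 * n_cams:].reshape(n_pts, 3)       # scene points X_i
    res = np.empty_like(obs)
    for k, (j, i) in enumerate(zip(cam_idx, pt_idx)):
        res[k] = project(pts[i:i + 1], poses[j, :3], poses[j, 3:])[0] - obs[k]
    return res.ravel()                                # stacked Z_ij - f(X_i | R_j, t_j)

# Tiny synthetic problem: 2 cameras observing 4 points, with a perturbed start.
n_cams, n_pts = 2, 4
true_pts = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 6.0], [0.0, 1.0, 5.5], [1.0, 1.0, 6.5]])
true_poses = np.zeros((n_cams, 6)); true_poses[1, 3] = -0.5    # second camera shifted
cam_idx = np.repeat(np.arange(n_cams), n_pts)
pt_idx = np.tile(np.arange(n_pts), n_cams)
obs = np.vstack([project(true_pts[pt_idx[k]:pt_idx[k] + 1],
                         true_poses[cam_idx[k], :3],
                         true_poses[cam_idx[k], 3:])[0] for k in range(len(cam_idx))])

x0 = np.concatenate([true_poses.ravel(), (true_pts + 0.1).ravel()])
sol = least_squares(residuals, x0, args=(n_cams, n_pts, cam_idx, pt_idx, obs))
print("final reprojection cost:", sol.cost)
```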
BA-SLAM estimates the variables that update the keyframes and the map-point selection, discarding redundant points subject to the loop constraints. This makes the system complex and can lead to failures in the optimization. Problems also arise under pure rotational motion [120] and slow camera motion [121]. By rotation averaging, the camera orientations can be estimated in a simpler way, from which the 3D map model and the camera positions are obtained [122] [123] [124]. This is implemented by an advanced optimization method in visual SLAM named L-infinity SLAM. In the L-infinity SLAM algorithm, the camera positions and the map are derived from the estimated orientations by solving a known-rotation problem [125]; L-infinity SLAM thus estimates the orientations first, and the positions and map are deduced by global optimization [126]. The simplicity of this approach enables it to handle pure rotational motion [127]. The rotation averaging formulation is
\min_{\{R_j\}} \; \sum_{(j,k) \in \mathcal{N}} \big\| R_{j,k} - R_k R_j^{-1} \big\|_F^2   (41)
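As a small illustration of the chordal objective in (41), the sketch below evaluates the rotation-averaging cost for a set of absolute rotations given relative measurements; the toy rotations are assumptions, and a real solver would minimize this cost rather than merely evaluate it.

```python
# Hedged sketch: chordal rotation-averaging cost of Eq. (41) on toy data.
import numpy as np
from scipy.spatial.transform import Rotation

def chordal_cost(R_abs, rel_meas):
    """rel_meas: {(j, k): R_jk}; cost = sum ||R_jk - R_k R_j^{-1}||_F^2."""
    cost = 0.0
    for (j, k), R_jk in rel_meas.items():
        cost += np.linalg.norm(R_jk - R_abs[k] @ R_abs[j].T, ord="fro") ** 2
    return cost

# Three toy camera orientations and the induced (noise-free) relative rotations.
R_abs = [Rotation.from_euler("z", a, degrees=True).as_matrix() for a in (0, 10, 25)]
rel = {(j, k): R_abs[k] @ R_abs[j].T for j, k in [(0, 1), (1, 2), (0, 2)]}
print(chordal_cost(R_abs, rel))    # consistent rotations give a cost near zero
```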
The reprojection error is the image-plane distance between an observed feature point p and the corresponding reprojected point p̂.
Here x_k represents the camera pose (i.e., the 6-DOF position and orientation) at time step t_k, M_i is the set of landmarks (3D scene points) observed at time index i, and the priors represent prior information on the estimated variables.
Bundle adjustment has a computational complexity that depends directly on factors such as the number of images captured and the number of 3D points, together with the actual image observations, which makes the optimization problem expensive. The steps of solving the BA problem are as follows.
Step 1: Compute the joint pdf and its maximizer,

X_k^*, L_k^* = \arg\max_{X_k, L_k} P(X_k, L_k \mid Z_k)   (43)

Step 2: The maximum a posteriori estimation is performed each time a new image is added; this incremental optimization requires a good initialization. The maximum a posteriori estimate over the joint pdf is

X_k^*, L_k^* = \arg\max_{X_k, L_k} P(X_k, L_k \mid Z_k) = \arg\min_{X_k, L_k} \big(-\log P(X_k, L_k \mid Z_k)\big)   (44)
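For Gaussian factors of the kind used below, the MAP problem in (44) reduces to a nonlinear least-squares problem; this standard step is sketched here, with notation assumed to be consistent with (40) and with the factor definitions that follow:

\arg\min_{X_k, L_k} \; -\log P(X_k, L_k \mid Z_k) \;=\; \arg\min_{X_k, L_k} \; \sum_{i,j} \tfrac{1}{2}\,\big\| z_i^{\,j} - \mathrm{proj}(x_i, l_j) \big\|^2_{\Sigma_v} \;+\; \text{(prior terms)}

which, up to the noise weighting, is the reprojection-error objective minimized by BA in (40).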
Here,

f_{mm}(y_i, y_{i-1}) = \exp\!\big(-\tfrac{1}{2}\,\| y_i - \Phi_{i-1} y_{i-1} \|^2_{\Sigma_{mm}}\big)

corresponds to the target motion model,

f_{proj}(x_i, l_j) = \exp\!\big(-\tfrac{1}{2}\,\| z_i^{j} - \mathrm{proj}(x_i, l_j) \|^2_{\Sigma_v}\big)

corresponds to the landmark observation model, and

f_{proj}(x_i, y_i) = \exp\!\big(-\tfrac{1}{2}\,\| z_i^{y_i} - \mathrm{proj}(x_i, y_i) \|^2_{\Sigma_v}\big)

corresponds to the target observation model.

The joint pdf of LBA is given by

P_{LBA}(X \mid Z) \propto \prod_{i=1}^{N_h} f_{2v/3v}(X_i)   (48)

where f_{2v/3v} represents the involved two- and three-view factors. Here f_{2v} and f_{3v} are the maximum-likelihood two-view and three-view constraints:

f_{2v}(x_k, x_l) = \exp\!\big(-\tfrac{1}{2}\,\| g_{2v}(x_k, x_l, z_k, z_l) \|^2_{\Sigma_{2v}}\big)

f_{3v}(x_k, x_l, x_m) = \exp\!\big(-\tfrac{1}{2}\,\| g_{3v}(x_k, x_l, x_m, z_k, z_l, z_m) \|^2_{\Sigma_{3v}}\big)
In robotics applications, the joint pdf includes the two- and three-view factors along with factors representing the measurement likelihoods of additional sensors, as required (information fusion).
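As a generic illustration of how such a factored joint pdf is handled in practice (this is not the iLBA implementation; the toy factors and covariances are assumptions), the negative log of a product of Gaussian factors becomes a sum of weighted squared residuals:

```python
# Generic factor-graph sketch: each factor is exp(-0.5 ||r||^2_Sigma), so the
# negative log of the joint pdf is a sum of Mahalanobis terms over all factors.
import numpy as np

def neg_log_joint(factors, values):
    """factors: list of (residual_fn, Sigma); values: dict of variable values."""
    total = 0.0
    for residual_fn, Sigma in factors:
        r = residual_fn(values)
        total += 0.5 * r @ np.linalg.solve(Sigma, r)   # 0.5 * r^T Sigma^-1 r
    return total

# Toy example: one odometry-like (two-view style) factor and one prior factor.
values = {"x0": np.array([0.0, 0.0]), "x1": np.array([1.1, 0.0])}
factors = [
    (lambda v: v["x1"] - v["x0"] - np.array([1.0, 0.0]), 0.01 * np.eye(2)),   # motion
    (lambda v: v["x0"] - np.array([0.0, 0.0]),           0.001 * np.eye(2)),  # prior
]
print(neg_log_joint(factors, values))   # lower value = more probable configuration
```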
7 Conclusion
Simultaneous localization and mapping is an advanced technique that facilitates autonomy in robots, and it supports multi-sensor data fusion for decision making. This article summarizes different SLAM techniques and their corresponding technical aspects to provide a comprehensive understanding of SLAM. Both the deterministic SLAM family and the probabilistic SLAM family are explored to outline the overall framework.
Deterministic SLAM directly uses the sensor data for tracking and mapping; estimation of probable poses and features is generally not required in a featureless environment. In laser-based (LIDAR) SLAM, the laser sensors are used in a pair with the IMU, and the information, in the form of scans, is processed to build a pose graph. It is fast and accurate because the laser measures with high precision; its main problem is occlusion. Vision-based SLAM (vSLAM) uses cameras paired with the IMU. The camera data keep track of the changes in position of the moving robot, and the 3D location of features can be triangulated from successive camera frames. The error metric in this approach is the reprojection error.
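To illustrate the triangulation step mentioned above, the following sketch recovers a 3D point from a feature matched across two successive frames; the intrinsics, baseline, and pixel coordinates are assumed toy values.

```python
# Hedged sketch: two-view triangulation of one matched feature with OpenCV.
import numpy as np
import cv2

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])                              # assumed intrinsics
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])            # first camera at the origin
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])  # assumed small baseline

pts1 = np.array([[320.0], [240.0]])     # matched pixel in frame 1 (2 x N)
pts2 = np.array([[300.0], [240.0]])     # matched pixel in frame 2 (2 x N)

X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # 4 x N homogeneous coordinates
X = (X_h[:3] / X_h[3]).ravel()                    # Euclidean 3D point, here ~ (0, 0, 2.5)
print(X)
```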
Probabilistic SLAM adopts state-estimation techniques for perception and action. The next state of the system is predicted from the previous state using Bayes' rule, and the motion model and the observation model are formulated to derive the state of the system. This approach uses an iterative mathematical process that applies a set of equations to consecutive data inputs to estimate the true value (position, velocity, etc.) of the object being measured, even when the measured values contain unpredicted or random error, uncertainty, or variation. The Kalman filter (KF) uses linearized motion and observation models: operating on past outputs, it predicts the new state and its uncertainty from the motion model, and the corrected value is deduced from the new observation measurement. The orientation in the robot pose introduces non-linearities in both the robot motion model and the feature observation model. EKF-SLAM linearizes both models using first-order Taylor series expansions around a working point, which generates the current state estimate; the Jacobians cause linearization errors. UKF-SLAM processes the non-linear model directly and addresses the approximation issue of the EKF: UKF-based SLAM can accurately capture the posterior statistics up to the third order of the Taylor series expansion of any nonlinearity. Factor-graph-based information fusion combines the joint pdf of the two- and three-view factors with the measurement likelihoods of additional sensors.
The feature-point-based SLAM approach includes feature detectors and descriptors to ensure stable estimation results. In this process, the camera motion is estimated from the captured frames, the map is estimated from the feature points, and the 3D pose is obtained by triangulation. The technical aspects are categorized by the different image-matching techniques, which depend on transformation factors such as scaling, rotation, noise, and distortion.
a) SIFT features are scale- and orientation-invariant, which helps robot pose estimation and 3D map building; SIFT features can also be used to estimate the ego-motion of a robot.
b) SURF has the additional capability of handling images with blurring and rotation, but it fails at handling viewpoint change and illumination change.
c) FAST is a corner-detection method that extracts the feature points for tracking and mapping objects; however, it misses corners that are perfectly aligned along the x and y coordinate axes.
d) BRIEF is an efficient feature descriptor that uses an intensity-difference test on a few data bits for computation. It represents an image patch as a binary string, and the Hamming distance enables fast evaluation of recognition performance.
e) Oriented FAST and Rotated BRIEF (ORB) SLAM combines the FAST and BRIEF methods. FAST (Features from Accelerated Segment Test) is a keypoint detector that thresholds the intensity difference between the center pixel and those in a circular ring around it. BRIEF (Binary Robust Independent Elementary Features) uses binary strings as the feature point descriptor, performing a relatively small number of intensity-difference tests to represent an image patch as a binary string. Comparing SIFT, SURF, FAST, BRIEF, and ORB, we observe that ORB is the fastest algorithm and extracts efficient keypoints, and ORB-SLAM estimates rotational movement accurately. Real-time loop closing in an unknown environment based on localization remains an open challenge in ORB-SLAM. L-infinity SLAM is a simpler alternative to SLAM systems based on bundle adjustment: there is no need to maintain an accurate map and camera motions at keyframe rate, as demanded by bundle-adjustment-based systems.
Optimization-based SLAM is used to optimize the camera poses by suppressing the accumulated error: the camera poses are represented as a graph, and making the graph consistent suppresses the error in the optimization. Bundle adjustment (BA) is used to minimize the reprojection error of the map by optimizing both the map and the camera poses.
References
1. Arimoto, S., Kawamura, S., Miyazaki, F.: Bettering operation of robots by learn-
ing. Journal of Robotic systems 1(2), 123–140 (1984)
2. Durrant-Whyte, H., Henderson, T.C.: Multisensor data fusion. In: Springer hand-
book of robotics, pp. 867–896. Springer (2016)
3. Murphy, R.R.: Introduction to AI robotics. MIT press (2019)
4. Liu, Q., Li, R., Hu, H., Gu, D.: Extracting semantic information from visual data:
A survey. Robotics 5(1), 8 (2016)
5. Yang, G.Z., Dario, P., Kragic, D.: Social robotics—trust, learning, and social
interaction (2018)
6. Durrant-Whyte, H., Bailey, T.: Simultaneous localization and mapping: part i.
IEEE robotics & automation magazine 13(2), 99–110 (2006)
7. Morad, S.D.: The spinning projectile extreme environment robot (2019)
8. Siva, S., Zhang, H.: Omnidirectional multisensory perception fusion for long-term
place recognition. In: 2018 IEEE International Conference on Robotics and Au-
tomation (ICRA). pp. 1–9. IEEE (2018)
9. Lu, Y., Xue, Z., Xia, G.S., Zhang, L.: A survey on vision-based uav navigation.
Geo-spatial information science 21(1), 21–32 (2018)
10. Chatila, R., Laumond, J.P.: Position referencing and consistent world modeling for
mobile robots. In: Proceedings. 1985 IEEE International Conference on Robotics
and Automation. vol. 2, pp. 138–145. IEEE (1985)
11. Smith, R., Self, M., Cheeseman, P.: Estimating uncertain spatial relationships in
robotics. In: Autonomous robot vehicles, pp. 167–193. Springer (1990)
12. Clabaugh, C., Matarić, M.J.: Robots for the people, by the people: Personalizing
human-machine interaction. Science robotics 3(21), eaat7451 (2018)
13. Leonard, J.J., Durrant-Whyte, H.F.: Simultaneous map building and localization
for an autonomous mobile robot. In: IROS. vol. 3, pp. 1442–1447 (1991)
14. Dissanayake, G., Huang, S., Wang, Z., Ranasinghe, R.: A review of recent de-
velopments in simultaneous localization and mapping. In: 2011 6th International
Conference on Industrial and Information Systems. pp. 477–482. IEEE (2011)
15. Bresson, G., Alsayed, Z., Yu, L., Glaser, S.: Simultaneous localization and map-
ping: A survey of current trends in autonomous driving. IEEE Transactions on
Intelligent Vehicles 2(3), 194–220 (2017)
16. Taketomi, T., Uchiyama, H., Ikeda, S.: Visual slam algorithms: a survey from
2010 to 2016. IPSJ Transactions on Computer Vision and Applications 9(1), 16
(2017)
17. Fuentes-Pacheco, J., Ruiz-Ascencio, J., Rendón-Mancha, J.M.: Visual simultane-
ous localization and mapping: a survey. Artificial intelligence review 43(1), 55–81
(2015)
18. Dharmasiri, T., Lui, V., Drummond, T.: Mo-slam: Multi object slam with run-
time object discovery through duplicates. In: 2016 IEEE/RSJ International Con-
ference on Intelligent Robots and Systems (IROS). pp. 1214–1221. IEEE (2016)
19. Manderson, T., Shkurti, F., Dudek, G.: Texture-aware slam using stereo imagery
and inertial information. In: 2016 13th Conference on Computer and Robot Vision
(CRV). pp. 456–463. IEEE (2016)
20. Kochanov, D., Ošep, A., Stückler, J., Leibe, B.: Scene flow propagation for seman-
tic mapping and object discovery in dynamic street scenes. In: 2016 IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS). pp. 1785–
1792. IEEE (2016)
21. Endres, F., Hess, J., Engelhard, N., Sturm, J., Cremers, D., Burgard, W.: An
evaluation of the rgb-d slam system. In: 2012 IEEE International Conference on
Robotics and Automation. pp. 1691–1696. IEEE (2012)
22. Belter, D., Nowicki, M., Skrzypczyński, P.: Improving accuracy of feature-based
rgb-d slam by modeling spatial uncertainty of point features. In: 2016 IEEE in-
ternational conference on robotics and automation (ICRA). pp. 1279–1284. IEEE
(2016)
23. Ataer-Cansizoglu, E., Taguchi, Y., Ramalingam, S.: Pinpoint slam: A hybrid of
2d and 3d simultaneous localization and mapping for rgb-d sensors. In: 2016
IEEE international conference on robotics and automation (ICRA). pp. 1300–
1307. IEEE (2016)
24. Kendall, A., Cipolla, R.: Modelling uncertainty in deep learning for camera relo-
calization. In: 2016 IEEE international conference on Robotics and Automation
(ICRA). pp. 4762–4769. IEEE (2016)
25. Neubert, P., Schubert, S., Protzel, P.: Sampling-based methods for visual nav-
igation in 3d maps by synthesizing depth images. In: 2017 IEEE/RSJ Interna-
tional Conference on Intelligent Robots and Systems (IROS). pp. 2492–2498.
IEEE (2017)
26. Cieslewski, T., Scaramuzza, D.: Efficient decentralized visual place recognition
from full-image descriptors. In: 2017 International Symposium on Multi-Robot
and Multi-Agent Systems (MRS). pp. 78–82. IEEE (2017)
27. Cabrera-Ponce, A.A., Martinez-Carranza, J.: A vision-based approach for au-
tonomous landing. In: 2017 Workshop on Research, Education and Development
of Unmanned Aerial Systems (RED-UAS). pp. 126–131. IEEE (2017)
28. Klein, G., Murray, D.: Improving the agility of keyframe-based slam. In: European
conference on computer vision. pp. 802–815. Springer (2008)
29. Ferrera, M., Moras, J., Trouvé-Peloux, P., Creuze, V.: Real-time monocular visual
odometry for turbid and dynamic underwater environments. Sensors 19(3), 687
(2019)
30. Kalman, R.E.: A new approach to linear filtering and prediction problems (1960)
31. Engel, J., Koltun, V., Cremers, D.: Direct sparse odometry. IEEE transactions on
pattern analysis and machine intelligence 40(3), 611–625 (2017)
32. Aqel, M.O., Marhaban, M.H., Saripan, M.I., Ismail, N.B.: Review of visual odom-
etry: types, approaches, challenges, and applications. SpringerPlus 5(1), 1897
(2016)
33. Schonberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4104–
4113 (2016)
34. Ozyesil, O., Voroninski, V., Basri, R., Singer, A.: A survey of structure from
motion. arXiv preprint arXiv:1701.08493 (2017)
35. Li, A.Q., Coskun, A., Doherty, S.M., Ghasemlou, S., Jagtap, A.S., Modasshir, M.,
Rahman, S., Singh, A., Xanthidis, M., O’Kane, J.M., et al.: Experimental com-
parison of open source vision-based state estimation algorithms. In: International
Symposium on Experimental Robotics. pp. 775–786. Springer (2016)
36. Jones, E.S., Soatto, S.: Visual-inertial navigation, mapping and localization: A
scalable real-time causal approach. The International Journal of Robotics Re-
search 30(4), 407–430 (2011)
37. Blanco, J.L., González, J., Fernández-Madrigal, J.A.: A pure probabilistic ap-
proach to range-only slam. In: 2008 IEEE International Conference on Robotics
and Automation. pp. 1436–1441. IEEE (2008)
38. Korkmaz, M., Yılmaz, N., Durdu, A.: Comparison of the slam algorithms: Hangar
experiments. In: MATEC Web of Conferences. vol. 42, p. 03009. EDP Sciences
(2016)
39. Garcı́a, S., López, M.E., Barea, R., Bergasa, L.M., Gómez, A., Molinos, E.J.:
Indoor slam for micro aerial vehicles control using monocular camera and sen-
sor fusion. In: 2016 international conference on autonomous robot systems and
competitions (ICARSC). pp. 205–210. IEEE (2016)
40. Sayre-McCord, T., Guerra, W., Antonini, A., Arneberg, J., Brown, A., Cavalheiro,
G., Fang, Y., Gorodetsky, A., McCoy, D., Quilter, S., et al.: Visual-inertial navi-
gation algorithm development using photorealistic camera simulation in the loop.
In: 2018 IEEE International Conference on Robotics and Automation (ICRA).
pp. 2566–2573. IEEE (2018)
41. Wu, K., Zhang, T., Su, D., Huang, S., Dissanayake, G.: An invariant-ekf vins
algorithm for improving consistency. In: 2017 IEEE/RSJ International Conference
on Intelligent Robots and Systems (IROS). pp. 1578–1585. IEEE (2017)
42. Kuzmin, M.: Classification and comparison of the existing slam methods for
groups of robots. In: 2018 22nd Conference of Open Innovations Association
(FRUCT). pp. 115–120. IEEE (2018)
43. Brown, R.G., Hwang, P.Y.: Introduction to random signals and applied kalman
filtering: with matlab exercises and solutions. Introduction to random signals and
applied Kalman filtering: with MATLAB exercises and solutions (1997)
44. Chatterjee, A., Rakshit, A., Singh, N.N.: Simultaneous localization and mapping
(slam) in mobile robots. In: Vision Based Autonomous Robot Navigation, pp.
167–206. Springer (2013)
45. Li, S., Ni, P.: Square-root unscented kalman filter based simultaneous localization
and mapping. In: The 2010 IEEE International Conference on Information and
Automation. pp. 2384–2388. IEEE (2010)
46. Konolige, K., Agrawal, M.: Frameslam: From bundle adjustment to real-time vi-
sual mapping. IEEE Transactions on Robotics 24(5), 1066–1077 (2008)
47. Wan, E.A., Van Der Merwe, R.: The unscented kalman filter for nonlinear estima-
tion. In: Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing,
Communications, and Control Symposium (Cat. No. 00EX373). pp. 153–158. Ieee
(2000)
48. Julier, S.J., Uhlmann, J.K.: New extension of the kalman filter to nonlinear sys-
tems. In: Signal processing, sensor fusion, and target recognition VI. vol. 3068,
pp. 182–193. International Society for Optics and Photonics (1997)
49. Hassaballah, M., Abdelmgeid, A.A., Alshazly, H.A.: Image features detection,
description and matching. In: Image Feature Detectors and Descriptors, pp. 11–
45. Springer (2016)
50. Baltrušaitis, T., Ahuja, C., Morency, L.P.: Multimodal machine learning: A survey
and taxonomy. IEEE transactions on pattern analysis and machine intelligence
41(2), 423–443 (2018)
51. Baltrušaitis, T., Ahuja, C., Morency, L.P.: Challenges and applications in mul-
timodal machine learning. In: The Handbook of Multimodal-Multisensor Inter-
faces: Signal Processing, Architectures, and Detection of Emotion and Cognition-
Volume 2, pp. 17–48 (2018)
52. Gil, A., Mozos, O.M., Ballesta, M., Reinoso, O.: A comparative evaluation of
interest point detectors and local descriptors for visual slam. Machine Vision and
Applications 21(6), 905–920 (2010)
53. Salahat, E., Qasaimeh, M.: Recent advances in features extraction and description
algorithms: A comprehensive survey. In: 2017 IEEE international conference on
industrial technology (ICIT). pp. 1059–1063. IEEE (2017)
54. Přibyl, B., Chalmers, A., Zemčı́k, P., Hooberman, L., Čadı́k, M.: Evaluation of
feature point detection in high dynamic range imagery. Journal of Visual Com-
munication and Image Representation 38, 141–160 (2016)
55. Harris, C.G., Stephens, M., et al.: A combined corner and edge detector. In: Alvey
vision conference. vol. 15, pp. 10–5244. Citeseer (1988)
56. Rosten, E., Drummond, T.: Machine learning for high-speed corner detection. In:
European conference on computer vision. pp. 430–443. Springer (2006)
57. Tuytelaars, T., Mikolajczyk, K.: Local invariant feature detectors: a survey. Now
Publishers Inc (2008)
58. Aulinas, J., Carreras, M., Llado, X., Salvi, J., Garcia, R., Prados, R., Petillot,
Y.R.: Feature extraction for underwater visual slam. In: OCEANS 2011 IEEE-
Spain. pp. 1–7. IEEE (2011)
59. Eade, E., Drummond, T.: Edge landmarks in monocular slam. In: In Proc. British
Machine Vision Conf. Citeseer (2006)
60. Harris, C.G., Pike, J.: 3d positional integration from image sequences. Image and
Vision Computing 6(2), 87–90 (1988)
61. Ballesta, M., Gil, A., Martinez Mozos, O., Reinoso, O., et al.: Local descriptors
for visual slam (2007)
62. Muñoz-Salinas, R., Marı́n-Jimenez, M.J., Yeguas-Bolivar, E., Medina-Carnicer,
R.: Mapping and localization from planar markers. Pattern Recognition 73, 158–
171 (2018)
63. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Interna-
tional journal of computer vision 60(2), 91–110 (2004)
64. Zhou, H., Yuan, Y., Shi, C.: Object tracking using sift features and mean shift.
Computer vision and image understanding 113(3), 345–352 (2009)
65. Lindeberg, T.: Scale invariant feature transform (2012)
66. Panchal, P., Panchal, S., Shah, S.: A comparison of sift and surf. International
Journal of Innovative Research in Computer and Communication Engineering
1(2), 323–327 (2013)
67. Se, S., Lowe, D., Little, J.: Mobile robot localization and mapping with un-
certainty using scale-invariant visual landmarks. The international Journal of
robotics Research 21(8), 735–758 (2002)
68. Bay, H., Tuytelaars, T., Van Gool, L.: Surf: Speeded up robust features. In: Eu-
ropean conference on computer vision. pp. 404–417. Springer (2006)
69. Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: Speeded-up robust features (surf).
Computer vision and image understanding 110(3), 346–359 (2008)
70. Kole, S., Agarwal, C., Gupta, T., Singh, S.: Surf and ransac: A conglomerative
approach to object recognition. International Journal of Computer Applications
109(4) (2015)
71. Rosten, E., Porter, R., Drummond, T.: Faster and better: A machine learning
approach to corner detection. IEEE transactions on pattern analysis and machine
intelligence 32(1), 105–119 (2008)
72. Li, A., Jiang, W., Yuan, W., Dai, D., Zhang, S., Wei, Z.: An improved fast+ surf
fast matching algorithm. Procedia Computer Science 107, 306–312 (2017)
73. Viswanathan, D.G.: Features from accelerated segment test (fast). Homepages.
Inf. Ed. Ac. Uk (2009)
74. Schmid, C., Mohr, R., Bauckhage, C.: Evaluation of interest point detectors. In-
ternational Journal of computer vision 37(2), 151–172 (2000)
75. Canny, J.F.: Finding edges and lines in images. Tech. rep., MASSACHUSETTS
INST OF TECH CAMBRIDGE ARTIFICIAL INTELLIGENCE LAB (1983)
76. Calonder, M., Lepetit, V., Strecha, C., Fua, P.: Brief: Binary robust independent
elementary features. In: European conference on computer vision. pp. 778–792.
Springer (2010)
77. Heinly, J., Dunn, E., Frahm, J.M.: Comparative evaluation of binary features. In:
European Conference on Computer Vision. pp. 759–773. Springer (2012)
78. Li, Y., Wang, S., Tian, Q., Ding, X.: A survey of recent advances in visual feature
detection. Neurocomputing 149, 736–751 (2015)
79. Paz, L.M., Piniés, P., Tardós, J.D., Neira, J.: Large-scale 6-dof slam with stereo-
in-hand. IEEE transactions on robotics 24(5), 946–957 (2008)
80. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: Orb: An efficient alternative to
sift or surf. In: 2011 International conference on computer vision. pp. 2564–2571.
Ieee (2011)
81. Lv, Q., Josephson, W., Wang, Z., Charikar, M., Li, K.: Multi-probe lsh: efficient
indexing for high-dimensional similarity search. In: Proceedings of the 33rd inter-
national conference on Very large data bases. pp. 950–961 (2007)
82. Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: Orb-slam: a versatile and accurate
monocular slam system. IEEE transactions on robotics 31(5), 1147–1163 (2015)
83. Mur-Artal, R., Tardós, J.D.: Orb-slam: tracking and mapping recognizable fea-
tures. In: Workshop on Multi View Geometry in Robotics (MVIGRO)-RSS. vol.
2014, p. 2 (2014)
84. Mur-Artal, R., Tardós, J.D.: Visual-inertial monocular slam with map reuse. IEEE
Robotics and Automation Letters 2(2), 796–803 (2017)
85. Mur-Artal, R., Tardós, J.D.: Fast relocalisation and loop closing in keyframe-
based slam. In: 2014 IEEE International Conference on Robotics and Automation
(ICRA). pp. 846–853. IEEE (2014)
86. Fujimoto, S., Hu, Z., Chapuis, R., Aufrère, R.: Orb-slam map initialization im-
provement using depth. In: 2016 IEEE International Conference on Image Pro-
cessing (ICIP). pp. 261–265. IEEE (2016)
87. Majdik, A.L., Verda, D., Albers-Schoenberg, Y., Scaramuzza, D.: Air-ground
matching: Appearance-based gps-denied urban localization of micro aerial ve-
hicles. Journal of Field Robotics 32(7), 1015–1039 (2015)
88. Han, L., Zhou, G., Xu, L., Fang, L.: Beyond sift using binary features in loop clo-
sure detection. In: 2017 IEEE/RSJ International Conference on Intelligent Robots
and Systems (IROS). pp. 4057–4063. IEEE (2017)
89. Negre, P.L., Bonin-Font, F., Oliver, G.: Cluster-based loop closing detection for
underwater slam in feature-poor regions. In: 2016 IEEE International Conference
on Robotics and Automation (ICRA). pp. 2589–2595. IEEE (2016)
90. Gálvez-López, D., Tardos, J.D.: Bags of binary words for fast place recognition
in image sequences. IEEE Transactions on Robotics 28(5), 1188–1197 (2012)
91. Mur-Artal, R., Tardós, J.D.: Orb-slam2: An open-source slam system for monoc-
ular, stereo, and rgb-d cameras. IEEE Transactions on Robotics 33(5), 1255–1262
(2017)
92. Civera, J., Davison, A.J., Montiel, J.M.: Inverse depth parametrization for monoc-
ular slam. IEEE transactions on robotics 24(5), 932–945 (2008)
93. Engel, J., Stückler, J., Cremers, D.: Large-scale direct slam with stereo cameras.
In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS). pp. 1935–1942. IEEE (2015)
94. Endres, F., Hess, J., Sturm, J., Cremers, D., Burgard, W.: 3-d mapping with an
rgb-d camera. IEEE transactions on robotics 30(1), 177–187 (2013)
95. Caldato, B.A., Achilles Filho, R., Castanho, J.E.C.: Orb-odom: Stereo and odome-
ter sensor fusion for simultaneous localization and mapping. In: 2017 latin Amer-
ican robotics symposium (LARS) and 2017 Brazilian symposium on robotics
(SBR). pp. 1–5. Ieee (2017)
96. Kerl, C., Sturm, J., Cremers, D.: Dense visual slam for rgb-d cameras. In: 2013
IEEE/RSJ International Conference on Intelligent Robots and Systems. pp. 2100–
2106. IEEE (2013)
97. Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: The kitti
dataset. The International Journal of Robotics Research 32(11), 1231–1237 (2013)
98. Strasdat, H., Davison, A.J., Montiel, J.M., Konolige, K.: Double window optimisa-
tion for constant time visual slam. In: 2011 international conference on computer
vision. pp. 2352–2359. IEEE (2011)
99. Lv, Q., Lin, H., Wang, G., Wei, H., Wang, Y.: Orb-slam-based tracing and 3d
reconstruction for robot using kinect 2.0. In: 2017 29th Chinese Control And
Decision Conference (CCDC). pp. 3319–3324. IEEE (2017)
100. Li, M., Zhang, M., Fu, Y., Guo, W., Zhong, X., Wang, X., Chen, F.: Fast
and robust mapping with low-cost kinect v2 for photovoltaic panel cleaning
robot. In: 2016 International Conference on Advanced Robotics and Mechatronics
(ICARM). pp. 95–100. IEEE (2016)
101. Pumarola, A., Vakhitov, A., Agudo, A., Sanfeliu, A., Moreno-Noguer, F.: Pl-
slam: Real-time monocular visual slam with points and lines. In: 2017 IEEE in-
ternational conference on robotics and automation (ICRA). pp. 4503–4508. IEEE
(2017)
102. Karami, E., Prasad, S., Shehata, M.: Image matching using sift, surf, brief and orb:
performance comparison for distorted images. arXiv preprint arXiv:1710.02726
(2017)
103. Chien, H.J., Chuang, C.C., Chen, C.Y., Klette, R.: When to use what feature? sift,
surf, orb, or a-kaze features for monocular visual odometry. In: 2016 International
Conference on Image and Vision Computing New Zealand (IVCNZ). pp. 1–6.
IEEE (2016)
104. Leutenegger, S., Lynen, S., Bosse, M., Siegwart, R., Furgale, P.: Keyframe-based
visual–inertial odometry using nonlinear optimization. The International Journal
of Robotics Research 34(3), 314–334 (2015)
105. Triggs, B., McLauchlan, P.F., Hartley, R.I., Fitzgibbon, A.W.: Bundle adjust-
ment—a modern synthesis. In: International workshop on vision algorithms. pp.
298–372. Springer (1999)
106. Dellaert, F.: Visual slam tutorial: Bundle adjustment (2014)
107. Engels, C., Stewénius, H., Nistér, D.: Bundle adjustment rules. Photogrammetric
computer vision 2(32) (2006)
108. Sweeney, C., Sattler, T., Hollerer, T., Turk, M., Pollefeys, M.: Optimizing the
viewing graph for structure-from-motion. In: Proceedings of the IEEE Interna-
tional Conference on Computer Vision. pp. 801–809 (2015)
109. Mouragnon, E., Lhuillier, M., Dhome, M., Dekeyser, F., Sayd, P.: Real time lo-
calization and 3d reconstruction. In: 2006 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition (CVPR’06). vol. 1, pp. 363–370. IEEE
(2006)
110. Zhang, Y., Yang, J., Zhang, H., Hwang, J.N.: Bundle adjustment for monocular
visual odometry based on detected traffic sign features. In: 2019 IEEE Interna-
tional Conference on Image Processing (ICIP). pp. 4350–4354. IEEE (2019)
111. Grisetti, G., Kümmerle, R., Strasdat, H., Konolige, K.: g2o: A general framework
for (hyper) graph optimization. In: Proceedings of the IEEE International Con-
ference on Robotics and Automation (ICRA), Shanghai, China. pp. 9–13 (2011)
112. Chojnacki, M., Indelman, V.: Vision-based dynamic target trajectory and ego-
motion estimation using incremental light bundle adjustment. International Jour-
nal of Micro Air Vehicles, Special Issue on Estimation and Control for MAV Nav-
igation in GPS-denied Cluttered Environments 10(2), 157–170 (2018)
113. Indelman, V., Roberts, R., Dellaert, F.: Incremental light bundle adjust-
ment for structure from motion and robotics. Robotics and Autonomous Sys-
tems 70, 63–82 (2015), http://www.sciencedirect.com/science/article/pii/
S0921889015000810
114. Demoulin, Q., Lefebvre-Albaret, F., Basarab, A., Kouamé, D., Tourneret, J.Y.:
Constrained bundle adjustment applied to wing 3d reconstruction with mechani-
cal limitations
115. Bustos, Á.P., Chin, T.J., Eriksson, A., Reid, I.: Visual slam: Why bundle adjust?
In: 2019 International Conference on Robotics and Automation (ICRA). pp. 2385–
2391. IEEE (2019)
116. Li, X., Ling, H.: Hybrid camera pose estimation with online partitioning. arXiv
preprint arXiv:1908.01797 (2019)
117. Grisetti, G., Stachniss, C., Burgard, W.: Nonlinear constraint network optimiza-
tion for efficient map learning. IEEE Transactions on Intelligent Transportation
Systems 10(3), 428–439 (2009)
118. Kümmerle, R., Grisetti, G., Strasdat, H., Konolige, K., Burgard, W.: g2o: A gen-
eral framework for graph optimization. In: 2011 IEEE International Conference
on Robotics and Automation. pp. 3607–3613. IEEE (2011)
119. Engel, J.: Tutorial on geometric and semantic 3d reconstruction, cvpr 2017
120. Pirchheim, C., Schmalstieg, D., Reitmayr, G.: Handling pure camera rotation
in keyframe-based slam. In: 2013 IEEE international symposium on mixed and
augmented reality (ISMAR). pp. 229–238. IEEE (2013)
121. Indelman, V.: Bundle adjustment without iterative structure estimation and its
application to navigation. In: Proceedings of the 2012 IEEE/ION Position, Loca-
tion and Navigation Symposium. pp. 748–756. IEEE (2012)
122. Hartley, R., Trumpf, J., Dai, Y., Li, H.: Rotation averaging. International journal
of computer vision 103(3), 267–305 (2013)
123. Bagchi, S., Chin, T.J.: Event-based star tracking via multiresolution progressive
hough transforms. In: The IEEE Winter Conference on Applications of Computer
Vision. pp. 2143–2152 (2020)
124. Eriksson, A., Olsson, C., Kahl, F., Chin, T.J.: Rotation averaging and strong
duality. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition. pp. 127–135 (2018)
125. Sim, K., Hartley, R.: Recovering camera motion using L-infinity minimization. In:
2006 IEEE Computer Society Conference on Computer Vision and Pattern Recog-
nition (CVPR’06). vol. 1, pp. 1230–1237 (2006)
126. Kneip, L., Li, H.: Efficient computation of relative pose for multi-camera sys-
tems. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition. pp. 446–453 (2014)
127. Khosravian, A., Chin, T.J., Reid, I., Mahony, R.: A discrete-time attitude observer
on so (3) for vision and gps fusion. In: 2017 IEEE International Conference on
Robotics and Automation (ICRA). pp. 5688–5695. IEEE (2017)
128. Liu, L.Y.: Towards Observable Urban Visual SLAM. Ph.D. thesis (2020)
129. Ovechkin, V., Indelman, V.: Bafs: Bundle adjustment with feature scale con-
straints for enhanced estimation accuracy. IEEE Robotics and Automation Let-
ters (RA-L) 3(2), 804–810 (2018)
130. Dellaert, F., Kaess, M., et al.: Factor graphs for robot perception. Foundations
and Trends in Robotics 6(1-2), 1–139 (2017)
131. Kschischang, F.R., Frey, B.J., Loeliger, H.A.: Factor graphs and the sum-product
algorithm. IEEE Transactions on information theory 47(2), 498–519 (2001)
132. Indelman, V., Roberts, R., Dellaert, F.: Incremental light bundle adjustment for
structure from motion and robotics. Robotics and Autonomous Systems 70, 63–82
(2015)
133. Konolige, K., Grisetti, G., Kümmerle, R., Burgard, W., Limketkai, B., Vincent,
R.: Efficient sparse pose adjustment for 2d mapping. In: 2010 IEEE/RSJ Inter-
national Conference on Intelligent Robots and Systems. pp. 22–29. IEEE (2010)
134. Carlone, L., Aragues, R., Castellanos, J.A., Bona, B.: A fast and accurate approxi-
mation for planar pose graph optimization. The International Journal of Robotics
Research 33(7), 965–987 (2014)
135. Frese, U., Larsson, P., Duckett, T.: A multilevel relaxation algorithm for simulta-
neous localization and mapping. IEEE Transactions on Robotics 21(2), 196–207
(2005)
136. Thrun, S., Montemerlo, M.: The graph slam algorithm with applications to large-
scale mapping of urban structures. The International Journal of Robotics Re-
search 25(5-6), 403–429 (2006)
137. Olson, E., Agarwal, P.: Inference on networks of mixtures for robust robot map-
ping. The International Journal of Robotics Research 32(7), 826–840 (2013)
138. Wang, H., Hu, G., Huang, S., Dissanayake, G.: On the structure of nonlinearities
in pose graph slam. In: Proc. Robot.: Sci. Syst. VIII. pp. 425–433 (2013)
139. Kaess, M., Johannsson, H., Roberts, R., Ila, V., Leonard, J.J., Dellaert, F.: isam2:
Incremental smoothing and mapping using the bayes tree. The International Jour-
nal of Robotics Research 31(2), 216–235 (2012)
140. Carlone, L., Aragues, R., Castellanos, J.A., Bona, B.: A first-order solution to
simultaneous localization and mapping with graphical models. In: 2011 IEEE In-
ternational Conference on Robotics and Automation. pp. 1764–1771. IEEE (2011)
141. Carlone, L., Aragues, R., Castellanos, J.A., Bona, B.: A linear approximation
for graph-based simultaneous localization and mapping. Robotics: Science and
Systems VII pp. 41–48 (2012)
142. Carlone, L., Aragues, R., Castellanos, J., Bona, B.: A fast and accurate approxi-
mation for pose graph optimization. Int. J. Robot. Res (2012)
143. Gawel, A., Cieslewski, T., Dubé, R., Bosse, M., Siegwart, R., Nieto, J.: Structure-
based vision-laser matching. In: 2016 IEEE/RSJ International Conference on In-
telligent Robots and Systems (IROS). pp. 182–188. IEEE (2016)
144. Wang, C., Zhang, H., Nguyen, T.M., Xie, L.: Ultra-wideband aided fast local-
ization and mapping system. In: 2017 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS). pp. 1602–1609. IEEE (2017)
145. Liang, X., Chen, H., Li, Y., Liu, Y.: Visual laser-slam in large-scale indoor envi-
ronments. In: 2016 IEEE International Conference on Robotics and Biomimetics
(ROBIO). pp. 19–24. IEEE (2016)
146. Evers, C., Naylor, P.A.: Optimized self-localization for slam in dynamic scenes us-
ing probability hypothesis density filters. IEEE Transactions on Signal Processing
66(4), 863–878 (2017)
147. Xin, G.x., Zhang, X.t., Wang, X., Song, J.: A rgbd slam algorithm combining orb
with prosac for indoor mobile robot. In: 2015 4th International Conference on
Computer Science and Network Technology (ICCSNT). vol. 1, pp. 71–74. IEEE
(2015)
148. Wang, X., Chen, H., Li, Y.: Online calibration for monocular vision and odometry
fusion. In: 2017 IEEE International Conference on Unmanned Systems (ICUS).
pp. 602–607. IEEE (2017)
149. Guerrero-Font, E., Massot-Campos, M., Negre, P.L., Bonin-Font, F., Codina,
G.O.: An usbl-aided multisensor navigation system for field auvs. In: 2016 IEEE
International Conference on Multisensor Fusion and Integration for Intelligent
Systems (MFI). pp. 430–435. IEEE (2016)
150. Qiu, K., Liu, T., Shen, S.: Model-based global localization for aerial robots using
edge alignment. IEEE Robotics and Automation Letters 2(3), 1256–1263 (2017)
151. Li, J., Kaess, M., Eustice, R.M., Johnson-Roberson, M.: Pose-graph slam using
forward-looking sonar. IEEE Robotics and Automation Letters 3(3), 2330–2337
(2018)
152. Luo, J., Qin, S.: A fast algorithm of slam based on combinatorial interval filters.
IEEE Access 6, 28174–28192 (2018)
153. Concha, A., Loianno, G., Kumar, V., Civera, J.: Visual-inertial direct slam. In:
2016 IEEE international conference on robotics and automation (ICRA). pp.
1331–1338. IEEE (2016)
154. Fischer, T., Pire, T., Čı́žek, P., De Cristóforis, P., Faigl, J.: Stereo vision-based
localization for hexapod walking robots operating in rough terrains. In: 2016
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
pp. 2492–2497. IEEE (2016)
155. Yan, Z., Ye, M., Ren, L.: Dense visual slam with probabilistic surfel map. IEEE
transactions on visualization and computer graphics 23(11), 2389–2398 (2017)
156. Rameau, F., Ha, H., Joo, K., Choi, J., Park, K., Kweon, I.S.: A real-time aug-
mented reality system to see-through cars. IEEE transactions on visualization
and computer graphics 22(11), 2395–2404 (2016)
157. Liu, H., Zhang, G., Bao, H.: Robust keyframe-based monocular slam for aug-
mented reality. In: 2016 IEEE International Symposium on Mixed and Augmented
Reality (ISMAR). pp. 1–10. IEEE (2016)
158. Sjanic, Z., Skoglund, M.A., Gustafsson, F.: Em-slam with inertial/visual appli-
cations. IEEE Transactions on Aerospace and Electronic Systems 53(1), 273–285
(2017)
159. Zhou, H., Ni, K., Zhou, Q., Zhang, T.: An sfm algorithm with good convergence
that addresses outliers for realizing mono-slam. IEEE Transactions on Industrial
Informatics 12(2), 515–523 (2016)
160. Qin, T., Li, P., Shen, S.: Vins-mono: A robust and versatile monocular visual-
inertial state estimator. IEEE Transactions on Robotics 34(4), 1004–1020 (2018)
161. Zienkiewicz, J., Tsiotsios, A., Davison, A., Leutenegger, S.: Monocular, real-time
surface reconstruction using dynamic level of detail. In: 2016 Fourth International
Conference on 3D Vision (3DV). pp. 37–46. IEEE (2016)
162. Tateno, K., Tombari, F., Laina, I., Navab, N.: Cnn-slam: Real-time dense monoc-
ular slam with learned depth prediction. In: Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition. pp. 6243–6252 (2017)
163. Wang, Z., Xu, M., Ye, N., Wang, R., Huang, H.: Rf-mvo: Simultaneous 3d object
localization and camera trajectory recovery using rfid devices and a 2d monocular
camera. In: 2018 IEEE 38th International Conference on Distributed Computing
Systems (ICDCS). pp. 534–544. IEEE (2018)
164. Carlone, L., Karaman, S.: Attention and anticipation in fast visual-inertial navi-
gation. IEEE Transactions on Robotics 35(1), 1–20 (2018)
165. Schmuck, P., Chli, M.: Multi-uav collaborative monocular slam. In: 2017 IEEE
International Conference on Robotics and Automation (ICRA). pp. 3863–3870.
IEEE (2017)
166. Wang, S., Clark, R., Wen, H., Trigoni, N.: Deepvo: Towards end-to-end visual
odometry with deep recurrent convolutional neural networks. In: 2017 IEEE Inter-
national Conference on Robotics and Automation (ICRA). pp. 2043–2050. IEEE
(2017)
167. Teixeira, L., Chli, M.: Real-time mesh-based scene estimation for aerial inspection.
In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS). pp. 4863–4869. IEEE (2016)
168. Lajoie, P.Y., Ramtoula, B., Chang, Y., Carlone, L., Beltrame, G.: Door-slam:
Distributed, online, and outlier resilient slam for robotic teams. IEEE Robotics
and Automation Letters 5(2), 1656–1663 (2020)
169. Huang, W., Liu, H., Wan, W.: An online initialization and self-calibration method
for stereo visual-inertial odometry. IEEE Transactions on Robotics (2020)