2D Autonomous Robot Localization Using Fast SLAM 2.0 and YOLO in Long Corridors
1 Introduction
Robotics and autonomous vehicles promise significant advances in many domains. In intelligent robotics, one of the most strategic capabilities is mobility, or navigation: the ability of a robot to orient itself, move around, and recognize its environment in real time. Designing an autonomous robot raises many problems, because the robot must react properly to the real world. This imposes a number of constraints, including an appropriate mechanical design, the choice of sensors, and the methods used for localization, mapping, navigation, planning, and object classification [1]-[5].
Localizing a mobile robot and mapping its environment have received the most research attention over the past decade. In order to navigate safely and purposefully, an autonomous mobile robot must determine its position within its environment.
SLAM is a process by which a robot builds a map of an environment and, at the same time, uses that map to infer its own location. In SLAM, the platform's trajectory and the locations of all landmarks are estimated online, without any a priori knowledge of these locations.
Consider a mobile robot moving through an environment and making relative observations of a number of unknown landmarks using a sensor mounted on the robot, as shown in Figure 1. The robot locations and the landmark locations must be estimated simultaneously. The true locations are never known or measured directly; only observations between the real robot and the landmarks are available, so there are errors between the estimated and the actual landmark locations. This is why SLAM is considered a fundamental problem for autonomous robots and autonomous vehicles [2].
2.2 Probabilistic SLAM
In probabilistic SLAM, the quantity of interest is the joint posterior density of the landmark locations and the vehicle state at time $k$, given the recorded observations and control inputs up to and including time $k$, together with the initial state of the vehicle [9]:
$$
P(x_k, m \mid Z_{0:k}, U_{0:k}, x_0) \tag{1}
$$
In general, a recursive solution to the SLAM problem is desirable: starting from an estimate of the distribution $P(x_{k-1}, m \mid Z_{0:k-1}, U_{0:k-1})$ at time $k-1$, the joint posterior is computed after a control $u_k$ and an observation $z_k$ using Bayes' theorem. This computation requires a state transition model and an observation model describing the effect of the control input and of the observation, respectively [9].
The observation model describes the probability of making an observation $z_k$ when the vehicle location and the landmark locations are known; it is written as
$$
P(z_k \mid x_k, m) \tag{2}
$$
It is reasonable to assume that, once the vehicle location and the map are defined, observations are conditionally independent given the map and the current state of the mobile robot.
The mobile robot motion model can be described in terms of the probability distribution over state transitions:
$$
P(x_k \mid x_{k-1}, u_k) \tag{3}
$$
The state transition is assumed to be a Markov process in which the next state depends only on the immediately preceding state $x_{k-1}$ and the applied control $u_k$, and is independent of both the observations and the map.
The SLAM algorithm is then implemented as a standard two-step recursive (sequential) procedure consisting of a prediction (time update) and a correction (measurement update):
• Time update:
$$
P(x_k, m \mid Z_{0:k-1}, U_{0:k}, x_0) = \int P(x_k \mid x_{k-1}, u_k)\, P(x_{k-1}, m \mid Z_{0:k-1}, U_{0:k-1}, x_0)\, dx_{k-1} \tag{4}
$$
• Measurement update:
$$
P(x_k, m \mid Z_{0:k}, U_{0:k}, x_0) = \frac{P(z_k \mid x_k, m)\, P(x_k, m \mid Z_{0:k-1}, U_{0:k}, x_0)}{P(z_k \mid Z_{0:k-1}, U_{0:k})} \tag{5}
$$
Equations (4) and (5) provide a recursive procedure for calculating the joint posterior $P(x_k, m \mid Z_{0:k}, U_{0:k}, x_0)$ over the robot state $x_k$ and the map $m$ at time $k$, based on all observations $Z_{0:k}$ and all control inputs $U_{0:k}$ up to time $k$. The recursion is a function of a vehicle model $P(x_k \mid x_{k-1}, u_k)$ and an observation model $P(z_k \mid x_k, m)$. It is interesting to note that the map-building problem can be formulated as the computation of the conditional density $P(m \mid X_{0:k}, Z_{0:k}, U_{0:k})$. This assumes that the location of the mobile robot $x_k$ is known (or at least deterministic) at all times, subject to knowledge of the initial location; a map $m$ is then constructed by merging observations from the different locations. Conversely, the localization problem can be formulated as the computation of the probability distribution $P(x_k \mid Z_{0:k}, U_{0:k}, m)$. This assumes that the landmark locations are known with certainty, and the goal is to estimate the robot's location relative to these landmarks.
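As a purely illustrative example of this recursion, the following MATLAB sketch implements equations (4) and (5) for the localization sub-problem (known map $m$) as a simple particle filter; `x0`, `controls`, `observations`, `map`, `motionSample`, and `obsLikelihood` are placeholders and not part of the system described in this paper.

```matlab
% Illustrative particle filter for the prediction/correction recursion (4)-(5),
% applied to the localization sub-problem with a known map.
% motionSample(x, u)       draws a sample of P(x_k | x_{k-1}, u_k)   (placeholder)
% obsLikelihood(z, x, map) evaluates P(z_k | x_k, m)                 (placeholder)

N = 500;                           % number of particles
particles = repmat(x0, 1, N);      % all particles start at the known initial pose x0
weights = ones(1, N) / N;

for k = 1:numel(controls)
    % Time update (eq. 4): propagate each particle through the motion model.
    for i = 1:N
        particles(:, i) = motionSample(particles(:, i), controls{k});
    end

    % Measurement update (eq. 5): reweight by the observation likelihood;
    % the normalization plays the role of the denominator in (5).
    for i = 1:N
        weights(i) = weights(i) * obsLikelihood(observations{k}, particles(:, i), map);
    end
    weights = weights / sum(weights);

    % Resample to avoid weight degeneracy (multinomial resampling).
    edges = cumsum(weights);
    edges(end) = 1;                % guard against floating-point rounding
    idx = arrayfun(@(r) find(edges >= r, 1, 'first'), rand(1, N));
    particles = particles(:, idx);
    weights = ones(1, N) / N;
end

% Pose estimate at time k: mean of the (equally weighted) particles.
xEst = mean(particles, 2);
```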
2.3 FastSLAM
Here, the probability distribution is over the trajectory $X_{0:k}$ rather than over the single pose $x_k$, because, when the distribution is conditioned on the trajectory, the map landmarks become conditionally independent [10].
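Formally, this is the Rao-Blackwellized factorization underlying FastSLAM [10]: conditioned on the trajectory, the joint posterior splits into independent landmark estimates,
$$
P(X_{0:k}, m \mid Z_{0:k}, U_{0:k}, x_0) = P(X_{0:k} \mid Z_{0:k}, U_{0:k}, x_0)\, \prod_{i=1}^{M} P(m_i \mid X_{0:k}, Z_{0:k}),
$$
where $M$ is the number of landmarks. The trajectory posterior is represented by a particle filter, and each landmark $m_i$ is estimated by a small independent filter per particle.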
4 Simulation Results
This section presents the results of simulating the localization, mapping, and intelligent perception methods studied and chosen for the autonomous robot. Using MATLAB and Python under a Linux system, we simulate both the FastSLAM and YOLO algorithms and evaluate the performance and efficiency that the robot requires for localization and mapping combined with object detection.
Portable laser rangefinders, also known as lidar, combined with simultaneous localization and mapping (SLAM) are an efficient way of acquiring floor plans. Real-time floor-plan generation and visualization help the robot assess the quality and coverage of the captured data.
As stated above, we simulate the FastSLAM 2.0 method studied previously using lidar sensor data provided with MATLAB's SLAM tools.
The contribution of this simulation is based on loop closure, a technique that aims to determine the environment map and the location of the robot with high precision. The goal of the simulation is to build a map of the environment from lidar scans and to recover the robot's trajectory.
We loaded a sub-sampled MATLAB data set comprising laser scans collected by an autonomous robot in an indoor environment. The average displacement between two scans is approximately 0.6 meters.
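As a minimal loading step, the scans can be read into a cell array `scans` of lidarScan objects; the file name below comes from the MATLAB sample data and is only illustrative, so it should be replaced by the data set actually used.

```matlab
% Load the sub-sampled laser-scan data set (cell array of lidarScan objects).
data = load('offlineSlamData.mat');   % illustrative file name
scans = data.scans;
```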
A floor plan and the approximate path of the robot are provided for illustration purposes; the corresponding image shows the environment being mapped and the approximate route of the robot.
In this simulation, data from a Clearpath Robotics "Jackal" robot were used. The robot is equipped with a SICK TiM-511 laser scanner with a maximum range of 10 meters. We set the maximum lidar range slightly lower than the maximum scan range (8 m), because laser readings are less accurate near the maximum range. We also set the grid map resolution to 20 cells per meter, which gives an accuracy of 5 cm.
The loop closure parameters are defined empirically. We use a higher loop closure threshold to reject false positives in the loop closure identification process, since scans collected in an environment with similar or repeated features are more likely to produce false positives. We also use a larger loop closure search radius, which allows the algorithm to search a wider region of the map around the current pose estimate. The loop closure parameters are set as follows (a configuration sketch is given below):
• Loop closure threshold = 210
• Loop closure search radius = 8
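For illustration, a minimal MATLAB sketch of this configuration using the lidarSLAM object from the Navigation Toolbox could look as follows; the variable names are ours.

```matlab
% SLAM object configured with the grid resolution (cells per meter)
% and the maximum usable lidar range chosen above.
maxLidarRange = 8;    % meters (sensor maximum is 10 m)
mapResolution = 20;   % cells per meter, i.e. 5 cm accuracy
slamAlg = lidarSLAM(mapResolution, maxLidarRange);

% Empirically chosen loop closure parameters.
slamAlg.LoopClosureThreshold = 210;
slamAlg.LoopClosureSearchRadius = 8;   % meters
```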
We first add the first ten scans to test the algorithm and reconstruct the scene by plotting the scans and the poses tracked by FastSLAM 2.0. We then gradually add the remaining scans to the FastSLAM algorithm. Loop closures should be detected automatically as the robot moves, and pose graph optimization is performed whenever a loop closure is identified. We therefore plot the scans and poses each time a loop closure is identified and visually verify the results. Finally, once all scans have been added to FastSLAM 2.0, we draw the final constructed map (see the sketch below).
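A minimal MATLAB sketch of this incremental processing, again assuming the scans are stored in the cell array `scans`, could look as follows; only the first detected loop closure is visualized here.

```matlab
firstLoopClosure = false;
for i = 1:numel(scans)
    % Add one scan; the object performs scan matching and, when possible,
    % loop closure detection followed by pose graph optimization.
    [isScanAccepted, loopClosureInfo, optimizationInfo] = addScan(slamAlg, scans{i});
    if ~isScanAccepted
        continue;
    end
    % Plot the scans and poses when the first loop closure is identified.
    if optimizationInfo.IsPerformed && ~firstLoopClosure
        show(slamAlg, 'Poses', 'off'); hold on;
        show(slamAlg.PoseGraph); hold off;
        firstLoopClosure = true;
    end
end

% Build and display the final occupancy map from the optimized poses.
[slamScans, optimizedPoses] = scansAndPoses(slamAlg);
map = buildMap(slamScans, optimizedPoses, mapResolution, maxLidarRange);
figure; show(map);
```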
An image of the scans and the pose graph is overlaid on the original floor plan. The results show that the map matches the original floor plan well once all the scans have been added and the pose graph has been optimized (Figure 2 and Figure 3).
This preliminary simulation presented a FastSLAM 2.0 system combined with scan matching and loop closure detection, which yields a pose graph optimization.
The FastSLAM algorithm incrementally processes the lidar scans and builds a pose graph linking these scans. The robot can establish one or more loop closures along its path, and the FastSLAM algorithm uses the loop closure information to update the map and adjust the estimated robot trajectory.
This method gives excellent performance for mapping and locating the robot, so SLAM with the lidar sensor drives the robot towards autonomy.
One of the critical tasks required for an intelligent autonomous robot is detecting and
classifying objects.
The integrated YOLOv3 application allows the mobile robot to perform object detection and tracking, motion estimation, 3D point cloud processing, and sensor fusion. We use deep learning for image classification, regression, and feature learning with convolutional neural networks (CNN).
YOLOv3 is the latest version of the YOLO system; it also introduces a new classifier network that outperforms the previous ones [12].
The system predicts bounding boxes using dimension clusters as anchor boxes [11]. These predictions are made relative to the grid cell position and the anchor size rather than to the full frame, which improves performance.
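For reference, these are the standard box-decoding equations from [11], where $(c_x, c_y)$ is the offset of the grid cell, $(p_w, p_h)$ are the anchor (prior) dimensions, $\sigma$ is the logistic function, and $t_x, t_y, t_w, t_h$ are the raw network outputs:
$$
b_x = \sigma(t_x) + c_x, \quad b_y = \sigma(t_y) + c_y, \quad b_w = p_w e^{t_w}, \quad b_h = p_h e^{t_h}.
$$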
YOLOv3 predicts boxes at three different scales. The system extracts features from these scales using a concept similar to feature pyramid networks: several convolutional layers are added to the base feature extractor, and the last of these predicts a 3-D tensor encoding the bounding box, objectness, and class predictions.
In the experiments on the COCO data set [13], 3 boxes are predicted at each scale, so the output tensor is $N \times N \times [3 \times (4 + 1 + 80)] = N \times N \times 255$: 4 bounding box offsets, 1 objectness prediction, and 80 class predictions. The feature map from two layers earlier is then taken and upsampled by a factor of two.
A new network is used for feature extraction. It is a hybrid of the network used in YOLOv2 (Darknet-19) and the newer residual networks: it uses successive 3 × 3 and 1 × 1 convolutional layers, has shortcut connections, and is significantly larger. Because it has 53 convolutional layers, it is called Darknet-53.
We implemented the YOLOv3 algorithm under Ubuntu, because the complexity of the YOLO architecture and of our 53-layer convolutional network makes it easier to build and run under a Linux system.
• mAP ("mean Average Precision") is the metric used to measure the precision of object detectors.
• Intersection over Union (IoU) measures the overlap between two regions; it quantifies how well the detector's predicted box matches the ground truth (the boundary of the real object). A small computation sketch is given after this list.
• SSD ("Single-Shot MultiBox Detector") is applied to provide data augmentation for small objects in order to improve mAP.
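As a purely illustrative MATLAB sketch of the IoU computation between two axis-aligned boxes given as [x y width height] vectors (the function name and box format are ours):

```matlab
function iou = boxIoU(boxA, boxB)
% boxIoU  Intersection over Union of two axis-aligned boxes [x y w h].
    % Coordinates of the intersection rectangle.
    xLeft   = max(boxA(1), boxB(1));
    yTop    = max(boxA(2), boxB(2));
    xRight  = min(boxA(1) + boxA(3), boxB(1) + boxB(3));
    yBottom = min(boxA(2) + boxA(4), boxB(2) + boxB(4));

    % Intersection area (zero if the boxes do not overlap).
    interArea = max(0, xRight - xLeft) * max(0, yBottom - yTop);

    % Union area = sum of the two areas minus the intersection.
    unionArea = boxA(3) * boxA(4) + boxB(3) * boxB(4) - interArea;

    iou = interArea / unionArea;
end
```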
5 Conclusion
To navigate autonomously, a robot has to plan and make decisions. That is why we studied simultaneous localization and mapping (SLAM) to create a map of the environment and locate the robot within it.
Using MATLAB and Python under a Linux system, we focused on simulating the algorithms studied and interpreting the results. Regarding SLAM, we validated a FastSLAM 2.0 system using lidar sensor data and loop closure detection, which yields a pose graph optimization. This method shows excellent performance for the mapping and localization of the autonomous robot. We also validated our tests using the YOLOv3 algorithm, which is a very powerful detector, both fast and accurate.
Many interesting perspectives could be considered in future work, such as integrating more efficient filters into SLAM and focusing on computational complexity, data association, and the representation of the environment.
References
1. A. Georgiev and P. K. Allen, "Localization methods for a mobile robot in urban environ-
ments," in IEEE Transactions on Robotics, vol. 20, no. 5, pp. 851-864, Oct. 2004.
2. A. Chehri, H. T. Mouftah, “Localization for Vehicular Ad Hoc Network and Autonomous
Vehicles, Are We Done Yet?”, Connected and Autonomous Vehicles in Smart Cities, CRC
Press, Taylor & Francis Group, 2020.
3. A. Chehri and P. Fortier, "Autonomous Vehicles in Underground Mines, Where We Are,
Where We Are Going?" 2020 IEEE 91st Vehicular Technology Conference (VTC2020-
Spring), Antwerp, Belgium, 2020, pp. 1-5, 2020.
4. Chehri A., Quadar N., Saadane R. (2020) Communication and Localization Techniques in
VANET Network for Intelligent Traffic System in Smart Cities: A Review. In: Qu X., Zhen
L., Howlett R., Jain L. (eds) Smart Transportation Systems 2020. Smart Innovation, Systems
and Technologies, vol 185. Springer, Singapore.
5. F. Kong, Y. Chen, J. Xie, et al., "Mobile robot localization based on extended Kalman filter," WCICA, pp. 9242-9246, 2006.
6. H. Durrant-Whyte and T. Bailey, "Simultaneous localization and mapping: part I," in IEEE
Robotics & Automation Magazine, vol. 13, no. 2, pp. 99-110, June 2006.
7. M. Bosse, et al., "Simultaneous localization and map building in large-scale cyclic environ-
ments using the Atlas framework", Int. J. Robot. Res., vol. 23, no. 12, pp. 1113-1140, 2004.
8. A. Chehri, P. Fortier, and P. M. Tardif, "Geo-Location with Wireless Sensor Networks using
Non-linear Optimization, " Int. J. of Comput. Sci. and Network Security, vol. 8, no. 1, pp.
145-154, Jan. 2008.
9. S. L. Bowman, N. Atanasov, K. Daniilidis and G. J. Pappas, "Probabilistic data association
for semantic SLAM," 2017 IEEE International Conference on Robotics and Automation
(ICRA), Singapore, 2017, pp. 1722-1729, doi: 10.1109/ICRA.2017.7989203.
10. M. Montemerlo, et al. "Fast-SLAM: A factored solution to the simultaneous localization and
mapping problem", Proc. AAAI Nat. Conf. Artif. Intell., pp. 593-598, 2002.
11. J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," in arXiv:1612.08242, 2016, [online] Available: https://arxiv.org/abs/1612.08242.
12. J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement" in arXiv:1804.02767,
2018, [online] Available: https://arxiv.org/abs/1804.02767.
13. T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, et al., "Microsoft COCO: Common objects in context," in Computer Vision—ECCV, Cham, Switzerland: Springer, pp. 740-755, 2014.